These are representative sample records from Science.gov related to your search topic.
For comprehensive and current results, perform a real-time search at Science.gov.
1

Genomic sequencing in cancer.  

PubMed

Genomic sequencing has provided critical insights into the etiology of both simple and complex diseases. The enormous reductions in cost for whole genome sequencing have allowed this technology to gain increasing use. Whole genome analysis has impacted research of complex diseases including cancer by allowing the systematic analysis of entire genomes in a single experiment, thereby facilitating the discovery of somatic and germline mutations, and identification of the insertions, deletions, and structural rearrangements, including translocations and inversions, in novel disease genes. Whole-genome sequencing can be used to provide the most comprehensive characterization of the cancer genome, the complexity of which we are only beginning to understand. Hence in this review, we focus on whole-genome sequencing in cancer. PMID:23178448

Tuna, Musaffe; Amos, Christopher I

2013-11-01

2

Science Originals: Sequencing Cancer Genomes: Targeted Cancer Therapies  

NSDL National Science Digital Library

Applying DNA sequencing to cancer genomes is providing insights that have allowed researchers to turn some cancers into chronic diseases rather than deadly ones. Still, the ultimate goal is to kill the cancer.

Robert Frederick (AAAS)

2011-03-25

3

Advances in understanding cancer genomes through second-generation sequencing  

Microsoft Academic Search

Cancers are caused by the accumulation of genomic alterations. Therefore, analyses of cancer genome sequences and structures provide insights for understanding cancer biology, diagnosis and therapy. The application of second-generation DNA sequencing technologies (also known as next-generation sequencing) — through whole-genome, whole-exome and whole-transcriptome approaches — is allowing substantial advances in cancer genomics. These methods are facilitating an increase in

Stacey Gabriel; Gad Getz; Matthew Meyerson

2010-01-01

4

Perspectives of integrative cancer genomics in next generation sequencing era.  

PubMed

The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal the candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish the huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of the big omics data still remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on the study design and analysis strategies. This might be helpful to understand the current trends and strategies of the rapidly evolving cancer genomics research. PMID:23105932

Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

2012-06-01

5

Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology  

PubMed Central

Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

2014-01-01

6

Genome Sequencing Centers  

Cancer.gov

The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing.

7

Identifying driver mutations in sequenced cancer genomes: computational approaches to enable  

E-print Network

and treatment of cancer. Challenges of cancer genome sequencing and analysis Cancer is driven largely by somatic]. Unfortunately, highly recurrent mutations with a corresponding drug treatment are unknown for most cancer somatic mutations, a subset of which present new targets for cancer diagnostics and treatment [5

Raphael, Ben J.

8

Genome Sequencing and Analysis of the Tasmanian Devil and Its Transmissible Cancer  

PubMed Central

Summary: The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations. PMID:22341448

Murchison, Elizabeth P.; Schulz-Trieglaff, Ole B.; Ning, Zemin; Alexandrov, Ludmil B.; Bauer, Markus J.; Fu, Beiyuan; Hims, Matthew; Ding, Zhihao; Ivakhno, Sergii; Stewart, Caitlin; Ng, Bee Ling; Wong, Wendy; Aken, Bronwen; White, Simon; Alsop, Amber; Becq, Jennifer; Bignell, Graham R.; Cheetham, R. Keira; Cheng, William; Connor, Thomas R.; Cox, Anthony J.; Feng, Zhi-Ping; Gu, Yong; Grocock, Russell J.; Harris, Simon R.; Khrebtukova, Irina; Kingsbury, Zoya; Kowarsky, Mark; Kreiss, Alexandre; Luo, Shujun; Marshall, John; McBride, David J.; Murray, Lisa; Pearse, Anne-Maree; Raine, Keiran; Rasolonjatovo, Isabelle; Shaw, Richard; Tedder, Philip; Tregidgo, Carolyn; Vilella, Albert J.; Wedge, David C.; Woods, Gregory M.; Gormley, Niall; Humphray, Sean; Schroth, Gary; Smith, Geoffrey; Hall, Kevin; Searle, Stephen M.J.; Carter, Nigel P.; Papenfuss, Anthony T.; Futreal, P. Andrew; Campbell, Peter J.; Yang, Fengtang; Bentley, David R.; Evers, Dirk J.; Stratton, Michael R.

2012-01-01

9

Trastuzumab and beyond: sequencing cancer genomes and predicting molecular networks  

Microsoft Academic Search

Life diversity can now be clearly explored with the next-generation DNA sequencing technology, allowing the discovery of genetic variants among individuals, patients and tumors. However, beyond causal mutations catalog completion, systems medicine is essential to link genotype to phenotypic cancer diversity towards personalized medicine. Despite advances with traditional single genes molecular research, including rare mutations in BRCA1/2 and CDH1 for

D H Roukos

2011-01-01

10

Discrepancies in cancer genomic sequencing highlight opportunities for driver mutation discovery.  

PubMed

Cancer genome sequencing is being used at an increasing rate to identify actionable driver mutations that can inform therapeutic intervention strategies. A comparison of two of the most prominent cancer genome sequencing databases from different institutes (Cancer Cell Line Encyclopedia and Catalogue of Somatic Mutations in Cancer) revealed marked discrepancies in the detection of missense mutations in identical cell lines (57.38% conformity). The main reason for this discrepancy is inadequate sequencing of GC-rich areas of the exome. We have therefore mapped over 400 regions of consistent inadequate sequencing (cold-spots) in known cancer-causing genes and kinases, in 368 of which neither institute finds mutations. We demonstrate, using a newly identified PAK4 mutation as proof of principle, that specific targeting and sequencing of these GC-rich cold-spot regions can lead to the identification of novel driver mutations in known tumor suppressors and oncogenes. We highlight that cross-referencing between genomic databases is required to comprehensively assess genomic alterations in commonly used cell lines and that there are still significant opportunities to identify novel drivers of tumorigenesis in poorly sequenced areas of the exome. Finally, we assess other reasons for the observed discrepancy, such as variations in dbSNP filtering and the acquisition/loss of mutations, to give explanations as to why there is a discrepancy in pharmacogenomic studies, given recent concerns with poor reproducibility of data. Cancer Res; 74(22); 6390-6. ©2014 AACR. PMID:25256751

Hudson, Andrew M; Yates, Tim; Li, Yaoyong; Trotter, Eleanor W; Fawdar, Shameem; Chapman, Phil; Lorigan, Paul; Biankin, Andrew; Miller, Crispin J; Brognard, John

2014-11-15
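
The cross-database comparison in this record can be illustrated with a small sketch. The abstract does not say how the 57.38% conformity figure was computed, so the definition used here (calls reported by both sources divided by calls reported by either source) is an assumption, and the cell line, gene, and mutation labels are made up for illustration.

```python
def conformity(calls_a, calls_b):
    """Share of mutation calls found in both sets, out of calls found in either.

    This ratio is an assumed definition, not the one used in the cited study.
    """
    union = calls_a | calls_b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(calls_a & calls_b) / len(union)


# Hypothetical missense calls for one cell line: (cell_line, gene, protein_change)
source_a = {("LINE1", "GENE1", "A10T"), ("LINE1", "GENE2", "R20H"), ("LINE1", "GENE3", "G30D")}
source_b = {("LINE1", "GENE1", "A10T"), ("LINE1", "GENE2", "R20H"), ("LINE1", "GENE4", "P40L")}

# Calls unique to one source are the candidates for follow-up sequencing.
only_a = source_a - source_b
only_b = source_b - source_a

print(f"conformity: {conformity(source_a, source_b):.2%}")
print("only in A:", sorted(only_a))
print("only in B:", sorted(only_b))
```

Keying each call on a tuple of cell line, gene, and protein change keeps the comparison strict; a real comparison would also need to reconcile transcript and coordinate conventions between sources.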

11

Detection and Mapping of Amplified DNA Sequences in Breast Cancer by Comparative Genomic Hybridization  

Microsoft Academic Search

Comparative genomic hybridization was applied to 5 breast cancer cell lines and 33 primary tumors to discover and map regions of the genome with increased DNA-sequence copy-number. Two-thirds of primary tumors and almost all cell lines showed increased DNA-sequence copy-number affecting a total of 26 chromosomal subregions. Most of these loci were distinct from those of currently known amplified genes

Anne Kallioniemi; Olli-Pekka Kallioniemi; Jim Piper; Minna Tanner; Trond Stokke; Ling Chen; Helene S. Smith; Dan Pinkel; Joe W. Gray; Frederic M. Waldman

1994-01-01

12

Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer.  

PubMed

Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as "stitchers," to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication-licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl-D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

Morrison, Carl D; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C; Johnson, Candace S; Trump, Donald L

2014-02-11

13

The Cancer Genome Atlas (TCGA)  

Cancer.gov

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

14

SomatiCA: Identifying, Characterizing and Quantifying Somatic Copy Number Aberrations from Cancer Genome Sequencing Data  

PubMed Central

Whole genome sequencing of matched tumor-normal sample pairs is becoming routine in cancer research. However, analysis of somatic copy-number changes from sequencing data is still challenging because of insufficient sequencing coverage, unknown tumor sample purity and subclonal heterogeneity. Here we describe a computational framework, named SomatiCA, which explicitly accounts for tumor purity and subclonality in the analysis of somatic copy-number profiles. Taking read depths (RD) and lesser allele frequencies (LAF) as input, SomatiCA will output 1) admixture rate for each tumor sample, 2) somatic allelic copy-number for each genomic segment, 3) fraction of tumor cells with subclonal change in each somatic copy number aberration (SCNA), and 4) a list of substantial genomic aberration events including gain, loss and LOH. SomatiCA is available as a Bioconductor R package at http://www.bioconductor.org/packages/2.13/bioc/html/SomatiCA.html. PMID:24265680

Chen, Mengjie; Gunel, Murat; Zhao, Hongyu

2013-01-01
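
To make the role of tumor purity in the SomatiCA record concrete, here is a minimal sketch of how normal-cell admixture shifts the two inputs the abstract names, read depth and lesser allele frequency. It uses a standard two-population mixing model with a diploid normal genome; it is not SomatiCA's own code or API, and the function and parameter names are hypothetical.

```python
def expected_depth_ratio(purity, tumor_copies, normal_copies=2):
    """Expected tumor/normal read-depth ratio for a segment.

    Assumes the sample is a mix of tumor cells (fraction `purity`) and
    normal cells, with depth proportional to average copy number.
    """
    mixed = purity * tumor_copies + (1 - purity) * normal_copies
    return mixed / normal_copies


def expected_laf(purity, tumor_copies, tumor_minor_copies):
    """Expected lesser allele frequency at a germline heterozygous site.

    `tumor_minor_copies` is the copy number of the less abundant allele in
    tumor cells; normal cells are assumed to carry one copy of each allele.
    """
    total = purity * tumor_copies + (1 - purity) * 2
    if total == 0:
        raise ValueError("segment has no copies in the sample")
    lesser = purity * tumor_minor_copies + (1 - purity) * 1
    return lesser / total


# Hemizygous loss (one copy left, minor allele gone) at 50% purity.
print(expected_depth_ratio(0.5, 1))        # depth drops to 0.75, not 0.5
print(round(expected_laf(0.5, 1, 0), 4))   # LAF is 0.3333, not 0
```

The example shows why purity has to be modeled explicitly: with half the cells normal, a clonal one-copy loss looks much milder in both signals than it would in a pure tumor sample.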

15

Clinical genomics information management software linking cancer genome sequence and clinical decisions.  

PubMed

Using sequencing information to guide clinical decision-making requires coordination of a diverse set of people and activities. In clinical genomics, the process typically includes sample acquisition, template preparation, genome data generation, analysis to identify and confirm variant alleles, interpretation of clinical significance, and reporting to clinicians. We describe a software application developed within a clinical genomics study, to support this entire process. The software application tracks patients, samples, genomic results, decisions and reports across the cohort, monitors progress and sends reminders, and works alongside an electronic data capture system for the trial's clinical and genomic data. It incorporates systems to read, store, analyze and consolidate sequencing results from multiple technologies, and provides a curated knowledge base of tumor mutation frequency (from the COSMIC database) annotated with clinical significance and drug sensitivity to generate reports for clinicians. By supporting the entire process, the application provides deep support for clinical decision making, enabling the generation of relevant guidance in reports for verification by an expert panel prior to forwarding to the treating physician. PMID:23603536

Watt, Stuart; Jiao, Wei; Brown, Andrew M K; Petrocelli, Teresa; Tran, Ben; Zhang, Tong; McPherson, John D; Kamel-Reid, Suzanne; Bedard, Philippe L; Onetto, Nicole; Hudson, Thomas J; Dancey, Janet; Siu, Lillian L; Stein, Lincoln; Ferretti, Vincent

2013-09-01

16

Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine  

PubMed Central

High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein sequence or structure. Finally, we review techniques to identify recurrent combinations of somatic mutations, including approaches that examine mutations in known pathways or protein-interaction networks, as well as de novo approaches that identify combinations of mutations according to statistical patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer. PMID:24479672

2014-01-01
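
The mutual-exclusivity idea mentioned in this record can be sketched with a one-sided Fisher exact test on a pair of genes. This is a generic textbook test offered as an illustration under simple assumptions (independent samples, two genes at a time); it is not a specific method from the review, and the function name is hypothetical.

```python
from math import comb


def exclusivity_p_value(n_samples, n_a, n_b, n_both):
    """P(co-mutated samples <= n_both) if genes A and B mutate independently.

    Uses the hypergeometric distribution: of `n_samples` tumors, `n_a` carry
    a mutation in gene A and `n_b` in gene B. A small value suggests the two
    genes are co-mutated less often than chance would predict.
    """
    if not 0 <= n_both <= min(n_a, n_b) <= max(n_a, n_b) <= n_samples:
        raise ValueError("counts are inconsistent")
    lowest = max(0, n_a + n_b - n_samples)  # smallest possible overlap
    total = comb(n_samples, n_b)
    tail = sum(
        comb(n_a, i) * comb(n_samples - n_a, n_b - i)
        for i in range(lowest, n_both + 1)
    )
    return tail / total


# Hypothetical cohort: 10 tumors, 5 with gene A mutated, 5 with gene B, none with both.
print(exclusivity_p_value(10, 5, 5, 0))
```

A pairwise test like this ignores differences in per-sample mutation load, which is one reason dedicated tools use more elaborate models.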

17

U87MG Decoded: The Genomic Sequence of a Cytogenetically Aberrant Human Cancer Cell Line  

PubMed Central

U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30× genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. 
The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date. PMID:20126413

Eskin, Ascia; Lee, Hane; Merriman, Barry; Nelson, Stanley F.

2010-01-01

18

Clonal evolution in breast cancer revealed by single nucleus genome sequencing.  

PubMed

Sequencing studies of breast tumour cohorts have identified many prevalent mutations, but provide limited insight into the genomic diversity within tumours. Here we developed a whole-genome and exome single cell sequencing approach called nuc-seq that uses G2/M nuclei to achieve 91% mean coverage breadth. We applied this method to sequence single normal and tumour nuclei from an oestrogen-receptor-positive (ER(+)) breast cancer and a triple-negative ductal carcinoma. In parallel, we performed single nuclei copy number profiling. Our data show that aneuploid rearrangements occurred early in tumour evolution and remained highly stable as the tumour masses clonally expanded. In contrast, point mutations evolved gradually, generating extensive clonal diversity. Using targeted single-molecule sequencing, many of the diverse mutations were shown to occur at low frequencies (<10%) in the tumour mass. Using mathematical modelling we found that the triple-negative tumour cells had an increased mutation rate (13.3×), whereas the ER(+) tumour cells did not. These findings have important implications for the diagnosis, therapeutic treatment and evolution of chemoresistance in breast cancer. PMID:25079324

Wang, Yong; Waters, Jill; Leung, Marco L; Unruh, Anna; Roh, Whijae; Shi, Xiuqing; Chen, Ken; Scheet, Paul; Vattathil, Selina; Liang, Han; Multani, Asha; Zhang, Hong; Zhao, Rui; Michor, Franziska; Meric-Bernstam, Funda; Navin, Nicholas E

2014-08-14

19

Next-generation sequencing reveals frequent consistent genomic alterations in small cell undifferentiated lung cancer  

PubMed Central

Aims: Small cell lung cancer (SCLC) carries a poor prognosis, and the systemic therapies currently used as treatments are only modestly effective, as demonstrated by a low 5-year survival at only ~5%. In this retrospective collected from March 2013 to study, we performed comprehensive genomic profiling of 98 small cell undifferentiated lung cancer (SCLC) samples to identify potential targets of therapy not currently searched for in routine clinical practice. Methods: DNA from 98 SCLC was sequenced to high, uniform coverage (Illumina HiSeq 2500) and analysed for all classes of genomic alterations. Results: A total of 386 alterations were identified for an average of 3.9 alterations per tumour (range 1–10). Fifty-two (53%) of cases harboured at least 1 actionable alteration with the potential to personalise therapy including base substitutions, amplifications or homozygous deletions in RICTOR (10%), KIT (7%), PIK3CA (6%), EGFR (5%), PTEN (5%), KRAS (5%), MCL1 (4%), FGFR1 (4%), BRCA2 (4%), TSC1 (3%), NF1 (3%), EPHA3 (3%) and CCND1. The most common non-actionable genomic alterations were alterations in TP53 (86% of SCLC cases), RB1 (54%) and MLL2 (17%). Conclusions: Greater than 50% of the SCLC cases harboured at least one actionable alteration. Given the limited treatment options and poor prognosis of patients with SCLC, comprehensive genomic profiling has the potential to identify new treatment paradigms and meet an unmet clinical need for this disease. PMID:24978188

Ross, J S; Wang, K; Elkadi, O R; Tarasen, A; Foulke, L; Sheehan, C E; Otto, G A; Palmer, G; Yelensky, R; Lipson, D; Chmielecki, J; Ali, S M; Elvin, J; Morosini, D; Miller, V A; Stephens, P J

2014-01-01

20

Identification of gains and losses of DNA sequences in primary bladder cancer by comparative genomic hybridization.  

PubMed

Comparative genomic hybridization (CGH) makes it possible to detect losses and gains of DNA sequences along all chromosomes in a tumor specimen based on the hybridization of differentially labeled tumor and normal DNA to normal human metaphase chromosomes. In this study, CGH analysis was applied to the identification of genomic imbalances in 26 bladder cancers in order to gain information on the genetic events underlying the development and progression of this malignancy. Losses affecting 11p, 11q, 8p, 9, 17p, 3p, and 12q were all seen in more than 20% of the tumors. The minimal common region of loss in each chromosome was identified based on the analysis of overlapping deletions in different tumors. Gains of DNA sequences were most often found at chromosomal regions distinct from the locations of currently known oncogenes. The bands involved in more than 10% of the tumors were 8q21, 13q21-q34, 1q31, 3q24-q26, and 1p22. In conclusion, these CGH data highlight several previously unreported genetic alterations in bladder cancer. Further detailed studies of these regions with specific molecular genetic techniques may lead to the identification of tumor suppressor genes and oncogenes that play an important role in bladder tumorigenesis. PMID:7536461

Kallioniemi, A; Kallioniemi, O P; Citro, G; Sauter, G; DeVries, S; Kerschmann, R; Caroll, P; Waldman, F

1995-03-01

21

Cancer Genome Anatomy Project  

NSDL National Science Digital Library

The National Cancer Institute has launched the Cancer Genome Anatomy Project to "achieve a comprehensive molecular characterization of normal, precancerous, and malignant cells." Sequenced genes are held as library entries in a database and are available for downloading (fasta format). Each cDNA library entry may include biological source, number of sequences, and library construction detail information. Thousands of gene sequences are available for over 15 cancers, including breast, colon, and prostate. Contact information for donating or obtaining tissue samples for research purposes is provided.

1997-01-01

22

Genomic Datasets for Cancer Research  

Cancer.gov

A variety of datasets from genome-wide association studies of cancer and other genotype-phenotype studies, including sequencing and molecular diagnostic assays, are available to approved investigators through the Extramural National Cancer Institute (NCI) Data Access Committee (DAC).

23

A study based on whole-genome sequencing yields a rare variant at 8q24 associated with prostate cancer  

PubMed Central

In Western countries, prostate cancer is the most prevalent cancer of men, and one of the leading causes of cancer-related death in men. Several genome-wide association studies have yielded numerous common variants conferring risk of prostate cancer. In the present study we analyzed 32.5 million variants discovered by whole-genome sequencing of 1,795 Icelanders. One variant was found to be associated with prostate cancer in European populations: rs188140481[A] (OR = 2.90, Pcomb = 6.2×10^-34) located on 8q24, with an average risk allele control frequency of 0.54%. This variant is only very weakly correlated (r^2 ≤ 0.06) with previously reported risk variants on 8q24, and remains significant after adjustment for all of them. Carriers of rs188140481[A] were diagnosed with prostate cancer 1.26 years younger than non-carriers (P = 0.0059). We also report results for the previously described HOXB13 mutation (rs138213197[T]), confirming it as a prostate cancer risk variant in populations from all over Europe. PMID:23104005

Gudmundsson, Julius; Sulem, Patrick; Gudbjartsson, Daniel F.; Masson, Gisli; Agnarsson, Bjarni A.; Benediktsdottir, Kristrun R.; Sigurdsson, Asgeir; Magnusson, Olafur Th.; Gudjonsson, Sigurjon A.; Magnusdottir, Droplaug N.; Johannsdottir, Hrefna; Helgadottir, Hafdis Th.; Stacey, Simon N.; Jonasdottir, Adalbjorg; Olafsdottir, Stefania B.; Thorleifsson, Gudmar; Jonasson, Jon G.; Tryggvadottir, Laufey; Navarrete, Sebastian; Fuertes, Fernando; Helfand, Brian T.; Hu, Qiaoyan; Csiki, Irma E.; Mates, Ioan N.; Jinga, Viorel; Aben, Katja K. H.; van Oort, Inge M.; Vermeulen, Sita H.; Donovan, Jenny L.; Hamdy, Freddy C.; Ng, Chi-Fai; Chiu, Peter K.F.; Lau, Kin-Mang; Ng, Maggie C.Y.; Gulcher, Jeffrey R.; Kong, Augustine; Catalona, William J.; Mayordomo, Jose I.; Einarsson, Gudmundur V.; Barkardottir, Rosa B.; Jonsson, Eirikur; Mates, Dana; Neal, David E.; Kiemeney, Lambertus A.; Thorsteinsdottir, Unnur; Rafnar, Thorunn; Stefansson, Kari

2013-01-01
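
As a worked reading of the figures in this record, the sketch below converts an odds ratio and a control allele frequency into the allele frequency one would expect among cases. It assumes the reported OR of 2.90 behaves as a simple allelic odds ratio; the resulting case frequency is an illustration derived here, not a number reported in the abstract, and the function name is hypothetical.

```python
def case_frequency_from_odds_ratio(odds_ratio, control_frequency):
    """Allele frequency in cases implied by an allelic odds ratio.

    odds = f / (1 - f); the case odds are the control odds times the OR.
    """
    control_odds = control_frequency / (1 - control_frequency)
    case_odds = odds_ratio * control_odds
    return case_odds / (1 + case_odds)


# Values taken from the abstract: OR = 2.90, control frequency = 0.54%.
implied = case_frequency_from_odds_ratio(2.90, 0.0054)
print(f"implied case allele frequency: {implied:.2%}")
```

For a variant this rare, the odds ratio and the plain frequency ratio are close, so the implied case frequency is only slightly below 2.90 times the control frequency.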

24

Genome-wide identification and expression analysis of microRNA involved in small cell lung cancer via deep sequencing.  

PubMed

Small cell lung cancer is a major cause of mortality worldwide. microRNAs (miRNAs) are involved in various biological processes through regulating gene expression. In the present study, to identify the miRNAs involved in human small cell lung cancer at the genome-wide level, Solexa sequencing was employed to sequence two small RNA (sRNA) libraries from small cell lung cancer tissues (LC sRNA library) and the corresponding normal tissues (NT sRNA library). Deep sequencing of the two sRNA libraries identified a number of conserved miRNAs and differential expression analysis of these miRNAs revealed 81 miRNAs differentially expressed in small cell lung cancer, of which more than half were downregulated. The expression trends determined by sequencing were validated by reverse transcription-quantitative polymerase chain reaction analysis. The annotations for the targets of these miRNAs were predicted. This study provides valuable information for understanding the regulatory mechanisms of miRNAs involved in human small cell lung cancer. PMID:25190105

Yan, Chunhua; Shi, Xiaodong; Wang, Qiushi; Wang, Yue; Liu, Yaxin; Zhang, Xiaofei; Yang, Yuandi; Lv, Fuzhen; Shao, Yuxia

2014-11-01

25

Prenatal Whole Genome Sequencing  

PubMed Central

With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

2014-01-01

26

From human genome to cancer genome: The first decade  

PubMed Central

The realization that cancer progression required the participation of cellular genes provided one of several key rationales, in 1986, for embarking on the human genome project. Only with a reference genome sequence could the full spectrum of somatic changes leading to cancer be understood. Since its completion in 2003, the human reference genome sequence has fulfilled its promise as a foundational tool to illuminate the pathogenesis of cancer. Herein, we review the key historical milestones in cancer genomics since the completion of the genome, and some of the novel discoveries that are shaping our current understanding of cancer. PMID:23817046

Wheeler, David A.; Wang, Linghua

2013-01-01

27

The Cancer Genome Atlas completes detailed ovarian cancer analysis  

Cancer.gov

An analysis of genomic changes in ovarian cancer has provided the most comprehensive and integrated view of cancer genes for any cancer type to date. Ovarian serous adenocarcinoma tumors from 500 patients were examined by The Cancer Genome Atlas (TCGA) Research Network. TCGA researchers completed whole-exome sequencing, which examines the protein-coding regions of the genome, on an unprecedented 316 tumors.

28

Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges  

PubMed Central

Accurate detection of somatic copy number variations (CNVs) is an essential part of cancer genome analysis, and plays an important role in oncotarget identifications. Next generation sequencing (NGS) holds the promise to revolutionize somatic CNV detection. In this review, we provide an overview of current analytic tools used for CNV detection in NGS-based cancer studies. We summarize the NGS data types used for CNV detection, decipher the principles for data preprocessing, segmentation, and interpretation, and discuss the challenges in somatic CNV detection. This review aims to provide a guide to the analytic tools used in NGS-based cancer CNV studies, and to discuss the important factors that researchers need to consider when analyzing NGS data for somatic CNV detections. PMID:24240121

Liu, Biao; Morrison, Carl D.; Johnson, Candace S.; Trump, Donald L.; Qin, Maochun; Conroy, Jeffrey C.; Wang, Jianmin; Liu, Song

2013-01-01

29

The Pediatric Cancer Genome Project  

PubMed Central

The St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (PCGP) is participating in the international effort to identify somatic mutations that drive cancer. These cancer genome sequencing efforts will not only yield an unparalleled view of the altered signaling pathways in cancer but should also identify new targets against which novel therapeutics can be developed. Although these projects are still deep in the phase of generating primary DNA sequence data, important results are emerging and valuable community resources are being generated that should catalyze future cancer research. We describe here the rationale for conducting the PCGP, present some of the early results of this project and discuss the major lessons learned and how these will affect the application of genomic sequencing in the clinic. PMID:22641210

Downing, James R; Wilson, Richard K; Zhang, Jinghui; Mardis, Elaine R; Pui, Ching-Hon; Ding, Li; Ley, Timothy J; Evans, William E

2013-01-01

30

Understanding Cancer Series: Cancer Genomics  

MedlinePLUS

... Cancer Statistics Research & Funding News About NCI Understanding Cancer Series Posted: 01/28/2005 Reviewed: 09/01/ ... Dictionary Search for Clinical Trials NCI Publications Español Cancer Genomics Slide Number and Title What Is the ...

31

Exploration of liver cancer genomes.  

PubMed

Liver cancer is the third leading cause of cancer-related death worldwide. Advances in sequencing technologies have enabled the examination of liver cancer genomes at high resolution; somatic mutations, structural alterations, HBV integration, RNA editing and retrotransposon changes have been comprehensively identified. Furthermore, integrated analyses of trans-omics data (genome, transcriptome and methylome data) have identified multiple critical genes and pathways implicated in hepatocarcinogenesis. These analyses have uncovered potential therapeutic targets, including growth factor signalling, WNT signalling, the NFE2L2-mediated oxidative pathway and chromatin modifying factors, and paved the way for new molecular classifications for clinical application. The aetiological factors associated with liver cancer are well understood; however, their effects on the accumulation of somatic changes and the influence of ethnic variation in risk factors still remain unknown. The international collaborations of cancer genome sequencing projects are expected to contribute to an improved understanding of risk evaluation, diagnosis and therapy for this cancer. PMID:24473361

Shibata, Tatsuhiro; Aburatani, Hiroyuki

2014-06-01

32

Evolution of the cancer genome  

PubMed Central

The advent of massively parallel sequencing technologies has allowed the characterization of cancer genomes at an unprecedented resolution. Investigation of the mutational landscape of tumours is providing new insights into cancer genome evolution, laying bare the interplay of somatic mutation, adaptation of clones to their environment and natural selection. These studies have demonstrated the extent of the heterogeneity of cancer genomes, have allowed inferences to be made about the forces that act on nascent cancer clones as they evolve and have shown insight into the mutational processes that generate genetic variation. Here we review our emerging understanding of the dynamic evolution of the cancer genome and of the implications for basic cancer biology and the development of antitumour therapy. PMID:23044827

Yates, Lucy R.; Campbell, Peter J.

2013-01-01

33

Whole-genome sequencing analysis of phenotypic heterogeneity and anticipation in Li-Fraumeni cancer predisposition syndrome.  

PubMed

The Li-Fraumeni syndrome (LFS) and its variant form (LFL) is a familial predisposition to multiple forms of childhood, adolescent, and adult cancers associated with germ-line mutation in the TP53 tumor suppressor gene. Individual disparities in tumor patterns are compounded by acceleration of cancer onset with successive generations. It has been suggested that this apparent anticipation pattern may result from germ-line genomic instability in TP53 mutation carriers, causing increased DNA copy-number variations (CNVs) with successive generations. To address the genetic basis of phenotypic disparities of LFS/LFL, we performed whole-genome sequencing (WGS) of 13 subjects from two generations of an LFS kindred. Neither de novo CNV nor significant difference in total CNV was detected in relation with successive generations or with age at cancer onset. These observations were consistent with an experimental mouse model system showing that trp53 deficiency in the germ line of father or mother did not increase CNV occurrence in the offspring. On the other hand, individual records on 1,771 TP53 mutation carriers from 294 pedigrees were compiled to assess genetic anticipation patterns (International Agency for Research on Cancer TP53 database). No strictly defined anticipation pattern was observed. Rather, in multigeneration families, cancer onset was delayed in older compared with recent generations. These observations support an alternative model for apparent anticipation in which rare variants from noncarrier parents may attenuate constitutive resistance to tumorigenesis in the offspring of TP53 mutation carriers with late cancer onset. PMID:25313051

Ariffin, Hany; Hainaut, Pierre; Puzio-Kuter, Anna; Choong, Soo Sin; Chan, Adelyne Sue Li; Tolkunov, Denis; Rajagopal, Gunaretnam; Kang, Wenfeng; Lim, Leon Li Wen; Krishnan, Shekhar; Chen, Kok-Siong; Achatz, Maria Isabel; Karsa, Mawar; Shamsani, Jannah; Levine, Arnold J; Chan, Chang S

2014-10-28

34

Genome Sequence Databases (Overview): Sequencing and Assembly  

SciTech Connect

Since its role in heredity was discovered, DNA has attracted interest from scientists in many fields: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (whole genome sequencing) has become robust and inexpensive. Meanwhile, the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different lengths (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in the analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Although automatically performed assembly (draft assembly) is capable of covering up to 98% of the genome, in most cases it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1 error per 2,000 bp. A finished genome is an assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (approximately 1 error per 10,000 bp), validated through a number of computer and laboratory experiments.

Lapidus, Alla L.

2009-01-01

35

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

Schadt, Christopher Warren [ORNL; Baker, Scott [Pacific Northwest National Laboratory (PNNL); Thykaer, Jette [Pacific Northwest National Laboratory (PNNL); Adney, William S [National Renewable Energy Laboratory (NREL); Brettin, Tom [Los Alamos National Laboratory (LANL); Brockman, Fred [Pacific Northwest National Laboratory (PNNL); Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL); Martinez, A diego [Los Alamos National Laboratory (LANL); Miller, R michael [Argonne National Laboratory (ANL); Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute; Tuskan, Gerald A [ORNL; Bennett, Joan [Rutgers University; Berka, Randy [Novozymes, Inc; Briggs, Steven [University of California, San Diego; Heitman, Joseph [Duke University; Rizvi, L [Royal Ontario Museum; Taylor, John [University of California, Berkeley; Turgeon, Gillian [Cornell University; Werner-Washburne, Maggie [University of New Mexico, Albuquerque; Himmel, Michael [ORNL]

2008-01-01

36

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions. Published by Elsevier Ltd on behalf of The British Mycological Society.

Baker, Scott [Pacific Northwest National Laboratory (PNNL); Thykaer, Jette [Pacific Northwest National Laboratory (PNNL); Adney, William S [National Renewable Energy Laboratory (NREL); Brettin, Tom [Los Alamos National Laboratory (LANL); Brockman, Fred [Pacific Northwest National Laboratory (PNNL); Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL); Martinez, A diego [Los Alamos National Laboratory (LANL); Miller, R michael [Argonne National Laboratory (ANL); Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Schadt, Christopher Warren [ORNL; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute; Tuskan, Gerald A [ORNL; Bennett, Joan [Rutgers University; Berka, Randy [Novozymes, Inc; Briggs, Steven [University of California, San Diego; Heitman, Joseph [Duke University; Taylor, John [University of California, Berkeley; Turgeon, Gillian [Cornell University; Werner-Washburne, Maggie [University of New Mexico, Albuquerque; Himmel, Michael E [National Renewable Energy Laboratory (NREL)]

2008-01-01

37

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

Baker, Scott E.; Thykaer, Jette; Adney, William S.; Brettin, T.; Brockman, Fred J.; D'haeseleer, Patrik; Martinez, Antonio D.; Miller, R. M.; Rokhsar, Daniel S.; Schadt, Christopher W.; Torok, Tamas; Tuskan, Gerald; Bennett, Joan W.; Berka, Randy; Briggs, Steve; Heitman, Joseph; Taylor, John; Turgeon, Barbara G.; Werner-Washburne, Maggie; Himmel, Michael E.

2008-09-30

38

Whole-exome sequencing combined with functional genomics reveals novel candidate driver cancer genes in endometrial cancer.  

PubMed

Endometrial cancer is the most common gynecological malignancy, with more than 280,000 cases occurring annually worldwide. Although previous studies have identified important common somatic mutations in endometrial cancer, they have primarily focused on a small set of known cancer genes and have thus provided a limited view of the molecular basis underlying this disease. Here we have developed an integrated systems-biology approach to identifying novel cancer genes contributing to endometrial tumorigenesis. We first performed whole-exome sequencing on 13 endometrial cancers and matched normal samples, systematically identifying somatic alterations with high precision and sensitivity. We then combined bioinformatics prioritization with high-throughput screening (including both shRNA-mediated knockdown and expression of wild-type and mutant constructs) in a highly sensitive cell viability assay. Our results revealed 12 potential driver cancer genes including 10 tumor-suppressor candidates (ARID1A, INHBA, KMO, TTLL5, GRM8, IGFBP3, AKTIP, PHKA2, TRPS1, and WNT11) and two oncogene candidates (ERBB3 and RPS6KC1). The results in the "sensor" cell line were recapitulated by siRNA-mediated knockdown in endometrial cancer cell lines. Focusing on ARID1A, we integrated mutation profiles with functional proteomics in 222 endometrial cancer samples, demonstrating that ARID1A mutations frequently co-occur with mutations in the phosphatidylinositol 3-kinase (PI3K) pathway and are associated with PI3K pathway activation. siRNA knockdown in endometrial cancer cell lines increased AKT phosphorylation supporting ARID1A as a novel regulator of PI3K pathway activity. Our study presents the first unbiased view of somatic coding mutations in endometrial cancer and provides functional evidence for diverse driver genes and mutations in this disease. PMID:23028188

Liang, Han; Cheung, Lydia W T; Li, Jie; Ju, Zhenlin; Yu, Shuangxing; Stemke-Hale, Katherine; Dogruluk, Turgut; Lu, Yiling; Liu, Xiuping; Gu, Chao; Guo, Wei; Scherer, Steven E; Carter, Hannah; Westin, Shannon N; Dyer, Mary D; Verhaak, Roeland G W; Zhang, Fan; Karchin, Rachel; Liu, Chang-Gong; Lu, Karen H; Broaddus, Russell R; Scott, Kenneth L; Hennessy, Bryan T; Mills, Gordon B

2012-11-01

39

Cancer: Genomics of metastasis  

Microsoft Academic Search

Cancer cells that invade other parts of the body do so by accumulating genomic aberrations. Analysis of the genomic differences between primary and metastatic tumours should aid the understanding of this process.

Joe Gray

2010-01-01

40

Genome-wide analysis of aberrant methylation in human breast cancer cells using methyl-DNA immunoprecipitation combined with high-throughput sequencing  

Microsoft Academic Search

BACKGROUND: Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood nor is the genomic distribution of DNA methylation changes well characterized. RESULTS: Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles

Yoshinao Ruike; Yukako Imanaka; Fumiaki Sato; Kazuharu Shimizu; Gozoh Tsujimoto

2010-01-01

41

Next-generation sequencing for the diagnosis of hereditary breast and ovarian cancer using genomic capture targeting multiple candidate genes.  

PubMed

To optimize the molecular diagnosis of hereditary breast and ovarian cancer (HBOC), we developed a next-generation sequencing (NGS)-based screening based on the capture of a panel of genes involved, or suspected to be involved in HBOC, on pooling of indexed DNA and on paired-end sequencing in an Illumina GAIIx platform, followed by confirmation by Sanger sequencing or MLPA/QMPSF. The bioinformatic pipeline included CASAVA, NextGENe, CNVseq and Alamut-HT. We validated this procedure by the analysis of 59 patients' DNAs harbouring SNVs, indels or large genomic rearrangements of BRCA1 or BRCA2. We also conducted a blind study in 168 patients comparing NGS versus Sanger sequencing or MLPA analyses of BRCA1 and BRCA2. All mutations detected by conventional procedures were detected by NGS. We then screened, using three different versions of the capture set, a large series of 708 consecutive patients. We detected in these patients 69 germline deleterious alterations within BRCA1 and BRCA2, and 4 TP53 mutations in 468 patients also tested for this gene. We also found 36 variations inducing either a premature codon stop or a splicing defect among other genes: 5/708 in CHEK2, 3/708 in RAD51C, 1/708 in RAD50, 7/708 in PALB2, 3/708 in MRE11A, 5/708 in ATM, 3/708 in NBS1, 1/708 in CDH1, 3/468 in MSH2, 2/468 in PMS2, 1/708 in BARD1, 1/468 in PMS1 and 1/468 in MLH3. These results demonstrate the efficiency of NGS in performing molecular diagnosis of HBOC. Detection of mutations within other genes than BRCA1 and BRCA2 highlights the genetic heterogeneity of HBOC. PMID:24549055

Castéra, Laurent; Krieger, Sophie; Rousselin, Antoine; Legros, Angélina; Baumann, Jean-Jacques; Bruet, Olivia; Brault, Baptiste; Fouillet, Robin; Goardon, Nicolas; Letac, Olivier; Baert-Desurmont, Stéphanie; Tinat, Julie; Bera, Odile; Dugast, Catherine; Berthet, Pascaline; Polycarpe, Florence; Layet, Valérie; Hardouin, Agnes; Frébourg, Thierry; Vaur, Dominique

2014-11-01

42

Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes.  

PubMed

Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of results visit, and findings not associated with colorectal cancer (incidental findings) during a second return of results visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

Gallego, Carlos J; Bennette, Caroline S; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M; Bennett, Robin L; Dorschner, Michael O; Tarczy-Hornoch, Peter; Grady, William M; Fullerton, S Malia; Trinidad, Susan B; Regier, Dean A; Nickerson, Deborah A; Burke, Wylie; Patrick, Donald L; Jarvik, Gail P; Veenstra, David L

2014-09-01

43

Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes  

PubMed Central

Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of result visit, and findings not associated with colorectal cancer (incidental findings) during a second return of result visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care. PMID:24997220

Gallego, Carlos J.; Bennette, Caroline S.; Heagerty, Patrick; Comstock, Bryan; Horike-Pyne, Martha; Hisama, Fuki; Amendola, Laura M.; Bennett, Robin L.; Dorschner, Michael O.; Tarczy-Hornoch, Peter; Grady, William M.; Fullerton, S. Malia; Trinidad, Susan B.; Regier, Dean A.; Nickerson, Deborah A.; Burke, Wylie; Patrick, Donald L.; Jarvik, Gail P.; Veenstra, David L.

2014-01-01

44

Cancer genomics and pathology: all together now.  

PubMed

Cancer develops from a single cell with stepwise accumulation of genomic alterations. Recent innovative sequencing technologies have made it possible to sequence the full cancer genome. Cancer genome sequencing has been productive and helpful in the discovery of novel cancer genes. It also has revealed previously unknown but intriguing features of the cancer genome such as chromothripsis and kataegis. However, careful comparison of these studies has suggested that analyses of most tumors still seem to be incomplete, and histopathological diagnosis/classification will be essential for refining these data. Based on the improvement of technology and the completion of the cancer gene catalog, genetic diagnosis, such as examination of all potentially druggable mutations, of individual cancers will be performed routinely together with histological diagnosis. Pathologists will play a central role in both interpreting these patho-molecular diagnoses for oncologists, and the process of decision-making necessary for individualized medicine. PMID:23005591

Shibata, Tatsuhiro

2012-10-01

45

Whole-genome sequences of DA and F344 rats with different susceptibilities to arthritis, autoimmunity, inflammation and cancer.  

PubMed

DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301

Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S

2013-08-01

46

The Trichomonas vaginalis Genome Sequencing Project  

NSDL National Science Digital Library

The Institute for Genomic Research (TIGR) in 2003 released the first draft assembly of the Trichomonas vaginalis genome, available through this website to the academic and not-for-profit research community for noncommercial use only. TIGR will release more data at regular intervals during the sequencing project, which should help researchers better understand this widespread parasite and its role in HIV infection, neonatal disorders, predisposition to cervical cancer, and of course, vaginitis. The website also includes background information on T. vaginalis, as well as a link to TIGR's sequencing project for Entamoeba histolytica, a closely related organism.

47

Understanding Cancer Series: Cancer Genomics  

Cancer.gov

Understanding Cancer and Related Topics Understanding Cancer Genomics The art in this tutorial is copyrighted and may not be reused for commercial gain. Please do not remove the NCI logo or the copyright mark from any slide. These tutorials may be copied only if they are distributed free of charge for educational purposes.

48

Whole-Genome Sequencing of Asian Lung Cancers: Second-Hand Smoke Unlikely to Be Responsible for Higher Incidence of Lung Cancer among Asian Never-Smokers.  

PubMed

Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to that of European origin with G.C>T.A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. Cancer Res; 74(21); 6071-81. ©2014 AACR. PMID:25189529

Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

2014-11-01

49

Integrating sequence, evolution and functional genomics in regulatory genomics  

PubMed Central

With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

2009-01-01

50

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

Eichler, Evan [University of Washington]

2009-05-28

51

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

Eichler, Evan [University of Washington]

2009-05-28

52

NCI Center for Cancer Genomics  

Cancer.gov

NCI’s Center for Cancer Genomics applies genome science to better diagnose and treat cancer patients. The Center supports research to identify the genetic drivers of cancer and to advance the adoption of precise tumor diagnosis and treatment.

53

Somatic retrotransposition in the cancer genome  

E-print Network

Cancer is a complex disease of the genome exhibiting myriad somatic mutations, from single nucleotide changes to various chromosomal rearrangements. The technological advances of next-generation sequencing enable high-throughput ...

Helman, Elena

2014-01-01

54

The genomic complexity of primary human prostate cancer  

E-print Network

Prostate cancer is the second most common cause of male cancer deaths in the United States. However, the full range of prostate cancer genomic alterations is incompletely characterized. Here we present the complete sequence ...

Carter, Scott Lambert

55

Endometrial and acute myeloid leukemia cancer genomes characterized  

Cancer.gov

The characterizations of acute myeloid leukemia and endometrial cancer are the latest results of The Cancer Genome Atlas Research Network’s efforts to sequence the genomes of 20 major cancers.

56

Meeting Highlights: Genome Sequencing and Biology 2001  

PubMed Central

We bring you a report from the CSHL Genome Sequencing and Biology Meeting, which has a long and prestigious history. This year there were sessions on large-scale sequencing and analysis, polymorphisms (covering discovery and technologies and mapping and analysis), comparative genomics of mammalian and model organism genomes, functional genomics and bioinformatics. PMID:18628920

2001-01-01

57

The Sequence of the Human Genome  

Microsoft Academic Search

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome

J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

2001-01-01

58

Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer  

Microsoft Academic Search

Breast cancer is predominantly a disease of the genome, with cancers arising and progressing through accumulation of aberrations that alter the genome by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53,

Paul T. Spellman; Laura Heiser; Joe W. Gray

2009-01-01

59

Plant genome sequencing - applications for crop improvement.  

PubMed

It is over 10 years since the genome sequence of the first crop was published. Since then, the number of crop genomes sequenced each year has increased steadily. The amazing pace at which genome sequences are becoming available is largely due to the improvement in sequencing technologies both in terms of cost and speed. Modern sequencing technologies allow the sequencing of multiple cultivars of smaller crop genomes at a reasonable cost. Though many of the published genomes are considered incomplete, they nevertheless have proved a valuable tool to understand important crop traits such as fruit ripening, grain traits and flowering time adaptation. PMID:24679255

Bolger, Marie E; Weisshaar, Bernd; Scholz, Uwe; Stein, Nils; Usadel, Björn; Mayer, Klaus F X

2014-04-01

60

Genome sequencing of lymphoid malignancies.  

PubMed

Our understanding of the pathogenesis of lymphoid malignancies has been transformed by next-generation sequencing. The studies in this review have used whole-genome, exome, and transcriptome sequencing to identify recurring structural genetic alterations and sequence mutations that target key cellular pathways in acute lymphoblastic leukemia (ALL) and the lymphomas. Although each tumor type is characterized by a unique genomic landscape, several cellular pathways are mutated in multiple tumor types (transcriptional regulation of differentiation, antigen receptor signaling, tyrosine kinase and Ras signaling, and epigenetic modifications), and individual genes are mutated in multiple tumors, notably TCF3, NOTCH1, MYD88, and BRAF. In addition to providing fundamental insights into tumorigenesis, these studies have also identified potential new markers for diagnosis, risk stratification, and therapeutic intervention. Several genetic alterations are intuitively "druggable" with existing agents, for example, kinase-activating lesions in high-risk B-cell ALL, NOTCH1 in both leukemia and lymphoma, and BRAF in hairy cell leukemia. Future sequencing efforts are required to comprehensively define the genetic basis of all lymphoid malignancies, examine the relative roles of germline and somatic variation, dissect the genetic basis of clonal heterogeneity, and chart a course for clinical sequencing and translation to improved therapeutic outcomes. PMID:24041576

Mullighan, Charles G

2013-12-01

61

Understanding Cancer Series: Cancer Genomics  

Cancer.gov

Understanding Cancer Genomics: These PowerPoint slides are not locked files. You can mix and match slides from different tutorials as you prepare your own lectures. In the Notes section, you will find explanations of the graphics. The art in this tutorial is copyrighted and may not be reused for commercial gain. Please do not remove the NCI logo or the copyright mark from any slide. These tutorials may be copied only if they are distributed free of charge for educational purposes.

62

Genome Sequences of 65 Helicobacter pylori Strains Isolated from Asymptomatic Individuals and Patients with Gastric Cancer, Peptic Ulcer Disease, or Gastritis  

PubMed Central

Helicobacter pylori, inhabitant of the gastric mucosa of over half of the world population, with decreasing prevalence in the U.S., has been associated with a variety of gastric pathologies. However, the majority of H. pylori-infected individuals remain asymptomatic, and negative correlations between H. pylori and allergic diseases have been reported. Comprehensive genome characterization of H. pylori populations from different human host backgrounds including healthy individuals provides the exciting potential to generate new insights into the open question whether human health outcome is associated with specific H. pylori genotypes or dependent on other environmental factors. We report the genome sequences of 65 Helicobacter pylori isolates from individuals with gastric cancer, preneoplastic lesions, peptic ulcer disease, gastritis, and from asymptomatic adults. Isolates were collected from multiple locations in North America (USA and Canada) as well as from Colombia and Japan. The availability of these H. pylori genome sequences from individuals with distinct clinical presentations provides the research community with a resource for detailed investigations into genetic elements that correlate either positively or negatively with the epidemiology, human host adaptation and gastric pathogenesis, and will aid in the characterization of strains that may favor the development of specific pathology, including gastric cancer. PMID:23661595

Blanchard, Thomas G.; Czinn, Steven J.; Correa, Pelayo; Nakazawa, Teruko; Keelan, Monika; Morningstar, Lindsay; Santana-Cruz, Ivette; Maroo, Ankit; McCracken, Carri; Shefchek, Kent; Daugherty, Sean; Song, Yang; Fraser, Claire M.; Fricke, W. Florian

2013-01-01

63

Genome sequences of 65 Helicobacter pylori strains isolated from asymptomatic individuals and patients with gastric cancer, peptic ulcer disease, or gastritis.  

PubMed

Helicobacter pylori, inhabitant of the gastric mucosa of over half of the world population, with decreasing prevalence in the U.S., has been associated with a variety of gastric pathologies. However, the majority of H. pylori-infected individuals remain asymptomatic, and negative correlations between H. pylori and allergic diseases have been reported. Comprehensive genome characterization of H. pylori populations from different human host backgrounds including healthy individuals provides the exciting potential to generate new insights into the open question whether human health outcome is associated with specific H. pylori genotypes or dependent on other environmental factors. We report the genome sequences of 65 H. pylori isolates from individuals with gastric cancer, preneoplastic lesions, peptic ulcer disease, gastritis, and from asymptomatic adults. Isolates were collected from multiple locations in North America (USA and Canada) as well as from Colombia and Japan. The availability of these H. pylori genome sequences from individuals with distinct clinical presentations provides the research community with a resource for detailed investigations into genetic elements that correlate either positively or negatively with the epidemiology, human host adaptation, and gastric pathogenesis and will aid in the characterization of strains that may favor the development of specific pathology, including gastric cancer. PMID:23661595

Blanchard, Thomas G; Czinn, Steven J; Correa, Pelayo; Nakazawa, Teruko; Keelan, Monika; Morningstar, Lindsay; Santana-Cruz, Ivette; Maroo, Ankit; McCracken, Carri; Shefchek, Kent; Daugherty, Sean; Song, Yang; Fraser, Claire M; Fricke, W Florian

2013-07-01

64

On the sequencing of the human genome  

Microsoft Academic Search

Two recent papers using different approaches reported draft sequences of the human genome. The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics adopted the whole-genome shotgun (WGS) approach. Here, we analyze whether the latter paper provides a meaningful test of the WGS approach on a mammalian genome. In the Celera paper, the authors did not

Robert H. Waterston; Eric S. Lander; John E. Sulston

2002-01-01

65

DNA Methylation of Cancer Genome  

PubMed Central

DNA methylation plays an important role in regulating normal development and carcinogenesis. Current understanding of the biological roles of DNA methylation is limited to its role in the regulation of gene transcription, genomic imprinting, genomic stability, and X chromosome inactivation. In the past 2 decades, a large number of changes have been identified in cancer epigenomes when compared with normals. These alterations fall into two main categories, namely, hypermethylation of tumor suppressor genes and hypomethylation of oncogenes or heterochromatin, respectively. Aberrant methylation of genes controlling the cell cycle, proliferation, apoptosis, metastasis, drug resistance, and intracellular signaling has been identified in multiple cancer types. Recent advancements in whole-genome analysis of methylome have yielded numerous differentially methylated regions, the functions of which are largely unknown. With the development of high resolution tiling microarrays and high throughput DNA sequencing, more cancer methylomes will be profiled, facilitating the identification of new candidate genes or ncRNAs that are related to oncogenesis, new prognostic markers, and the discovery of new target genes for cancer therapy. PMID:19960550

Cheung, Hoi-Hung; Lee, Tin-Lap; Rennert, Owen M.; Chan, Wai-Yee

2010-01-01

66

Functional genomics and cancer drug target discovery.  

PubMed

The recent development of technologies for whole-genome sequencing, copy number analysis and expression profiling enables the generation of comprehensive descriptions of cancer genomes. However, although the structural analysis and expression profiling of tumors and cancer cell lines can allow the identification of candidate molecules that are altered in the malignant state, functional analyses are necessary to confirm such genes as oncogenes or tumor suppressors. Moreover, recent research suggests that tumor cells also depend on synthetic lethal targets, which are not mutated or amplified in cancer genomes; functional genomics screening can facilitate the discovery of such targets. This review provides an overview of the tools available for the study of functional genomics, and discusses recent research involving the use of these tools to identify potential novel drug targets in cancer. PMID:20521217

Moody, Susan E; Boehm, Jesse S; Barbie, David A; Hahn, William C

2010-06-01

67

The UCSC Cancer Genomics Browser: update 2013.  

PubMed

The UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) is a set of web-based tools to display, investigate and analyse cancer genomics data and its associated clinical information. The browser provides whole-genome to base-pair level views of several different types of genomics data, including some next-generation sequencing platforms. The ability to view multiple datasets together allows users to make comparisons across different data and cancer types. Biological pathways, collections of genes, genomic or clinical information can be used to sort, aggregate and zoom into a group of samples. We currently display an expanding set of data from various sources, including 201 datasets from 22 TCGA (The Cancer Genome Atlas) cancers as well as data from Cancer Cell Line Encyclopedia and Stand Up To Cancer. New features include a completely redesigned user interface with an interactive tutorial and updated documentation. We have also added data downloads, additional clinical heatmap features, and an updated Tumor Image Browser based on Google Maps. New security features allow authenticated users access to private datasets hosted by several different consortia through the public website. PMID:23109555

Goldman, Mary; Craft, Brian; Swatloski, Teresa; Ellrott, Kyle; Cline, Melissa; Diekhans, Mark; Ma, Singer; Wilks, Chris; Stuart, Josh; Haussler, David; Zhu, Jingchun

2013-01-01

68

Utilization of the Human Genome Sequence Localizes Human Papillomavirus Type 16 DNA Integrated into the TNFAIP2 Gene in a Fatal Cervical Cancer from a 39-Year-Old Woman  

Microsoft Academic Search

Purpose: The purpose of our study was to characterize a human papillomavirus (HPV) 16 DNA integration in the genome of a rapidly progressive, lethal cervical cancer in a 39-year-old woman. Experimental Design: An HPV 16 integration site from cervical cancer tissue was cloned and analyzed using Southern blot hybridization, nucleotide sequencing, fluorescence in situ hybridization analysis for chromosomal localization

Mark H. Einstein; Yvette Cruz; Mustafa K. El-Awady; Nicolas C. Popescu; Joseph A. DiPaolo; Marc van Ranst; Anna S. Kadish; Seymour Romney; Carolyn D. Runowicz; Robert D. Burk

69

TCGA Announces Launch of the Cancer Genomics Hub  

Cancer.gov

The Cancer Genome Atlas (TCGA) announces the beta launch of the Cancer Genomics Hub (CGHub) as the new secure repository for storing, cataloging, and accessing cancer genome sequences and alignments from TCGA. CGHub is managed by the University of California, Santa Cruz (UCSC), under a subcontract from SAIC-Frederick and will replace the function of the NCBI Sequence Read Archive for the TCGA program.

70

NIH Launches Comprehensive Effort to Explore Cancer Genomics  

Cancer.gov

The National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), both part of the National Institutes of Health (NIH), today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing.

71

Draft Genome Sequence of Lactobacillus rhamnosus 2166.  

PubMed

In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains. PMID:24558254

Karlyshev, Andrey V; Melnikov, Vyacheslav G; Kosarev, Igor V; Abramov, Vyacheslav M

2014-01-01

72

Understanding Cancer Series: Cancer Genome Project  

Cancer.gov

This tutorial explains how the Cancer Genome Anatomy Project (CGAP) studies the molecular changes that occur in cancer genomes and shares this information with all scientists. The information in NCI's CGAP databases is being used to improve the diagnosis and treatment of cancer.

73

The breast cancer genome - a key for better oncology  

E-print Network

Abstract Molecular classification has added important knowledge to breast cancer biology, but has yet to be implemented as a clinical standard. Full sequencing of breast cancer genomes could potentially refine classification and give a more complete...

Vollan, Hans Kristian Moen; Caldas, Carlos

2011-11-30

74

Value of a newly sequenced bacterial genome  

PubMed Central

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-01-01

75

NIH Launches Comprehensive Effort to Explore Cancer Genomics: The Cancer Genome Atlas Begins With Three-Year, $100 Million Pilot  

Cancer.gov

The National Cancer Institute and the National Human Genome Research Institute today launched a comprehensive effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, especially large-scale genome sequencing. Questions and Answers

76

Whole genome sequencing reveals potential targets for therapy in patients with refractory KRAS mutated metastatic colorectal cancer  

PubMed Central

Background: The outcome of patients with metastatic colorectal carcinoma (mCRC) following first line therapy is poor, with median survival of less than one year. The purpose of this study was to identify candidate therapeutically targetable somatic events in mCRC patient samples by whole genome sequencing (WGS), so as to obtain targeted treatment strategies for individual patients. Methods: Four patients were recruited, all of whom had received >2 prior therapy regimens. Percutaneous needle biopsies of metastases were performed with whole blood collection for the extraction of constitutional DNA. One tumor was not included in this study as the quality of tumor tissue was not sufficient for further analysis. WGS was performed using Illumina paired end chemistry on HiSeq2000 sequencing systems, which yielded coverage of greater than 30X for all samples. NGS data were processed and analyzed to detect somatic genomic alterations including point mutations, indels, copy number alterations, translocations and rearrangements. Results: All 3 tumor samples had KRAS mutations, while 2 tumors contained mutations in the APC gene and the PIK3CA gene. Although we did not identify a TCF7L2-VTI1A translocation, we did detect a TCF7L2 mutation in one tumor. Among the other interesting mutated genes was INPPL1, an important gene involved in PI3 kinase signaling. Functional studies demonstrated that inhibition of INPPL1 reduced growth of CRC cells, suggesting that INPPL1 may promote growth in CRC. Conclusions: Our study further supports potential molecularly defined therapeutic contexts that might provide insights into treatment strategies for refractory mCRC. New insights into the role of INPPL1 in colon tumor cell growth have also been identified. Continued development of appropriate targeted agents towards specific events may be warranted to help improve outcomes in CRC. PMID:24943349

2014-01-01

77

Patterns of somatic mutation in human cancer genomes  

Microsoft Academic Search

Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the

Christopher Greenman; Philip Stephens; Raffaella Smith; Gillian L. Dalgliesh; Christopher Hunter; Graham Bignell; Helen Davies; Jon Teague; Adam Butler; Claire Stevens; Sarah Edkins; Sarah O'Meara; Imre Vastrik; Esther E. Schmidt; Tim Avis; Syd Barthorpe; Gurpreet Bhamra; Gemma Buck; Bhudipa Choudhury; Jody Clements; Jennifer Cole; Ed Dicks; Simon Forbes; Kris Gray; Kelly Halliday; Rachel Harrison; Katy Hills; Jon Hinton; Andy Jenkinson; David Jones; Andy Menzies; Tatiana Mironenko; Janet Perry; Keiran Raine; Dave Richardson; Rebecca Shepherd; Alexandra Small; Calli Tofts; Jennifer Varian; Tony Webb; Sofie West; Sara Widaa; Andy Yates; Daniel P. Cahill; David N. Louis; Peter Goldstraw; Andrew G. Nicholson; Francis Brasseur; Leendert Looijenga; Barbara L. Weber; Yoke-Eng Chiew; Anna Defazio; Mel F. Greaves; Anthony R. Green; Peter Campbell; Ewan Birney; Douglas F. Easton; Georgia Chenevix-Trench; Min-Han Tan; Sok Kean Khoo; Bin Tean Teh; Siu Tsan Yuen; Suet Yi Leung; Richard Wooster; P. Andrew Futreal; Michael R. Stratton

2007-01-01

78

Genome-tools: a flexible package for genome sequence analysis.  

PubMed

Genome-tools is a Perl module, a set of programs, and a user interface that facilitates access to genome sequence information. The package is flexible, extensible, and designed to be accessible and useful to both nonprogrammers and programmers. Any relatively well-annotated genome available with standard GenBank genome files may be used with genome-tools. A simple Web-based front end permits searching any available genome with an intuitive interface. Flexible design choices also make it simple to handle revised versions of genome annotation files as they change. In addition, programmers can develop cross-genomic tools and analyses with minimal additional overhead by combining genome-tools modules with newly written modules. Genome-tools runs on any computer platform for which Perl is available, including Unix, Microsoft Windows, and Mac OS. By simplifying the access to large amounts of genomic data, genome-tools may be especially useful for molecular biologists looking at newly sequenced genomes, for which few informatics tools are available. The genome-tools Web interface is accessible at http://genome-tools.sourceforge.net, and the source code is available at http://sourceforge.net/projects/genome-tools. PMID:12503321

Lee, William; Chen, Swaine L

2002-12-01

79

Marsupial Genome Sequences: Providing Insight into Evolution and Disease  

PubMed Central

Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

Deakin, Janine E.

2012-01-01

80

Cancer Genomics Research Laboratory  

Cancer.gov

CGR’s high throughput laboratory is equipped with state-of-the-art laboratory equipment and automation systems for a large number of applications. CGR supports DCEG in all stages of cancer research from planning to publishing, including experimental design and project management, sample handling, genotyping and sequencing assay design and execution, development and implementation of bioinformatic pipelines, and downstream scientific research and analytical support.

81

Comparative Sequencing of Plant Genomes: Choices to Make  

E-print Network

technologies we will use? WHAT SPECIES DO WE SEQUENCE? In a world with >260,000 known plant species ... COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant, Arabidopsis thaliana, was published <6 years ago (Arabidopsis Genome Initiative, 2000). Since

Purugganan, Michael D.

82

Single cell analysis of cancer genomes.  

PubMed

Genomic studies have provided key insights into how cancers develop, evolve, metastasize and respond to treatment. Cancers result from an interplay between mutation, selection and clonal expansions. In solid tumours, this Darwinian competition between subclones is also influenced by topological factors. Recent advances have made it possible to study cancers at the single cell level. These methods represent important tools to dissect cancer evolution and provide the potential to considerably change both cancer research and clinical practice. Here we discuss state-of-the-art methods for the isolation of a single cell, whole-genome and whole-transcriptome amplification of the cell's nucleic acids, as well as microarray and massively parallel sequencing analysis of such amplification products. We discuss the strengths and the limitations of the techniques, and explore single-cell methodologies for future cancer research, as well as diagnosis and treatment of the disease. PMID:24531336

Van Loo, Peter; Voet, Thierry

2014-02-01

83

Using the Potato Genome Sequence. Robin Buell  

E-print Network

funding through National Science Foundation ... With so many potatoes with lots of variation-what genomics & post-genomic biology genomes genera 2002 2010 ... So, you say you can sequence-Now what% of assembly anchored to genetic map ... What are we interested in annotating? Genes

Douches, David S.

84

Whole Genome and Transcriptome Sequencing of a B3 Thymoma  

PubMed Central

Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma. PMID:23577124

Petrini, Iacopo; Rajan, Arun; Pham, Trung; Voeller, Donna; Davis, Sean; Gao, James; Wang, Yisong; Giaccone, Giuseppe

2013-01-01

85

Genomic rearrangements in inherited disease and cancer  

Microsoft Academic Search

Genomic rearrangements in inherited disease and cancer involve gross alterations of chromosomes or large chromosomal regions and can take the form of deletions, duplications, insertions, inversions or translocations. The characterization of a considerable number of rearrangement breakpoints has now been accomplished at the nucleotide sequence level, thereby providing an invaluable resource for the detailed study of the mutational mechanisms which

Jian-Min Chen; David N. Cooper; Claude Férec; Hildegard Kehrer-Sawatzki; George P. Patrinos

2010-01-01

86

Next generation sequencing of viral RNA genomes  

PubMed Central

Background: With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results: As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions: The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

2013-01-01

87

Human genetics and genomics a decade after the release of the draft sequence of the human genome  

PubMed Central

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

2011-01-01

88

Genomic sequencing of Pleistocene cave bears  

SciTech Connect

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of ~1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

2005-04-01

89

The genome sequence of Drosophila melanogaster.  

SciTech Connect

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

NONE

2000-03-24

90

Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes  

PubMed Central

Background: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807

Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

2011-01-01

91

Initial sequencing and analysis of the human genome  

E-print Network

Initial sequencing and analysis of the human genome International Human Genome Sequencing a draft sequence of the human genome. We also present an initial analysis of the data, describing some genome. The draft genome sequence was generated from a physical map covering more than 96

Eddy, Sean

92

High-throughput bisulfite sequencing in mammalian genomes  

Microsoft Academic Search

DNA methylation is a critical epigenetic mark that is essential for mammalian development and aberrant in many diseases including cancer. Over the past decade multiple methods have been developed and applied to characterize its genome-wide distribution. Of these, reduced representation bisulfite sequencing (RRBS) generates nucleotide resolution DNA methylation bisulfite sequencing libraries that enrich for CpG-dense regions by methylation-insensitive restriction digestion.

Zachary D. Smith; Hongcang Gu; Christoph Bock; Andreas Gnirke; Alexander Meissner

2009-01-01

93

The Characterization of Twenty Sequenced Human Genomes  

PubMed Central

We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten “case” genomes from individuals with severe hemophilia A and ten “control” genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways. PMID:20838461

Maia, Jessica M.; Zhu, Mingfu; Smith, Jason P.; Cirulli, Elizabeth T.; Fellay, Jacques; Dickson, Samuel P.; Gumbs, Curtis E.; Heinzen, Erin L.; Need, Anna C.; Ruzzo, Elizabeth K.; Singh, Abanish; Campbell, C. Ryan; Hong, Linda K.; Lornsen, Katharina A.; McKenzie, Alexander M.; Sobreira, Nara L. M.; Hoover-Fong, Julie E.; Milner, Joshua D.; Ottman, Ruth; Haynes, Barton F.; Goedert, James J.; Goldstein, David B.

2010-01-01

94

Genome Sequence of Serratia plymuthica V4  

PubMed Central

Serratia spp. are gammaproteobacteria and members of the family Enterobacteriaceae. Here, we announce the genome sequence of Serratia plymuthica strain V4, which produces the siderophore serratiochelin and antimicrobial compounds. PMID:24831138

Cleto, S.; Van der Auwera, G.; Almeida, C.; Vieira, M. J.; Vlamakis, H.

2014-01-01

95

The Human Genome Project: Sequencing the Future  

E-print Network

.The immediate response was considerable skepticism about the technological capability to sequence the genome later, genome technologies and data are revolutionizing biology and providing a vital thrust carbon dioxide to counter global warming and ensure U.S. energy security by reducing our dependence

96

A Complete Neandertal Mitochondrial Genome Sequence Determined  

E-print Network

unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mt the variation of modern human mtDNA. Since the mtDNA genome is maternally inherited without recombination (mt) genome sequence was reconstructed from a 38,000 year-old Neandertal individual with 8341 mtDNA

Good, Jeffrey M.

97

Complete Genome Sequence of Equid Herpesvirus 3  

PubMed Central

Equid herpesvirus 3 (EHV-3) is a member of the subfamily Alphaherpesvirinae that causes equine coital exanthema. Here, we report the first complete genome sequence of EHV-3. The 151,601-nt genome encodes 76 distinct genes like other equine alphaherpesviruses, but genetically, EHV-3 is significantly more divergent. PMID:25278519

Vissani, Aldana; Tordoya, Maria Silva; Muylkens, Benoit; Thiry, Etienne; Maes, Piet; Matthijnssens, Jelle; Barrandeguy, Maria; Van Ranst, Marc

2014-01-01

98

Genome evolution during progression to breast cancer.  

PubMed

Cancer evolution involves cycles of genomic damage, epigenetic deregulation, and increased cellular proliferation that eventually culminate in the carcinoma phenotype. Early neoplasias, which are often found concurrently with carcinomas and are histologically distinguishable from normal breast tissue, are less advanced in phenotype than carcinomas and are thought to represent precursor stages. To elucidate their role in cancer evolution we performed comparative whole-genome sequencing of early neoplasias, matched normal tissue, and carcinomas from six patients, for a total of 31 samples. By using somatic mutations as lineage markers we built trees that relate the tissue samples within each patient. On the basis of these lineage trees we inferred the order, timing, and rates of genomic events. In four out of six cases, an early neoplasia and the carcinoma share a mutated common ancestor with recurring aneuploidies, and in all six cases evolution accelerated in the carcinoma lineage. Transition spectra of somatic mutations are stable and consistent across cases, suggesting that accumulation of somatic mutations is a result of increased ancestral cell division rather than specific mutational mechanisms. In contrast to highly advanced tumors that are the focus of much of the current cancer genome sequencing, neither the early neoplasia genomes nor the carcinomas are enriched with potentially functional somatic point mutations. Aneuploidies that occur in common ancestors of neoplastic and tumor cells are the earliest events that affect a large number of genes and may predispose breast tissue to eventual development of invasive carcinoma. PMID:23568837

Newburger, Daniel E; Kashef-Haghighi, Dorna; Weng, Ziming; Salari, Raheleh; Sweeney, Robert T; Brunner, Alayne L; Zhu, Shirley X; Guo, Xiangqian; Varma, Sushama; Troxell, Megan L; West, Robert B; Batzoglou, Serafim; Sidow, Arend

2013-07-01

99

Genome evolution during progression to breast cancer  

PubMed Central

Cancer evolution involves cycles of genomic damage, epigenetic deregulation, and increased cellular proliferation that eventually culminate in the carcinoma phenotype. Early neoplasias, which are often found concurrently with carcinomas and are histologically distinguishable from normal breast tissue, are less advanced in phenotype than carcinomas and are thought to represent precursor stages. To elucidate their role in cancer evolution we performed comparative whole-genome sequencing of early neoplasias, matched normal tissue, and carcinomas from six patients, for a total of 31 samples. By using somatic mutations as lineage markers we built trees that relate the tissue samples within each patient. On the basis of these lineage trees we inferred the order, timing, and rates of genomic events. In four out of six cases, an early neoplasia and the carcinoma share a mutated common ancestor with recurring aneuploidies, and in all six cases evolution accelerated in the carcinoma lineage. Transition spectra of somatic mutations are stable and consistent across cases, suggesting that accumulation of somatic mutations is a result of increased ancestral cell division rather than specific mutational mechanisms. In contrast to highly advanced tumors that are the focus of much of the current cancer genome sequencing, neither the early neoplasia genomes nor the carcinomas are enriched with potentially functional somatic point mutations. Aneuploidies that occur in common ancestors of neoplastic and tumor cells are the earliest events that affect a large number of genes and may predispose breast tissue to eventual development of invasive carcinoma. PMID:23568837

Newburger, Daniel E.; Kashef-Haghighi, Dorna; Weng, Ziming; Salari, Raheleh; Sweeney, Robert T.; Brunner, Alayne L.; Zhu, Shirley X.; Guo, Xiangqian; Varma, Sushama; Troxell, Megan L.; West, Robert B.; Batzoglou, Serafim; Sidow, Arend

2013-01-01

100

Genome sequence and analysis of Lactobacillus helveticus  

PubMed Central

The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

2013-01-01

101

Genomic Resources for Cancer Epidemiology  

Cancer.gov

The goal of the 1000 Genomes Project is to provide a comprehensive resource on human genetic variation. The project is sequencing the genomes of approximately 2,500 samples at 4x coverage, to provide data on genetic variants with frequencies of at least 1% in the populations studied.

102

Intraspecies sequence comparisons for annotating genomes.  

PubMed

Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intraspecies sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents, and a set of genomic intervals were amplified, resequenced, and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom. It also raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. PMID:15545499

Boffelli, Dario; Weer, Claire V; Weng, Li; Lewis, Keith D; Shoukry, Malak I; Pachter, Lior; Keys, David N; Rubin, Edward M

2004-12-01

103

POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes  

E-print Network

POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes The research group Genome Evolution and Speciation (Team) to work on the analysis of genome and transcriptome sequence data (generated using 454 Roche

Rennes, Université de

104

Genomics: Drugs, diabetes and cancer  

PubMed Central

Summary: Variation in a genomic region that contains the cancer-associated gene ATM affects a patient’s response to the diabetes drug metformin. Two experts discuss the implications for understanding diabetes and the link to cancer. PMID:21331030

Birnbaum, Morris J.; Shaw, Reuben J.

2014-01-01

105

Pervasive sequence patents cover the entire human genome  

PubMed Central

The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association for Molecular Pathology v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays. PMID:23522065

2013-01-01

106

Genome-wide Approaches for Cancer Gene Discovery  

PubMed Central

One of the central aims of cancer research is to identify and characterize cancer-causing alterations in cancer genomes. In recent years, unprecedented advances in genome-wide sequencing, functional genomics technologies of RNA interference screens and methods to evaluate three-dimensional chromatin organization in vivo have resulted in important discoveries regarding human cancer. The cancer causing genes identified from these new genome-wide technologies have also provided opportunities for effective and personalized cancer therapy. In this review, we describe some of the most recent technologies for cancer gene discovery. We also provide specific examples where these technologies have proven remarkably successful in uncovering important cancer-causing alterations. PMID:21757246

Lizardi, Paul M.; Forloni, Matteo; Wajapeyee, Narendra

2011-01-01

107

Sequencing error correction without a reference genome  

PubMed Central

Background: Next (second) generation sequencing is an increasingly important tool for many areas of molecular biology; however, care must be taken when interpreting its output. Even a low error rate can cause a large number of errors due to the high number of nucleotides being sequenced. Distinguishing sequencing errors from true biological variants is a challenging task. For organisms without a reference genome this task is even more difficult. Results: We have developed a method for the correction of sequencing errors in data from the Illumina Solexa sequencing platforms. It does not require a reference genome and is of relevance for microRNA studies, unsequenced genomes, variant detection in ultra-deep sequencing and even for RNA-Seq studies of organisms with sequenced genomes where RNA editing is being considered. Conclusions: The derived error model is novel in that it allows different error probabilities for each position along the read, in conjunction with different error rates depending on the particular nucleotides involved in the substitution, and does not force these effects to behave in a multiplicative manner. The model provides error rates which capture the complex effects and interactions of the three main known causes of sequencing error associated with the Illumina platforms. PMID:24350580

2013-01-01

108

Systematic genome sequence differences among leaf cells within individual trees  

PubMed Central

Background: Even in the age of next-generation sequencing (NGS), it has been unclear whether or not cells within a single organism have systematically distinctive genomes. Resolving this question, one of the most basic biological problems associated with DNA mutation rates, can assist efforts to elucidate essential mechanisms of cancer. Results: Using genome profiling (GP), we detected considerable systematic variation in genome sequences among cells in individual woody plants. The degree of genome sequence difference (genomic distance) varied systematically from the bottom to the top of the plant, such that the greatest divergence was observed between leaf genomes from uppermost branches and the remainder of the tree. This systematic variation was observed within both Yoshino cherry and Japanese beech trees. Conclusions: As measured by GP, the genomic distance between two cells within an individual organism was non-negligible, and was correlated with physical distance (i.e., branch-to-branch distance). This phenomenon was assumed to be the result of accumulation of mutations from each cell division, implying that the degree of divergence is proportional to the number of generations separating the two cells. PMID:24548431

2014-01-01

109

Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements  

Microsoft Academic Search

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments

Aaron C. E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna

2004-01-01

110

Genomic sequence analysis tools: a user's guide.  

PubMed

The wealth of information from various genome sequencing projects provides the biologist with a new perspective from which to analyze, and design experiments with, mammalian systems. The complexity of the information, however, requires new software tools, and numerous such tools are now available. Which type and which specific system is most effective depends, in part, upon how much sequence is to be analyzed and with what level of experimental support. Here we survey a number of mammalian genomic sequence analysis systems with respect to the data they provide and the ease of their use. The hope is to aid the experimental biologist in choosing the most appropriate tool for their analyses. PMID:11226611

Fortna, A; Gardiner, K

2001-03-01

111

Whole Genome Sequence of a Turkish Individual  

PubMed Central

Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We have performed 35x coverage using paired-end sequencing, where over 95% of sequencing reads are mapped to the reference genome covering more than 99% of the bases. The assembly of unmapped reads rendered 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ~1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homo/hetero zygosity (1,415,123:2,122,671 or 1:1.5) and transition/transversion ratios (2,383,204:1,154,590 or 2.06:1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1:1.09 insertion/deletion ratio), ranging from -52 bp to 34 bp in length and causing about 180 codon insertion/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale. PMID:24416366

Dogan, Haluk; Can, Handan; Otu, Hasan H.

2014-01-01
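
The ratios quoted in the Turkish genome abstract above can be rechecked from the counts it reports. The following is a minimal sketch, not part of the original record: the variable names are illustrative assumptions, and the only data used are the figures stated in the abstract.

```python
# Counts copied from the abstract above; all names are illustrative.
TOTAL_SNPS = 3_537_794
HOMOZYGOUS, HETEROZYGOUS = 1_415_123, 2_122_671
TRANSITIONS, TRANSVERSIONS = 2_383_204, 1_154_590

# Each pair of counts should partition the full set of SNP calls.
hom_het_total = HOMOZYGOUS + HETEROZYGOUS
ts_tv_total = TRANSITIONS + TRANSVERSIONS

# Heterozygous calls per homozygous call (the "1:1.5" figure).
het_per_hom = HETEROZYGOUS / HOMOZYGOUS

# Transitions per transversion (the "2.06:1" figure).
ts_per_tv = TRANSITIONS / TRANSVERSIONS

print(f"hom + het = {hom_het_total:,} (total SNP calls: {TOTAL_SNPS:,})")
print(f"ts + tv   = {ts_tv_total:,} (total SNP calls: {TOTAL_SNPS:,})")
print(f"hom:het   = 1:{het_per_hom:.2f}")
print(f"ts:tv     = {ts_per_tv:.2f}:1")
```

Both pairs of counts sum to the reported total of SNP calls, which suggests that the two numbers in each parenthesis are the two sides of a ratio.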

112

Finishing the euchromatic sequence of the human genome  

Microsoft Academic Search

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and

2004-01-01

113

Genome sequence of the palaeopolyploid soybean.  

PubMed

Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties. PMID:20075913

Schmutz, Jeremy; Cannon, Steven B; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L; Song, Qijian; Thelen, Jay J; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D; Yu, Yeisoo; Sakurai, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T; Wing, Rod A; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C; Jackson, Scott A

2010-01-14

114

Understanding the development of human bladder cancer by using a whole-organ genomic mapping strategy  

Microsoft Academic Search

The search for the genomic sequences involved in human cancers can be greatly facilitated by maps of genomic imbalances identifying the involved chromosomal regions, particularly those that participate in the development of occult preneoplastic conditions that progress to clinically aggressive invasive cancer. The integration of such regions with human genome sequence variation may provide valuable clues about their overall structure

Tadeusz Majewski; Sangkyou Lee; Joon Jeong; Dong-Sup Yoon; Andrzej Kram; Mi-Sook Kim; Tomasz Tuziak; Jolanta Bondaruk; Sooyong Lee; Weon-Seo Park; Kuang S Tang; Woonbok Chung; Lanlan Shen; Saira S Ahmed; Dennis A Johnston; H Barton Grossman; Colin P Dinney; Jain-Hua Zhou; R Alan Harris; Carrie Snyder; Slawomir Filipek; Steven A Narod; Patrice Watson; Henry T Lynch; Adi Gazdar; Menashe Bar-Eli; Xifeng F Wu; David J McConkey; Keith Baggerly; Jean-Pierre Issa; William F Benedict; Steven E Scherer; Bogdan Czerniak

2008-01-01

115

Using comparative genomics to reorder the human genome sequence into a virtual sheep genome  

Microsoft Academic Search

BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the

Brian P Dalrymple; Ewen F Kirkness; Mikhail Nefedov; Sean McWilliam; Abhirami Ratnakumar; Wes Barris; Shaying Zhao; Jyoti Shetty; Jillian F Maddox; Margaret O'Grady; Frank Nicholas; Allan M Crawford; Tim Smith; Pieter J de Jong; John McEwan; V Hutton Oddy; Noelle E Cockett

2007-01-01

116

Functional genomics to explore cancer cell vulnerabilities.  

PubMed

Our understanding of glioblastoma multiforme (GBM), the most common form of primary brain cancer, has been significantly advanced by recent efforts to characterize the cancer genome using unbiased high-throughput sequencing analyses. While these studies have documented hundreds of mutations, gene copy alterations, and chromosomal abnormalities, only a subset of these alterations are likely to impact tumor initiation or maintenance. Furthermore, genes that are not altered at the genomic level may play essential roles in tumor initiation and maintenance. Identification of these genes is critical for therapeutic development and investigative methodologies that afford insight into biological function. This requirement has largely been fulfilled with the emergence of RNA interference (RNAi) and high-throughput screening technology. In this article, the authors discuss the application of genome-wide, high-throughput RNAi-based genetic screening as a powerful tool for the rapid and cost-effective identification of genes essential for cancer proliferation and survival. They describe how these technologies have been used to identify genes that are themselves selectively lethal to cancer cells, or synthetically lethal with other oncogenic mutations. The article is intended to provide a platform for how RNAi libraries might contribute to uncovering glioma cell vulnerabilities and provide information that is highly complementary to the structural characterization of the glioblastoma genome. The authors emphasize that unbiased, systems-level structural and functional genetic approaches are complementary efforts that should facilitate the identification of genes involved in the pathogenesis of GBM and permit the identification of novel drug targets. PMID:20043720

Kahle, Kristopher T; Kozono, David; Ng, Kimberly; Hsieh, Grace; Zinn, Pascal O; Nitta, Masayuki; Chen, Clark C

2010-01-01

117

Melanoma genome sequencing reveals frequent PREX2 mutations  

PubMed Central

Melanoma is notable for its metastatic propensity, lethality in the advanced setting, and association with ultraviolet (UV) exposure early in life. To obtain a comprehensive genomic view of melanoma, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-UV exposed hairless skin of the extremities (3 and 14 per Mb genome), intermediate in those originating from hair-bearing skin of the trunk (range = 5 to 55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 - a PTEN-interacting protein and negative regulator of PTEN in breast cancer - as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumor formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumors revealed genomic evidence of UV pathogenesis and discovered a new recurrently mutated gene in melanoma. PMID:22622578

Berger, Michael F.; Hodis, Eran; Heffernan, Timothy P.; Deribe, Yonathan Lissanu; Lawrence, Michael S.; Protopopov, Alexei; Ivanova, Elena; Watson, Ian R.; Nickerson, Elizabeth; Ghosh, Papia; Zhang, Hailei; Zeid, Rhamy; Ren, Xiaojia; Cibulskis, Kristian; Sivachenko, Andrey Y.; Wagle, Nikhil; Sucker, Antje; Sougnez, Carrie; Onofrio, Robert; Ambrogio, Lauren; Auclair, Daniel; Fennell, Timothy; Carter, Scott L.; Drier, Yotam; Stojanov, Petar; Singer, Meredith A.; Voet, Douglas; Jing, Rui; Saksena, Gordon; Barretina, Jordi; Ramos, Alex H.; Pugh, Trevor J.; Stransky, Nicolas; Parkin, Melissa; Winckler, Wendy; Mahan, Scott; Ardlie, Kristin; Baldwin, Jennifer; Wargo, Jennifer; Schadendorf, Dirk; Meyerson, Matthew; Gabriel, Stacey B.; Golub, Todd R.; Wagner, Stephan N.; Lander, Eric S.; Getz, Gad; Chin, Lynda; Garraway, Levi A.

2012-01-01

118

Melanoma genome sequencing reveals frequent PREX2 mutations.  

PubMed

Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-ultraviolet-exposed hairless skin of the extremities (3 and 14 per megabase (Mb) of genome), intermediate in those originating from hair-bearing skin of the trunk (5-55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 (phosphatidylinositol-3,4,5-trisphosphate-dependent Rac exchange factor 2)--a PTEN-interacting protein and negative regulator of PTEN in breast cancer--as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumour formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumours revealed genomic evidence of ultraviolet pathogenesis and discovered a new recurrently mutated gene in melanoma. PMID:22622578

Berger, Michael F; Hodis, Eran; Heffernan, Timothy P; Deribe, Yonathan Lissanu; Lawrence, Michael S; Protopopov, Alexei; Ivanova, Elena; Watson, Ian R; Nickerson, Elizabeth; Ghosh, Papia; Zhang, Hailei; Zeid, Rhamy; Ren, Xiaojia; Cibulskis, Kristian; Sivachenko, Andrey Y; Wagle, Nikhil; Sucker, Antje; Sougnez, Carrie; Onofrio, Robert; Ambrogio, Lauren; Auclair, Daniel; Fennell, Timothy; Carter, Scott L; Drier, Yotam; Stojanov, Petar; Singer, Meredith A; Voet, Douglas; Jing, Rui; Saksena, Gordon; Barretina, Jordi; Ramos, Alex H; Pugh, Trevor J; Stransky, Nicolas; Parkin, Melissa; Winckler, Wendy; Mahan, Scott; Ardlie, Kristin; Baldwin, Jennifer; Wargo, Jennifer; Schadendorf, Dirk; Meyerson, Matthew; Gabriel, Stacey B; Golub, Todd R; Wagner, Stephan N; Lander, Eric S; Getz, Gad; Chin, Lynda; Garraway, Levi A

2012-05-24

119

Complete genomic sequence of turkey coronavirus  

Microsoft Academic Search

Turkey coronavirus (TCoV), one of the least characterized of all known coronaviruses, was isolated from an outbreak of acute enteritis in young turkeys in Ontario, Canada, and the full-length genomic sequence was determined. The full-length genome was 27,632 nucleotides plus the 3′ poly(A) tail. Two open reading frames, ORFs 1a and 1b, resided in the first two thirds of the

M. H. Gomaa; J. R. Barta; D. Ojkic; D. Yoo

2008-01-01

120

Understanding genomic alterations in cancer genomes using an integrative network approach.  

PubMed

In recent years, cancer genome sequencing and other high-throughput studies of cancer genomes have generated many notable discoveries. In this review, novel genomic alteration mechanisms, such as chromothripsis (chromosomal crisis) and kataegis (mutation storms), and their implications for cancer are discussed. Genomic alterations spur cancer genome evolution. Thus, the relationship between cancer clonal evolution and cancer stem cells is commented on. The key question in cancer biology concerns how these genomic alterations support cancer development and metastasis in the context of biological functioning. Thus far, efforts such as pathway analysis have improved the understanding of the functional contributions of genetic mutations and DNA copy number variations to cancer development, progression and metastasis. However, the known pathways correspond to a small fraction, plausibly 5-10%, of somatic mutations and genes with an altered copy number. To develop a comprehensive understanding of the function of these genomic alterations in cancer, an integrative network framework is proposed and discussed. Finally, the challenges and the directions of studying cancer omic data using an integrative network approach are commented on. PMID:23266571

Wang, Edwin

2013-11-01

121

Sequencing the AML Genome, Transcriptome, and Epigenome.  

PubMed

Leukemia is a disease that develops as a result of changes in the genomes of hematopoietic cells, a fact first appreciated by microscopic examination of the bone marrow cell chromosomes of affected patients. These studies revealed that specific subtypes of leukemia diagnoses correlated with specific chromosomal abnormalities, such as the t(15;17) of acute promyelocytic leukemia and the t(9;22) of chronic myeloid leukemia. Over time, our genomic characterization of hematologic malignancies has moved beyond the resolution of the microscope to that of individual nucleotides in the analysis of whole-genome sequencing (WGS) data using state-of-the-art massively parallel sequencing (MPS) instruments and algorithmic analyses of the resulting data. In addition to studying the genomic sequence alterations that occur in patients' genomes, these same instruments can decode the methylation landscape of the leukemia genome and the resulting RNA expression landscape of the leukemia transcriptome. Broad correlative analyses can then integrate these 3 data types to better inform researchers and clinicians about the biology of individual acute myeloid leukemia (AML) cases, facilitating improvements in care and prognosis. PMID:25311738

Mardis, Elaine R

2014-10-01

122

Sequencing the public health genome.  

PubMed

The exposome paradigm provides a new approach for conceptualizing and analyzing the impact of single exposures on health outcomes. This article describes the methods used to sequence the public health exposome and implications for the dynamic, multi-dimensional data information system developed by investigators at Meharry Medical College. PMID:23395949

Juarez, Paul

2013-02-01

123

Noninvasive fetal genome sequencing: a primer  

PubMed Central

We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path towards clinical adoption of noninvasive fetal whole genome sequencing (NIFWGS). PMID:23553552

Snyder, Matthew W.; Simmons, LaVone E.; Kitzman, Jacob O.; Santillan, Donna A.; Santillan, Mark K.; Gammill, Hilary S.; Shendure, Jay

2013-01-01

124

Assigning genomic sequences to CATH  

Microsoft Academic Search

We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and

Frances M. G. Pearl; David Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

2000-01-01

125

International network of cancer genome projects  

Microsoft Academic Search

The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic

Thomas J. Hudson; Warwick Anderson; Axel Aretz; Anna D. Barker; Cindy Bell; Rosa R. Bernabé; M. K. Bhan; Iiro Eerola; Daniela S. Gerhard; Alan Guttmacher; Mark Guyer; Fiona M. Hemsley; Jennifer L. Jennings; David Kerr; Peter Klatt; Patrik Kolar; Jun Kusuda; Frank Laplace; Youyong Lu; Gerd Nettekoven; Brad Ozenberger; Jane Peterson; T. S. Rao; Jacques Remacle; Alan J. Schafer; Tatsuhiro Shibata; Michael R. Stratton; Joseph G. Vockley; Koichi Watanabe; Huanming Yang; Martin Bobrow; Anne Cambon-Thomsen; Lynn G. Dressler; Stephanie O. M. Dyke; Yann Joly; Kazuto Kato; Karen L. Kennedy; Pilar Nicolás; Michael J. Parker; Emmanuelle Rial-Sebbag; Carlos M. Romeo-Casabona; Kenna M. Shaw; Susan Wallace; Georgia L. Wiesner; Andrew V. Biankin; Christian Chabannon; Lynda Chin; Bruno Clément; Enrique de Alava; Françoise Degos; Martin L. Ferguson; Peter Geary; D. Neil Hayes; Amber L. Johns; Arek Kasprzyk; Hidewaki Nakagawa; Robert Penny; Miguel A. Piris; Rajiv Sarin; Aldo Scarpa; Hiroyuki Aburatani; Mónica Bayés; David D. L. Bowtell; Peter J. Campbell; Xavier Estivill; Ivo Gut; Martin Hirst; Carlos López-Otín; Partha Majumder; Marco Marra; John D. McPherson; Zemin Ning; Xose S. Puente; Yijun Ruan; Hendrik G. Stunnenberg; Harold Swerdlow; Victor E. Velculescu; Richard K. Wilson; Hong H. Xue; Paul T. Spellman; Gary D. Bader; Paul C. Boutros; Paul Flicek; Gad Getz; Roderic Guigó; Guangwu Guo; David Haussler; Simon Heath; Tim J. Hubbard; Tao Jiang; Steven M. Jones; Qibin Li; Nuria López-Bigas; Ruibang Luo; Lakshmi Muthuswamy; B. F. Francis Ouellette; John V. Pearson; Victor Quesada; Benjamin J. Raphael; Chris Sander; Terence P. Speed; Joshua M. Stuart; Jon W. Teague; Yasushi Totoki; Tatsuhiko Tsunoda; Alfonso Valencia; David A. Wheeler; Honglong Wu; Shancen Zhao; Mark Lathrop; Gilles Thomas; Myles Axton; Chris Gunter; Linda J. Miller; Junjun Zhang; Syed A. Haider; Jianxin Wang; Christina K. Yung; Anthony Cross; Yong Liang; Saravanamuttu Gnaneshan; Jonathan Guberman; Don R. C. 
Chalmers; Karl W. Hasel; Terry S. H. Kaan; William W. Lowrance; Tohru Masui; Laura Lyman Rodriguez; Catherine Vergely; Nicole Cloonan; Anna Defazio; James R. Eshleman; Dariush Etemadmoghadam; Brooke A. Gardiner; James G. Kench; Robert L. Sutherland; Margaret A. Tempero; Nicola J. Waddell; Steve Gallinger; Ming-Sound Tsao; Patricia A. Shaw; Gloria M. Petersen; Debabrata Mukhopadhyay; Ronald A. Depinho; Sarah Thayer; Kamran Shazand; Timothy Beck; Michelle Sam; Lee Timms; Jiafu Ji; Xiuqing Zhang; Feng Chen; Xueda Hu; Guangyu Zhou; Qi Yang; Geng Tian; Lianhai Zhang; Xiaofang Xing; Xianghong Li; Zhenggang Zhu; Yingyan Yu; Jun Yu; Jörg Tost; Paul Brennan; Ivana Holcatova; David Zaridze; Alvis Brazma; Lars Egevad; Egor Prokhortchouk; Rosamonde Elizabeth Banks; Mathias Uhlén; Juris Viksna; Fredrik Ponten; Ewan Birney; Ake Borg; Anne-Lise Børresen-Dale; Carlos Caldas; John A. Foekens; Sancha Martin; Jorge S. Reis-Filho; Andrea L. Richardson; Christos Sotiriou; Marc van de Vijver; Daniel Birnbaum; Hélène Blanche; Pascal Boucher; Sandrine Boyault; Jocelyne D. Masson-Jacquemier; Iris Pauporté; Xavier Pivot; Anne Vincent-Salomon; Eric Tabone; Charles Theillet; Paulette Bioulac-Sage; Thomas Decaens; Dominique Franco; Marta Gut; Didier Samuel; Benedikt Brors; Jan O. Korbel; Andrey Korshunov; Pablo Landgraf; Hans Lehrach; Stefan Pfister; Bernhard Radlwimmer; Guido Reifenberger; Michael D. Taylor; Paolo Pederzoli; Rita T. Lawlor; Massimo Delledonne; Alberto Bardelli; Thomas Gress; David Klimstra; Yusuke Nakamura; Satoru Miyano; Akihiro Fujimoto; Silvia de Sanjosé; Emili Montserrat; Marcos González-Díaz; Pedro Jares; Heinz Himmelbaue; Samuel Aparicio; Laura van't Veer; Douglas F. Easton; Francis S. Collins; Carolyn C. Compton; Eric S. Lander; Wylie Burke; Anthony R. Green; Olli P. Kallioniemi; Timothy J. Ley; Edison T. Liu; Brandon J. Wainwright

2010-01-01

126

Complete genome sequence of Caulobacter crescentus  

PubMed Central

The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living α-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus. PMID:11259647

Nierman, William C.; Feldblyum, Tamara V.; Laub, Michael T.; Paulsen, Ian T.; Nelson, Karen E.; Eisen, Jonathan; Heidelberg, John F.; Alley, M. R. K.; Ohta, Noriko; Maddock, Janine R.; Potocka, Isabel; Nelson, William C.; Newton, Austin; Stephens, Craig; Phadke, Nikhil D.; Ely, Bert; DeBoy, Robert T.; Dodson, Robert J.; Durkin, A. Scott; Gwinn, Michelle L.; Haft, Daniel H.; Kolonay, James F.; Smit, John; Craven, M. B.; Khouri, Hoda; Shetty, Jyoti; Berry, Kristi; Utterback, Teresa; Tran, Kevin; Wolf, Alex; Vamathevan, Jessica; Ermolaeva, Maria; White, Owen; Salzberg, Steven L.; Venter, J. Craig; Shapiro, Lucy; Fraser, Claire M.

2001-01-01

127

Complete genome sequence of Caulobacter crescentus.  

PubMed

The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living alpha-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus. PMID:11259647

Nierman, W C; Feldblyum, T V; Laub, M T; Paulsen, I T; Nelson, K E; Eisen, J A; Heidelberg, J F; Alley, M R; Ohta, N; Maddock, J R; Potocka, I; Nelson, W C; Newton, A; Stephens, C; Phadke, N D; Ely, B; DeBoy, R T; Dodson, R J; Durkin, A S; Gwinn, M L; Haft, D H; Kolonay, J F; Smit, J; Craven, M B; Khouri, H; Shetty, J; Berry, K; Utterback, T; Tran, K; Wolf, A; Vamathevan, J; Ermolaeva, M; White, O; Salzberg, S L; Venter, J C; Shapiro, L; Fraser, C M; Eisen, J

2001-03-27

128

Assessment of Whole Genome Amplification for Sequence Capture and Massively Parallel Sequencing  

PubMed Central

Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information. PMID:24409309

Hasmats, Johanna; Green, Henrik; Orear, Cedric; Validire, Pierre; Huss, Mikael; Kaller, Max; Lundeberg, Joakim

2014-01-01

129

Decoding the fine-scale structure of a breast cancer genome and transcriptome  

Microsoft Academic Search

A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome

Stanislav Volik; Benjamin J. Raphael; Guiqing Huang; Michael R. Stratton; Graham Bignel; John Murnane; John H. Brebner; Krystyna Bajsarowicz; Pamela L. Paris; Quanzhou Tao; David Kowbel; Anna Lapuk; Dmitri A. Shagin; Irina A. Shagina; Joe W. Gray; Jan-Fang Cheng; Pieter J. de Jong; Pavel Pevzner; Colin Collins

2006-01-01

130

Multilocus sequence typing of total-genome-sequenced bacteria.  

PubMed

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442

Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

2012-04-01
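
The abstract above describes two steps: pick the best-matching allele for each locus of an MLST scheme, then read the sequence type off the combination of alleles. A minimal sketch of those two steps follows. It is not the authors' implementation: the locus names, the profile table, and the ranking rule (coverage first, then identity) are all hypothetical, chosen only for illustration, and the hits are assumed to have been produced by an alignment tool beforehand.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy scheme: locus order plus a profile table that maps
# an allele combination to a sequence type (ST). Real schemes come from
# the MLST databases; these values are invented for illustration.
LOCI: Tuple[str, ...] = ("adk", "fumC", "gyrB")
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}

# One alignment hit: (allele_id, percent_identity, fraction_of_allele_covered)
Hit = Tuple[int, float, float]


def best_allele(hits: List[Hit]) -> Optional[int]:
    """Return the allele id of the top-ranked hit, or None if there are no hits.

    Assumed ranking: highest coverage first, ties broken by identity.
    """
    if not hits:
        return None
    return max(hits, key=lambda h: (h[2], h[1]))[0]


def sequence_type(hits_by_locus: Dict[str, List[Hit]]) -> Optional[int]:
    """Combine the best allele per locus and look the combination up in PROFILES."""
    alleles = []
    for locus in LOCI:
        allele = best_allele(hits_by_locus.get(locus, []))
        if allele is None:
            return None  # a locus without any hit cannot be typed
        alleles.append(allele)
    return PROFILES.get(tuple(alleles))  # None for an unknown combination


if __name__ == "__main__":
    example = {
        "adk": [(1, 99.0, 1.0), (3, 100.0, 0.8)],
        "fumC": [(2, 100.0, 1.0)],
        "gyrB": [(1, 100.0, 1.0), (4, 98.0, 1.0)],
    }
    print(sequence_type(example))
```

Returning None for a missing locus or an unlisted allele combination keeps "untypeable" distinct from any real sequence type; a production tool would presumably also report which locus failed.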

131

Genome sequencing and analysis of Aspergillus oryzae  

Microsoft Academic Search

The genome of Aspergillus oryzae, a fungus important for the production of traditional fermented foods and beverages in Japan, has been sequenced. The ability to secrete large amounts of proteins and the development of a transformation system have facilitated the use of A. oryzae in modern biotechnology. Although both A. oryzae and Aspergillus flavus belong to the section Flavi of

Masayuki Machida; Kiyoshi Asai; Motoaki Sano; Toshihiro Tanaka; Toshitaka Kumagai; Goro Terai; Ken-Ichi Kusumoto; Toshihide Arima; Osamu Akita; Yutaka Kashiwagi; Keietsu Abe; Katsuya Gomi; Hiroyuki Horiuchi; Katsuhiko Kitamoto; Tetsuo Kobayashi; Michio Takeuchi; David W. Denning; James E. Galagan; William C. Nierman; Jiujiang Yu; David B. Archer; Joan W. Bennett; Deepak Bhatnagar; Thomas E. Cleveland; Natalie D. Fedorova; Osamu Gotoh; Hiroshi Horikawa; Akira Hosoyama; Masayuki Ichinomiya; Rie Igarashi; Kazuhiro Iwashita; Praveen Rao Juvvadi; Masashi Kato; Yumiko Kato; Taishin Kin; Akira Kokubun; Hiroshi Maeda; Noriko Maeyama; Jun-Ichi Maruyama; Hideki Nagasaki; Tasuku Nakajima; Ken Oda; Kinya Okada; Ian Paulsen; Kazutoshi Sakamoto; Toshihiko Sawano; Mikio Takahashi; Kumiko Takase; Yasunobu Terabayashi; Jennifer R. Wortman; Osamu Yamada; Youhei Yamagata; Hideharu Anazawa; Yoji Hata; Yoshinao Koide; Takashi Komori; Yasuji Koyama; Toshitaka Minetoki; Sivasundaram Suharnan; Akimitsu Tanaka; Katsumi Isono; Satoru Kuhara; Naotake Ogasawara; Hisashi Kikuchi

2005-01-01

132

Genome Sequence of Corynebacterium ulcerans Strain 210932  

PubMed Central

In this work, we present the complete genome sequence of Corynebacterium ulcerans strain 210932, isolated from a human. The species is an emergent pathogen that infects a variety of wild and domesticated animals and humans. It is associated with a growing number of cases of a diphtheria-like disease around the world. PMID:25428977

Viana, Marcus Vinicius Canário; de Jesus Benevides, Leandro; Batista Mariano, Diego Cesar; de Souza Rocha, Flávia; Bagano Vilas Boas, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Alves Dorella, Fernanda; Gomes Leal, Carlos Augusto; Fiorini de Carvalho, Alex; Silva, Artur; de Castro Soares, Siomar; Pereira Figueiredo, Henrique Cesar; Guimarães, Luis Carlos

2014-01-01

133

Reconstruction of Ancestral Genomic Sequences Using Likelihood  

E-print Network

algorithm and the FPT into an algorithm with arbitrary good approximation guarantee (PTAS). We tested our for reconstructing the ancestral genomes for a set of lentiviruses (relatives of HIV). Availability: Supplementary sequences evolve. Based on our beliefs we state an optimization criterion by which the "correct" ancestral

Lagergren, Jens

134

Tracking adaptive evolutionary events in genomic sequences  

PubMed Central

As more gene and genomic sequences from an increasing assortment of species become available, new pictures of evolution are emerging. Improved methods can pinpoint where positive and negative selection act in individual codons in specific genes on specific branches of phylogenetic trees. Positive selection appears to be important in the interaction between genotype, protein structure, function, and organismal phenotype. PMID:12093382

Liberles, David A; Wayne, Marta L

2002-01-01

135

Draft Genome Sequence of Virgibacillus halodenitrificans 1806  

PubMed Central

Virgibacillus halodenitrificans 1806 is an endospore-forming halophilic bacterium isolated from salterns in Korea. Here, we report the draft genome sequence of V. halodenitrificans 1806, which may reveal the molecular basis of osmoadaptation and insights into carbon and anaerobic metabolism in moderate halophiles. PMID:23105070

Lee, Sang-Jae; Lee, Yong-Jik; Jeong, Haeyoung; Lee, Sang Jun; Lee, Han-Seung; Pan, Jae-Gu

2012-01-01

136

Genome Sequence of Bacillus licheniformis WX-02  

PubMed Central

Bacillus licheniformis is an important bacterium that has been used extensively for large-scale industrial production of exoenzymes and peptide antibiotics. B. licheniformis WX-02 produces poly-gamma-glutamate increasingly when fermented under stress conditions. Here its genome sequence (4,270,104 bp, with G+C content of 46.06%), which comprises a circular chromosome, is announced. PMID:22689245

Yangtse, Wuming; Zhou, Yinhua; Lei, Yang; Qiu, Yimin; Wei, Xuetuan; Ji, Zhixia; Qi, Gaofu; Yong, Yangchun; Chen, Lingling

2012-01-01

137

Genome sequence of Bacillus licheniformis WX-02.  

PubMed

Bacillus licheniformis is an important bacterium that has been used extensively for large-scale industrial production of exoenzymes and peptide antibiotics. B. licheniformis WX-02 produces poly-gamma-glutamate increasingly when fermented under stress conditions. Here its genome sequence (4,270,104 bp, with G+C content of 46.06%), which comprises a circular chromosome, is announced. PMID:22689245

Yangtse, Wuming; Zhou, Yinhua; Lei, Yang; Qiu, Yimin; Wei, Xuetuan; Ji, Zhixia; Qi, Gaofu; Yong, Yangchun; Chen, Lingling; Chen, Shouwen

2012-07-01

138

NIH researchers complete whole-exome sequencing of skin cancer

Cancer.gov

A team led by researchers at NIH is the first to systematically survey the landscape of the melanoma genome, the DNA code of the deadliest form of skin cancer. The researchers have made surprising new discoveries using whole-exome sequencing, an approach that decodes the 1-2 percent of the genome that contains protein-coding genes.

139

Genome variation discovery with high-throughput sequencing data  

E-print Network

-throughput sequencing (HTS) technologies is enabling sequencing of human genomes at a significantly lower cost and copy-number variants from these mappings. Keywords: high-throughput sequencing; genome variation/Solexa and AB SOLiD, are able to sequence a full human genome per week at a cost 200-fold less than previous

Toronto, University of

140

Genome-wide analysis of noncoding regulatory mutations in cancer.  

PubMed

Cancer primarily develops because of somatic alterations in the genome. Advances in sequencing have enabled large-scale sequencing studies across many tumor types, emphasizing the discovery of alterations in protein-coding genes. However, the protein-coding exome comprises less than 2% of the human genome. Here we analyze the complete genome sequences of 863 human tumors from The Cancer Genome Atlas and other sources to systematically identify noncoding regions that are recurrently mutated in cancer. We use new frequency- and sequence-based approaches to comprehensively scan the genome for noncoding mutations with potential regulatory impact. These methods identify recurrent mutations in regulatory elements upstream of PLEKHS1, WDR74 and SDHD, as well as previously identified mutations in the TERT promoter. SDHD promoter mutations are frequent in melanoma and are associated with reduced gene expression and poor prognosis. The non-protein-coding cancer genome remains widely unexplored, and our findings represent a step toward targeting the entire genome for clinical purposes. PMID:25261935

Weinhold, Nils; Jacobsen, Anders; Schultz, Nikolaus; Sander, Chris; Lee, William

2014-11-01

141

Agaricus bisporus genome sequence: a commentary.  

PubMed

The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. PMID:23558250

Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

2013-06-01

142

The Wellcome Trust Sanger Institute: The Cancer Genome Project  

NSDL National Science Digital Library

Supported by the Wellcome Trust Sanger Institute, the Cancer Genome Project (CGP) "is using the human genome sequence and high throughput mutation detection techniques to identify somatically acquired sequence variants/mutations and hence identify genes critical in the development of human cancers. This initiative will ultimately provide the paradigm for the detection of germline mutations in non-neoplastic human genetic diseases through genome-wide mutation detection approaches." The CGP website links to a number of Data Resources including the Cancer Gene Census, Cancer Cell Line Project, Catalogue of Somatic Mutations in Cancer (reported on in the March 4, 2005 NSDL Scout Report for Life Sciences), Somatic Mutations in Protein Kinase Genes, and more. The site also contains an extensive listing of publications from 1998 to 2004 with links to PubMed Abstracts.

2005-11-11

143

What are we learning from the cancer genome?  

PubMed Central

Massively parallel approaches to nucleic acid sequencing have matured from proof-of-concept to commercial products during the past 5 years. These technologies are now widely accessible, increasingly affordable, and have already exerted a transformative influence on the study of human cancer. Here, we review new features of cancer genomes that are being revealed by large-scale applications of these technologies. We focus on those insights most likely to affect future clinical practice. Foremost among these lessons, we summarize the formidable genetic heterogeneity within given cancer types that is appreciable with higher resolution profiling and larger sample sets. We discuss the inherent challenges of defining driving genomic events in a given cancer genome amidst thousands of other somatic events. Finally, we explore the organizational, regulatory and societal challenges impeding precision cancer medicine based on genomic profiling from assuming its place as standard-of-care. PMID:22965149

Collisson, Eric A.; Cho, Raymond J.; Gray, Joe W.

2013-01-01

144

Defining Genome Project Standards in a New Era of Sequencing  

SciTech Connect

Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished".

Chain, Patrick [DOE-JGI]

2009-05-27

145

Whole-genome sequencing in bacteriology: state of the art  

PubMed Central

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

Dark, Michael J

2013-01-01

146

Genome Sequencing Center Tour Videos and Classroom Activities  

NSDL National Science Digital Library

A video tour of the Washington University Genome Sequencing Center, supplemented by additional films and classroom activities, can help advanced high school students and college undergraduates understand the classical techniques of genome sequencing.

Sarah Elgin (Washington University;)

2010-05-28

147

The Norway spruce genome sequence and conifer genome evolution.  

PubMed

Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

2013-05-30

148

Prostate cancer genomics  

Microsoft Academic Search

The molecular processes contributing to cancer of the human prostate gland are under intensive investigation. Methods used for discovering genetic alterations involved in prostate neoplasia include family studies designed to map hereditary disease loci, chromosomal studies to identify aberrations that may locate oncogenes or tumor suppressor genes, and comprehensive gene expression studies. These studies determine how various molecular signaling pathways

Paul E. Li; Peter S. Nelson

2001-01-01

149

Genome Sequencing Reveals a Phage in Helicobacter pylori  

PubMed Central

Helicobacter pylori chronically infects the gastric mucosa in more than half of the human population; in a subset of this population, its presence is associated with development of severe disease, such as gastric cancer. Genomic analysis of several strains has revealed an extensive H. pylori pan-genome, likely to grow as more genomes are sampled. Here we describe the draft genome sequence (63 contigs; 26× mean coverage) of H. pylori strain B45, isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma. The major finding was a 24.6-kb prophage integrated in the bacterial genome. The prophage shares most of its genes (22/27) with prophage region II of Helicobacter acinonychis strain Sheeba. After UV treatment of liquid cultures, circular DNA carrying the prophage integrase gene could be detected, and intracellular tailed phage-like particles were observed in H. pylori cells by transmission electron microscopy, indicating that phage production can be induced from the prophage. PCR amplification and sequencing of the integrase gene from 341 H. pylori strains from different geographic regions revealed a high prevalence of the prophage (21.4%). Phylogenetic reconstruction showed four distinct clusters in the integrase gene, three of which tended to be specific for geographic regions. Our study implies that phages may play important roles in the ecology and evolution of H. pylori. PMID:22086490

Lehours, Philippe; Vale, Filipa F.; Bjursell, Magnus K.; Melefors, Ojar; Advani, Reza; Glavas, Steve; Guegueniat, Julia; Gontier, Etienne; Lacomme, Sabrina; Alves Matos, Antonio; Menard, Armelle; Megraud, Francis; Engstrand, Lars; Andersson, Anders F.

2011-01-01

150

Initial sequencing and comparative analysis of the mouse genome  

Microsoft Academic Search

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. 
Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

2002-01-01

151

The diploid genome sequence of an Asian individual  

Microsoft Academic Search

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the

Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A. San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang

2008-01-01

152

The genome sequence of Schizosaccharomyces pombe  

Microsoft Academic Search

We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

2002-01-01

153

Draft Genome Sequence of Rubrivivax gelatinosus CBS  

PubMed Central

Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N2 as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H2. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium. PMID:22628496

Hu, Pingsha; Lang, Juan; Wawrousek, Karen; Yu, Jianping; Maness, Pin-Ching

2012-01-01

154

Draft genome sequence of Rubrivivax gelatinosus CBS.  

PubMed

Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N(2) as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H(2). We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium. PMID:22628496

Hu, Pingsha; Lang, Juan; Wawrousek, Karen; Yu, Jianping; Maness, Pin-Ching; Chen, Jin

2012-06-01

155

Complete Genome Sequences of 138 Mycobacteriophages  

PubMed Central

Bacteriophages are the most numerous biological entities in the biosphere, and although their genetic diversity is high, it remains ill defined. Mycobacteriophages—the viruses of mycobacterial hosts—provide insights into this diversity as well as tools for manipulating Mycobacterium tuberculosis. We report here the complete genome sequences of 138 new mycobacteriophages, which—together with the 83 mycobacteriophages previously reported—represent the largest collection of phages known to infect a single common host, Mycobacterium smegmatis mc2 155. PMID:22282335

2012-01-01

156

Genome sequence of Halobacterium species NRC-1  

PubMed Central

We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea. PMID:11016950

Ng, Wailap Victor; Kennedy, Sean P.; Mahairas, Gregory G.; Berquist, Brian; Pan, Min; Shukla, Hem Dutt; Lasky, Stephen R.; Baliga, Nitin S.; Thorsson, Vesteinn; Sbrogna, Jennifer; Swartzell, Steven; Weir, Douglas; Hall, John; Dahl, Timothy A.; Welti, Russell; Goo, Young Ah; Leithauser, Brent; Keller, Kim; Cruz, Randy; Danson, Michael J.; Hough, David W.; Maddocks, Deborah G.; Jablonski, Peter E.; Krebs, Mark P.; Angevine, Christine M.; Dale, Heather; Isenbarger, Thomas A.; Peck, Ronald F.; Pohlschroder, Mechthild; Spudich, John L.; Jung, Kwang-Hwan; Alam, Maqsudul; Freitas, Tracey; Hou, Shaobin; Daniels, Charles J.; Dennis, Patrick P.; Omer, Arina D.; Ebhardt, Holger; Lowe, Todd M.; Liang, Ping; Riley, Monica; Hood, Leroy; DasSarma, Shiladitya

2000-01-01

157

The Predictive Capacity of Personal Genome Sequencing  

PubMed Central

New DNA sequencing methods will soon make it possible to identify all germline variants in any individual at a reasonable cost. However, the ability of whole-genome sequencing to predict predisposition to common diseases in the general population is unknown. To estimate this predictive capacity, we use the concept of a “genometype”. A specific genometype represents the genomes in the population conferring a specific level of genetic risk for a specified disease. Using this concept, we estimated the capacity of whole-genome sequencing to identify individuals at clinically significant risk for 24 different diseases. Our estimates were derived from the analysis of large numbers of monozygotic twin pairs; twins of a pair share the same genometype and therefore identical genetic risk factors. Our analyses indicate that: (i) for 23 of the 24 diseases, the majority of individuals will receive negative test results, (ii) these negative test results will, in general, not be very informative, as the risk of developing 19 of the 24 diseases in those who test negative will still be, at minimum, 50 - 80% of that in the general population, and (iii) on the positive side, in the best-case scenario more than 90% of tested individuals might be alerted to a clinically significant predisposition to at least one disease. These results have important implications for the valuation of genetic testing by industry, health insurance companies, public policy makers and consumers. PMID:22472521

Roberts, Nicholas J.; Vogelstein, Joshua T.; Parmigiani, Giovanni; Kinzler, Kenneth W.; Vogelstein, Bert; Velculescu, Victor E.

2013-01-01

158

The Z curve database: a graphic representation of genome sequences  

Microsoft Academic Search

Motivation: Genome projects for many prokaryotic and eukaryotic species have been completed and more new genome projects are currently underway. The availability of a large number of genomic sequences for researchers creates a need to find graphic tools to study genomes in a perceivable form. The Z curve is one such tool available for visualizing genomes.

Chun-ting Zhang; Ren Zhang; Hong-yu Ou

2003-01-01

159

Identification of ancient remains through genomic sequencing  

PubMed Central

Studies of ancient DNA have been hindered by the preciousness of remains, the small quantities of undamaged DNA accessible, and the limitations associated with conventional PCR amplification. In these studies, we developed and applied a genomewide adapter-mediated emulsion PCR amplification protocol for ancient mammalian samples estimated to be between 45,000 and 69,000 yr old. Using 454 Life Sciences (Roche) and Illumina sequencing (formerly Solexa sequencing) technologies, we examined over 100 megabases of DNA from amplified extracts, revealing unbiased sequence coverage with substantial amounts of nonredundant nuclear sequences from the sample sources and negligible levels of human contamination. We consistently recorded over 500-fold increases, such that nanogram quantities of starting material could be amplified to microgram quantities. Application of our protocol to a 50,000-yr-old uncharacterized bone sample that was unsuccessful in mitochondrial PCR provided sufficient nuclear sequences for comparison with extant mammals and subsequent phylogenetic classification of the remains. The combined use of emulsion PCR amplification and high-throughput sequencing allows for the generation of large quantities of DNA sequence data from ancient remains. Using such techniques, even small amounts of ancient remains with low levels of endogenous DNA preservation may yield substantial quantities of nuclear DNA, enabling novel applications of ancient DNA genomics to the investigation of extinct phyla. PMID:18426903

Blow, Matthew J.; Zhang, Tao; Woyke, Tanja; Speller, Camilla F.; Krivoshapkin, Andrei; Yang, Dongya Y.; Derevianko, Anatoly; Rubin, Edward M.

2008-01-01

160

Environmental exposures and mutational patterns of cancer genomes  

Microsoft Academic Search

The etiology of most human cancers is unknown. Genetic inheritance and environmental factors are thought to have major roles, and for some types of cancer, exposure to carcinogens is a proven mechanism leading to tumorigenesis. Sequencing of entire cancer genomes has not only begun to provide clues regarding functionally relevant mutations, but has also paved the way towards understanding the

Gerd P Pfeifer

2010-01-01

161

The Consensus Coding Sequences of Human Breast and Colorectal Cancers  

Microsoft Academic Search

The elucidation of the human genome sequence has made it possible to identify genetic alterations in cancers in unprecedented detail. To begin a systematic analysis of such alterations, we determined the sequence of well-annotated human protein-coding genes in two common tumor types. Analysis of 13,023 genes in 11 breast and 11 colorectal cancers revealed that individual tumors accumulate an average

Tobias Sjöblom; Siân Jones; Laura D. Wood; D. Williams Parsons; Jimmy Lin; Thomas D. Barber; Diana Mandelker; Rebecca J. Leary; Janine Ptak; Natalie Silliman; Steve Szabo; Phillip Buckhaults; Christopher Farrell; Paul Meeh; Sanford D. Markowitz; Joseph Willis; Dawn Dawson; James K. V. Willson; Adi F. Gazdar; James Hartigan; Leo Wu; Changsheng Liu; Giovanni Parmigiani; Ben Ho Park; Kurtis E. Bachman; Nickolas Papadopoulos; Bert Vogelstein; Kenneth W. Kinzler; Victor E. Velculescu

2006-01-01

162

Whole genome sequencing of matched primary and metastatic acral melanomas  

PubMed Central

Next generation sequencing has enabled systematic discovery of mutational spectra in cancer samples. Here, we used whole genome sequencing to characterize somatic mutations and structural variation in a primary acral melanoma and its lymph node metastasis. Our data show that the somatic mutational rates in this acral melanoma sample pair were more comparable to the rates reported in cancer genomes not associated with mutagenic exposure than in the genome of a melanoma cell line or the transcriptome of melanoma short-term cultures. Despite the perception that acral skin is sun-protected, the dominant mutational signature in these samples is compatible with damage due to ultraviolet light exposure. A nonsense mutation in ERCC5 discovered in both the primary and metastatic tumors could also have contributed to the mutational signature through accumulation of unrepaired dipyrimidine lesions. However, evidence of transcription-coupled repair was suggested by the lower mutational rate in the transcribed regions and expressed genes. The primary and the metastasis are highly similar at the level of global gene copy number alterations, loss of heterozygosity and single nucleotide variation (SNV). Furthermore, the majority of the SNVs in the primary tumor were propagated in the metastasis and one nonsynonymous coding SNV and one splice site mutation appeared to arise de novo in the metastatic lesion. PMID:22183965

Turajlic, Samra; Furney, Simon J.; Lambros, Maryou B.; Mitsopoulos, Costas; Kozarewa, Iwanka; Geyer, Felipe C.; MacKay, Alan; Hakas, Jarle; Zvelebil, Marketa; Lord, Christopher J.; Ashworth, Alan; Thomas, Meirion; Stamp, Gordon; Larkin, James; Reis-Filho, Jorge S.; Marais, Richard

2012-01-01

163

Why Assembling Plant Genome Sequences Is So Challenging  

PubMed Central

In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

Claros, Manuel Gonzalo; Bautista, Rocio; Guerrero-Fernandez, Dario; Benzerki, Hicham; Seoane, Pedro; Fernandez-Pozo, Noe

2012-01-01

164

CancerGenes: a gene selection resource for cancer genome projects  

Microsoft Academic Search

The genome sequence framework provided by the human genome project allows us to precisely map human genetic variations in order to study their association with disease and their direct effects on gene function. Since the description of tumor suppressor genes and oncogenes several decades ago, both germ-line variations and somatic mutations have been established to be important in cancer—in terms

Maureen E. Higgins; Martine Claremont; John E. Major; Chris Sander; Alex E. Lash

2007-01-01

165

Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements  

PubMed Central

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under consideration. Furthermore, the linear order of these segments may be shuffled among genomes. We present methods for identification and alignment of conserved genomic DNA in the presence of rearrangements and horizontal transfer. Our methods have been implemented in a software package called Mauve. Mauve has been applied to align nine enterobacterial genomes and to determine global rearrangement structure in three mammalian genomes. We have evaluated the quality of Mauve alignments and drawn comparison to other methods through extensive simulations of genome evolution. PMID:15231754

Darling, Aaron C.E.; Mau, Bob; Blattner, Frederick R.; Perna, Nicole T.

2004-01-01

166

Morphology and genome sequence of phage φ1402  

PubMed Central

Phages are among the simplest biological entities known and simultaneously the most numerous and ubiquitous members of the biosphere. Among the three families of tailed dsDNA phages, the Myoviridae have the most structurally sophisticated tails which are capable of contraction, unlike the simpler tails of the Podoviridae and Siphoviridae. Such “nanomachines” tails are involved in both efficient phage adsorption and genome injection. Their structural complexity probably necessitates multistep morphogenetic pathways, involving non-structural components, to correctly assemble the structural constituents. For reasons probably related, at least in part, to such morphological intricacy, myoviruses tend to have larger genomes than simpler phages. As a consequence, there are no well-characterized myoviruses with a size of less than 40 kb. Here we report on the characterization and sequencing of the 23,931 bp genome of the dwarf myovirus φ1402 of Bdellovibrio bacteriovorus. Our genomic analysis shows that φ1402 differs substantially from all other known phages and appears to be the smallest known autonomous myovirus. PMID:22164347

Ackermann, Hans-W; Krisch, Henry M

2011-01-01

167

Genome Sequence of the Pea Aphid Acyrthosiphon pisum  

E-print Network

Genome Sequence of the Pea Aphid Acyrthosiphon pisum The International Aphid Genomics Consortium we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple

Paris-Sud XI, Université de

168

Ten years of bacterial genome sequencing: comparative-genomics-based discoveries  

Microsoft Academic Search

It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

2006-01-01

169

Initial sequencing and comparative analysis of the mouse genome  

SciTech Connect

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

2002-12-15

170

High-throughput sequencing of the melanoma genome.  

PubMed

Next-generation sequencing technologies are now common for whole-genome, whole-exome and whole-transcriptome sequencing (RNA-seq) of tumors to identify point mutations, structural or copy number alterations and changes in gene expression. A substantial number of studies have already been performed for melanoma. One study analysed eight melanoma cell lines with RNA-Seq technology and identified 11 novel melanoma gene fusions. Whole-exome sequencing of seven melanoma cell lines identified overlapping gain of function mutations in MAP2K1 (MEK1) and MAP2K2 (MEK2) genes. Integrative sequencing of cutaneous melanoma metastases using different sequencing platforms revealed a new somatic point mutation in HRAS and a structural rearrangement affecting CDKN2C (a CDK4 inhibitor). These latter sequencing-based discoveries may be used to motivate the inclusion of the affected patients into clinical trials with specific signalling pathway inhibitors. Taken together, we are at the beginning of an era with new sequencing technologies providing a more comprehensive view of cancer mutational landscapes and thereby a better understanding of their pathogenesis. This will also open interesting perspectives for new treatment approaches and clinical trial designs. PMID:23174022

Kunz, Manfred; Dannemann, Michael; Kelso, Janet

2013-01-01

171

The Jackson Laboratory: The Mouse Genome Sequence Project  

NSDL National Science Digital Library

Part of the Mouse Genome Informatics program (last reported on in the NSDL Scout Report for the Life Sciences on March 19, 2004) at the Jackson Laboratory, this website presents The Mouse Genome Sequence (MGS) project. MGS is designed "to integrate emerging mouse genomic sequence data with the genetic and biological data available in MGD and GXD." The site links to Eukaryotic Genome Annotation Projects, as well as Sequence Analysis Tools including MouseBlast and Genome Analysis. The site also offers basic background information about the Mouse Genome Sequencing Initiative, and provides site users with access to groups involved in mouse genome sequencing, the BAC clone library, request forms for targeted sequencing, and more.

172

Proposal Article A Proposal to Sequence Genomes of Unique  

E-print Network

species would define a new paradigm for gerontological research. For example, nearly all vertebrates sequencing of genomes, including the human genome, has revolutionized biomedical research. The projected genome sequencing of a large number of mammalian species, such as several primates, will offer extra

de Magalhães, João Pedro

173

Human genome mapping and sequencing: perspectives for toxicology  

Microsoft Academic Search

Until recently, the human genome programs were mainly directed towards the development of maps to identify disease genes. Three major tools: the genetic map, the physical map and the gene map are presently available. The human genome project is now progressively shifting to massive sequencing although sequence-ready maps are not yet available for a large part of the human genome.

Jean Weissenbach

1998-01-01

174

Comprehensive transcriptome analysis with the Genome Sequencer FLX System  

E-print Network

-deletions and chromosomal rearrangements. When investigating unsequenced organisms or mapping back to the human genome Comprehensive transcriptome analysis with the Genome Sequencer FLX System Protein isoforms make just 200 ng of RNA as sample input, the Genome Sequencer FLX System offers a powerful solution to help

Cai, Long

175

Optimizing the BAC-End Strategy for Sequencing the Human Genome  

E-print Network

Optimizing the BAC-End Strategy for Sequencing the Human Genome Richard M. Karp Ron Shamir April 24, 1999 Abstract The rapid increase in human genome sequencing effort and the emergence University, Tel Aviv, 69978, Israel. 1 Introduction With the Human Genome Project moving from the map

Shamir, Ron

176

A snapshot of the emerging tomato genome sequence  

Microsoft Academic Search

The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial

L. A. Mueller; R. M. Klein Lankhorst; S. D. Tanksley; R. M. Peters; Staveren van M. J; E. Datema; M. W. E. J. Fiers; Ham van R. C. H. J; D. Szinay; Jong de J. H. S. G. M; N. Menda; I. Y. Tecle; A. Bombarely; S. Stack; S. M. Royer; S.-B. Chang; L. A. Shearer; B. D. Kim; S.-H. Jo; C.-G. Hur; D. Choi; C.-B. Li; J. Zhao; H. Jiang; Y. Geng; Y. Dai; H. Fan; J. Chen; F. Lu; J. Shi; S. Sun; X. Yang; C. Lu; M. Chen; Z. Cheng; H. Ling; Y. Xue; Y. Wang; G. B. Seymour; G. J. Bishop; G. Bryan; J. Rogers; S. Sims; S. Butcher; D. Buchan; J. Abbott; H. Beasley; C. Nicholson; C. Riddle; S. Humphray; K. McLaren; S. Mathur; S. Vyas; A. U. Solanke; R. Kumar; V. Gupta; A. K. Sharma; P. Khurana; J. P. Khurana; A. Tyagi; Sarita; P. Chowdhury; S. Shridhar; D. Chattopadhyay; A. Pandit; P. Singh; A. Kumar; R. Dixit; A. Singh; S. Praveen; V. Dalal; M. Yadav; I. A. Ghazi; K. Gaikwad; T. R. Sharma; T. Mohapatra; N. K. Singh; H. de Jong; S. Peters; M. van Staveren; R. C. H. J. van Ham; P. Lindhout; M. Philippot; P. Frasse; F. Regad; M. Zouine; M. Bouzayen; E. Asamizu; S. Sato; H. Fukuoka; S. Tabata; D. Shibata; M. A. Botella; M. Perez-Alonso; V. Fernandez-Pedrosa; S. Osorio; A. Mico; A. Granell; Z. Zhang; J. He; S. Huang; Y. Du; D. Qu; L. Liu; D. Liu; J. Wang; Z. Ye; W. Yang; G. Wang; A. Vezzi; S. Todesco; G. Valle; G. Falcone; M. Pietrella; G. Giuliano; S. Grandillo; A. Traini; N. D'Agostino; M. L. Chiusano; M. Ercolano; A. Barone; L. Frusciante; H. Schoof; A. Jocker; R. Bruggmann; M. Spannagl; K. X. F. Mayer; R. Guigo; F. Camara; S. Rombauts; J. A. Fawcett; Y. Van de Peer; S. Knapp; D. Zamir; W. Stiekema

2009-01-01

177

Draft Genome Sequence of Geotrichum candidum Strain 3C  

PubMed Central

We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

Bobrov, Kirill S.; Eneyskaya, Elena V.; Kulminskaya, Anna A.

2014-01-01

178

Draft Genome Sequence of Geotrichum candidum Strain 3C.  

PubMed

We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

Polev, Dmitrii E; Bobrov, Kirill S; Eneyskaya, Elena V; Kulminskaya, Anna A

2014-01-01

179

Mapping the Human Reference Genome's Missing Sequence by Three-Way Admixture in Latino Genomes  

E-print Network

ARTICLE Mapping the Human Reference Genome's Missing Sequence by Three-Way Admixture in Latino on next-generation sequencing, utilize physical maps of the human genome's sequence to interpret. McCarroll1,2,3,* A principal obstacle to completing maps and analyses of the human genome involves

McCarroll, Steve

180

Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis  

Microsoft Academic Search

The 4 202 353 bp genome of the alkaliphilic bacterium Bacillus halodurans C-125 contains 4066 predicted protein coding sequences (CDSs), 2141 (52.7%) of which have functional assignments, 1182 (29%) of which are conserved CDSs with unknown function and 743 (18.3%) of which have no match to any protein database. Among the total CDSs, 8.8% match sequences of proteins found only

Hideto Takami; Kaoru Nakasone; Yoshihiro Takaki; Go Maeno; Rumie Sasaki; Noriaki Masui; Fumie Fuji; Chie Hirama; Yuka Nakamura; Naotake Ogasawara; Satoru Kuhara; Koki Horikoshi

2000-01-01

181

Evaluation of Genome Sequencing Quality in Selected Plant Species Using Expressed Sequence Tags  

PubMed Central

Background: With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing, enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated. Methodology/Principal Finding: Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is presented by the ratio of chromosome size and genome size (or between scaffold size and genome size), which ranged from 55.31% to nearly 100%. The accuracy of genome sequence was presented by the ratio between matched EST and selected ESTs where 52.93% to 98.28% and 89.02% to 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influence genome sequence assembly. Conclusion: The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published. PMID:23922843

Shangguan, Lingfei; Han, Jian; Kayesh, Emrul; Sun, Xin; Zhang, Changqing; Pervaiz, Tariq; Wen, Xicheng; Fang, Jinggui

2013-01-01
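
The two quality measures described in the abstract above are simple ratios, so a short sketch may make them concrete. This is only an illustration under stated assumptions: the function names and the input numbers are hypothetical and are not taken from the paper, which does not publish code in this record.

```python
def integrity_percent(assembled_bp, genome_bp):
    """Integrity: assembled chromosome (or scaffold) size over estimated genome size, in percent."""
    if genome_bp <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * assembled_bp / genome_bp


def accuracy_percent(matched_ests, selected_ests):
    """Accuracy: ESTs mapped to the assembly over ESTs randomly selected, in percent."""
    if selected_ests <= 0:
        raise ValueError("number of selected ESTs must be positive")
    return 100.0 * matched_ests / selected_ests


# Hypothetical inputs chosen only to illustrate the arithmetic.
print(round(integrity_percent(110_620_000, 200_000_000), 2))
print(round(accuracy_percent(9_828, 10_000), 2))
```

Keeping the two ratios as separate functions mirrors the abstract, which reports integrity and accuracy as independent measures.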

182

Complete genome sequence of Methanoculleus marisnigri type strain JR1  

SciTech Connect

Methanoculleus marisnigri Romesser et al. 1981 is a methanogen belonging to the order Methanomicrobiales within the archaeal phylum Euryarchaeota. The type strain, JR1, was isolated from anoxic sediments of the Black Sea. M. marisnigri is of phylogenetic interest because at the time the sequencing project began only one genome had previously been sequenced from the order Methanomicrobiales. We report here the complete genome sequence of M. marisnigri type strain JR1 and its annotation. This is part of a Joint Genome Institute 2006 Community Sequencing Program to sequence genomes of diverse Archaea.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Sieprawska-Lupa, Magdalena [University of Georgia, Athens, GA; Goltsman, Eugene [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Dalin, Eileen [U.S. Department of Energy, Joint Genome Institute; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Hauser, Loren John [ORNL; Land, Miriam L [ORNL; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Richardson, P M [U.S. Department of Energy, Joint Genome Institute; Whitman, W. B. [University of Georgia, Athens, GA; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

2009-01-01

183

Genomic Sequence Comparisons, 1987-2003 Final Report  

SciTech Connect

This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

George M. Church

2004-07-29

184

Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability  

PubMed Central

Genomic instability is a hallmark of human cancers, including the 5% caused by human papillomavirus (HPV). Here we report a striking association between HPV integration and adjacent host genomic structural variation in human cancer cell lines and primary tumors. Whole-genome sequencing revealed HPV integrants flanking and bridging extensive host genomic amplifications and rearrangements, including deletions, inversions, and chromosomal translocations. We present a model of “looping” by which HPV integrant-mediated DNA replication and recombination may result in viral–host DNA concatemers, frequently disrupting genes involved in oncogenesis and amplifying HPV oncogenes E6 and E7. Our high-resolution results shed new light on a catastrophic process, distinct from chromothripsis and other mutational processes, by which HPV directly promotes genomic instability. PMID:24201445

Akagi, Keiko; Li, Jingfeng; Broutian, Tatevik R.; Padilla-Nash, Hesed; Xiao, Weihong; Jiang, Bo; Rocco, James W.; Teknos, Theodoros N.; Kumar, Bhavna; Wangsa, Danny; He, Dandan; Ried, Thomas; Symer, David E.; Gillison, Maura L.

2014-01-01

185

Nanoscale Biomarkers for Cancer Genomics and Proteomics  

Microsoft Academic Search

We are facing tremendous opportunities and challenges in combining emerging nanotechnology with genomic signal processing techniques in developing faster, smaller, yet more accurate and sensitive biomedical devices in cancer genomics and proteomics. The goal is to better understand the cancer mechanisms at the cellular and even subcellular levels. Nanotechnology has been applied to study the dynamic processes of individual cells,

Jie Zeng; Jie Chen; Xiaoping Wang; Jianguo Hou; Stephen T. C. Wong

2006-01-01

186

Prostate cancer genomics: towards a new understanding  

Microsoft Academic Search

Recent genetics and genomics studies of prostate cancer have helped to clarify the genetic basis of this common but complex disease. Genome-wide studies have detected numerous variants associated with disease as well as common gene fusions and expression 'signatures' in prostate tumours. On the basis of these results, some advocate gene-based individualized screening for prostate cancer, although such testing might

John S. Witte

2009-01-01

187

Genome sequencing of the important oilseed crop Sesamum indicum L  

PubMed Central

The Sesame Genome Working Group (SGWG) has been formed to sequence and assemble the sesame (Sesamum indicum L.) genome. The status of this project and our planned analyses are described. PMID:23369264

2013-01-01

188

Draft Genome Sequence of Bacillus amyloliquefaciens B-1895.  

PubMed

In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

Karlyshev, Andrey V; Melnikov, Vyacheslav G; Chistyakov, Vladimir A

2014-01-01

189

Draft Genome Sequence of Bacillus amyloliquefaciens B-1895  

PubMed Central

In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

2014-01-01

190

Initial impact of the sequencing of the human genome  

E-print Network

The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.; Lander, Eric S.

191

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

2000-01-01

192

First complete genome sequence of infectious laryngotracheitis virus  

PubMed Central

Background: Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine the complete genomic sequence of a live attenuated vaccine strain of ILTV. Results: The complete genomic sequence of the Serva vaccine strain of ILTV was determined, annotated and compared to the concatenated ILTV reference sequence. The genome size of the Serva strain was 152,628 bp, with a G + C content of 48%. A total of 80 predicted open reading frames were identified. The Serva strain had 96.5% DNA sequence identity with the concatenated ILTV sequence. Notably, the concatenated ILTV sequence was found to lack four large regions of sequence, including 528 bp and 594 bp of sequence in the UL29 and UL36 genes, respectively, and two copies of a 1,563 bp sequence in the repeat regions. Considerable differences in the size of the predicted translation products of 4 other genes (UL54, UL30, UL37 and UL38) were also identified. More than 530 single-nucleotide polymorphisms (SNPs) were identified. Most SNPs were located within three genomic regions, corresponding to sequence from the SA-2 ILTV vaccine strain in the concatenated ILTV sequence. Conclusions: This is the first complete genomic sequence of an individual ILTV strain. This sequence will facilitate future comparative genomic studies of ILTV by providing an appropriate reference sequence for the sequence analysis of other ILTV strains. PMID:21501528

2011-01-01

193

Spectrum-Based De Novo Repeat Detection in Genomic Sequences  

Microsoft Academic Search

ABSTRACT A novel approach to the detection of genomic repeats is presented in this paper. The technique, dubbed SAGRI (Spectrum Assisted Genomic Repeat Identifier), is based on the spectrum (set of sequence k-mers, for some k) of the genomic sequence. Specifically, the genome is scanned twice. The first scan (FindHit) detects candidate pairs of repeat-segments, by effectively reconstructing portions of the Euler path of

Huy Hoang Do; Kwok Pui Choi; Franco P. Preparata; Wing-kin Sung; Louxin Zhang

2008-01-01
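
The notion of a sequence spectrum used in the abstract above (the set of k-mers of a sequence, for some k) can be sketched in a few lines. This is a minimal illustration only, not the SAGRI algorithm: it does not reconstruct an Euler path or perform the FindHit scan, and the function names and toy sequence are assumptions made for the example.

```python
from collections import defaultdict


def kmer_positions(sequence, k):
    """Map each k-mer of the sequence to the list of start positions where it occurs."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return dict(positions)


def repeated_kmers(sequence, k):
    """Keep only the k-mers seen more than once; these are naive seeds for repeat candidates."""
    return {kmer: pos for kmer, pos in kmer_positions(sequence, k).items() if len(pos) > 1}


toy = "ACGTACGTTT"                      # hypothetical toy sequence
spectrum = set(kmer_positions(toy, 4))  # the spectrum: the set of distinct 4-mers
repeats = repeated_kmers(toy, 4)
print(repeats)  # {'ACGT': [0, 4]}
```

A k-mer that occurs at two or more positions only hints at a repeat; a real detector would still need to extend and verify such seeds.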

194

Draft genome sequence of the coccolithovirus EhV-84  

PubMed Central

The Coccolithoviridae is a recently discovered group of viruses that infect the marine coccolithophorid Emiliania huxleyi. Emiliania huxleyi virus 84 (EhV-84) has a 160–180 nm diameter icosahedral structure and a genome of approximately 400 kbp. Here we describe the structural and genomic features of this virus, together with a near complete draft genome sequence (~99%) and its annotation. This is the fourth genome sequence of a member of the coccolithovirus family. PMID:22180805

Nissimov, Jozef I.; Worthy, Charlotte A.; Rooks, Paul; Napier, Johnathan A.; Kimmance, Susan A.; Henn, Matthew R; Ogata, Hiroyuki; Allen, Michael J.

2011-01-01

195

Next-generation sequencing strategies for characterizing the turkey genome.  

PubMed

The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry. PMID:24570472

Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

2014-02-01

196

Genetics and genomics of prostate cancer  

PubMed Central

Prostate cancer (PCa) is one of the most common malignancies in the world with over 890 000 cases and over 258 000 deaths worldwide each year. Nearly all mortalities from PCa are due to metastatic disease, typically through tumors that evolve to be hormone-refractory or castrate-resistant. Despite intensive epidemiological study, there are few known environmental risk factors, and age and family history are the major determinants. However, there is extreme heterogeneity in PCa incidence worldwide, suggesting that major determining factors have not been described. Genome-wide association studies have been performed and a considerable number of significant, but low-risk loci have been identified. In addition, several groups have analyzed PCa by determination of genomic copy number, fusion gene generation and targeted resequencing of candidate genes, as well as exome and whole genome sequencing. These initial studies have examined both primary and metastatic tumors as well as murine xenografts and identified somatic alterations in TP53 and other potential driver genes, and the disturbance of androgen response and cell cycle pathways. It is hoped that continued characterization of risk factors as well as gene mutation and misregulation in tumors will aid in understanding, diagnosing and better treating PCa. PMID:23564043

Dean, Michael; Lou, Hong

2013-01-01

197

Toward a Comprehensive Genomic Analysis of Cancer  

Cancer.gov

The National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI) convened a "Toward a Comprehensive Genomic Analysis of Cancer" workshop in Washington, D.C. This workshop brought together physicians, basic scientists and other members of the U.S. and international cancer communities to assist in outlining the most effective strategies for the development of a successful project. Information about this workshop is reported in the Executive Summary.

198

A sequence-based survey of the complex structural organization of tumor genomes  

SciTech Connect

The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

2008-04-03

199

Selection to sequence: opportunities in fungal genomics  

SciTech Connect

Selection is a biological force, causing genotypic and phenotypic change over time. Whether environmental or human induced, selective pressures shape the genotypes and the phenotypes of organisms both in nature and in the laboratory. In nature, selective pressure is highly dynamic and the sum of the environment and other organisms. In the laboratory, selection is used in genetic studies and industrial strain development programs to isolate mutants affecting biological processes of interest to researchers. Selective pressures are important considerations for fungal biology. In the laboratory a number of fungi are used as experimental systems to study a wide range of biological processes and in nature fungi are important pathogens of plants and animals and play key roles in carbon and nitrogen cycling. The continued development of high throughput sequencing technologies makes it possible to characterize at the genomic level, the effect of selective pressures both in the lab and in nature for filamentous fungi as well as other organisms.

Baker, Scott E.

2009-12-01

200

The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now  

PubMed Central

The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639

Engel, Stacia R.; Dietrich, Fred S.; Fisk, Dianna G.; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C.; Dwight, Selina S.; Hitz, Benjamin C.; Karra, Kalpana; Nash, Robert S.; Weng, Shuai; Wong, Edith D.; Lloyd, Paul; Skrzypek, Marek S.; Miyasato, Stuart R.; Simison, Matt; Cherry, J. Michael

2014-01-01

201

Open-Access Cancer Genomics Tools: the UCSC Cancer Genomics Browser  

Cancer.gov

The completion of the Human Genome Project sparked a revolution in high-throughput genomics applied towards deciphering genetically complex diseases, like cancer. Now, almost 10 years later, we have a mountain of genomics data on many different cancer types and subtypes that is rapidly expanding.

202

The translation of cancer genomics: time for a revolution in clinical cancer care.  

PubMed

The introduction of next-generation sequencing technologies has dramatically impacted the life sciences, perhaps most profoundly in the area of cancer genomics. Clinical applications of next-generation sequencing and associated methods are emerging from ongoing large-scale discovery projects that have catalogued hundreds of genes as having a role in cancer susceptibility, onset and progression. For example, discovery cancer genomics has confirmed that many of the same genes are altered by mutation, copy number gain or loss, or structural variation across multiple tumor types, resulting in a gain or loss of function that likely contributes to cancer development in these tissues. Beyond these frequently mutated genes, we now know there is a 'long tail' of less frequently mutated, but probably important, genes that play roles in cancer onset or progression. Here, I discuss some of the remaining barriers to clinical translation, and look forward to new applications of these technologies in cancer care. PMID:25031616

Mardis, Elaine R

2014-01-01

203

The Roche Cancer Genome Database 2.0  

Microsoft Academic Search

Background: Cancer is a disease of genome alterations that arise through the acquisition of multiple somatic DNA sequence mutations. Some of these mutations can be critical for the development of a tumor and can be useful to characterize tumor types or predict outcome. Description: We have constructed an integrated biological information system termed the Roche Cancer Genome Database (RCGDB) combining different human

Jan Küntzer; Daniela Maisel; Hans-Peter Lenhof; Stefan Klostermann; Helmut Burtscher

2011-01-01

204

Genome scanning : an AFM-based DNA sequencing technique  

E-print Network

Genome Scanning is a powerful new technique for DNA sequencing. The method presented in this thesis uses an atomic force microscope with a functionalized cantilever tip to sequence single stranded DNA immobilized to a mica ...

Elmouelhi, Ahmed (Ahmed M.), 1979-

2003-01-01

205

Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes  

PubMed Central

A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

2013-01-01

206

Translational genomics in cancer research: converting profiles into personalized cancer medicine  

PubMed Central

Cancer genomics is a rapidly growing discipline in which the genetic molecular basis of malignancy is studied at the scale of whole genomes. While the discipline has been successful with respect to identifying specific oncogenes and tumor suppressors involved in oncogenesis, it is also challenging our approach to managing patients suffering from this deadly disease. Specifically cancer genomics is driving clinical oncology to take a more molecular approach to diagnosis, prognostication, and treatment selection. We review here recent work undertaken in cancer genomics with an emphasis on translation of genomic findings. Finally, we discuss scientific challenges and research opportunities emerging from findings derived through analysis of tumors with high-depth sequencing. PMID:24349831

Patel, Lalit; Parker, Brittany; Yang, Da; Zhang, Wei

2013-01-01

207

Integrative genomic approaches to understanding cancer  

Microsoft Academic Search

Further advances in the prevention, diagnosis and treatment of cancer require a more complete knowledge of the molecular mechanisms that program the malignant state. Until recently, identifying and validating genetic alterations in tumors that contribute to cancer involved painstaking efforts focused primarily on single mutations. However, the application of whole genome approaches to the study of cancer now makes it

William C. Hahn; Ian F. Dunn; So Young Kim; Anna C. Schinzel; Ron Firestein; Isil Guney; Jesse S. Boehm

2009-01-01

208

A new workflow for whole-genome sequencing of single human cells.  

PubMed

Unbiased whole-genome amplification (WGA) of single cells is crucial to study cancer evolution and genetic heterogeneity, but is challenging due to the high complexity of the human genome. Here, we present a new workflow combining an efficient adapter-linker PCR-based WGA method with second-generation sequencing. This approach allows comparison of single cells at base pair resolution. Amplification recovered up to 74% of the human genome. Copy-number variants and loss of heterozygosity detected in single cell genomes showed concordance of up to 99% to pooled genomic DNA. Allele frequencies of mutations could be determined accurately due to an allele dropout rate of only 2%, clearly demonstrating the low bias of our PCR-based WGA approach. Sequencing with paired-end reads allowed genome-wide analysis of structural variants. By direct comparison to other WGA methods, we further endorse its suitability to analyze genetic heterogeneity. PMID:25066732

Binder, Vera; Bartenhagen, Christoph; Okpanyi, Vera; Gombert, Michael; Moehlendick, Birte; Behrens, Bianca; Klein, Hans-Ulrich; Rieder, Harald; Ida Krell, Pina Fanny; Dugas, Martin; Stoecklein, Nikolas Hendrik; Borkhardt, Arndt

2014-10-01

209

Unravelling the genomic targets of small molecules using high-throughput sequencing.  

PubMed

Small molecules - including various approved and novel cancer therapeutics - can operate at the genomic level by targeting the DNA and protein components of chromatin. Emerging evidence suggests that functional interactions between small molecules and the genome are non-stochastic and are influenced by a dynamic interplay between DNA sequences and chromatin states. The establishment of genome-wide maps of small-molecule targets using unbiased methodologies can help to characterize and exploit drug responses. In this Review, we discuss how high-throughput sequencing strategies, such as ChIP-seq (chromatin immunoprecipitation followed by sequencing) and Chem-seq (chemical affinity capture and massively parallel DNA sequencing), are enabling the comprehensive identification of small-molecule target sites throughout the genome, thereby providing insights into unanticipated drug effects. PMID:25311424

Rodriguez, Raphaël; Miller, Kyle M

2014-12-01

210

Genome Sequence of Brevibacillus laterosporus Strain GI-9  

PubMed Central

We report the 5.18-Mb genome sequence of Brevibacillus laterosporus strain GI-9, isolated from a subsurface soil sample during a screen for novel strains producing antimicrobial compounds. The draft genome of this strain will aid in biotechnological exploitation and comparative genomics of Brevibacillus laterosporus strains. PMID:22328768

Sharma, Vikas; Singh, Pradip K.; Midha, Samriti; Ranjan, Manish

2012-01-01

211

Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains  

PubMed Central

We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. The clarithromycin (CLR)-resistant strains were obtained by exposing H. pylori strain 26695 to low clarithromycin concentrations on agar plates. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro. PMID:24233587

Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon

2013-01-01

212

Genome sequence of Brevibacillus laterosporus strain GI-9.  

PubMed

We report the 5.18-Mb genome sequence of Brevibacillus laterosporus strain GI-9, isolated from a subsurface soil sample during a screen for novel strains producing antimicrobial compounds. The draft genome of this strain will aid in biotechnological exploitation and comparative genomics of Brevibacillus laterosporus strains. PMID:22328768

Sharma, Vikas; Singh, Pradip K; Midha, Samriti; Ranjan, Manish; Korpole, Suresh; Patil, Prabhu B

2012-03-01

213

Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective  

PubMed Central

Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

2011-01-01

214

Accurate whole human genome sequencing using reversible terminator chemistry  

Microsoft Academic Search

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

2008-01-01

215

Whole-genome sequencing and variant discovery in C. elegans  

Microsoft Academic Search

Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the data and to evaluate coverage

LaDeana W Hillier; Gabor T Marth; Aaron R Quinlan; David Dooling; Ginger Fewell; Derek Barnett; Paul Fox; Jarret I Glasscock; Matthew Hickenbotham; Weichun Huang; Vincent J Magrini; Ryan J Richt; Sacha N Sander; Donald A Stewart; Michael Stromberg; Eric F Tsung; Todd Wylie; Tim Schedl; Richard K Wilson; Elaine R Mardis

2008-01-01

216

Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes.  

PubMed

A principal obstacle to completing maps and analyses of the human genome involves the genome's "inaccessible" regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)-a substantial fraction of the human genome's remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

Genovese, Giulio; Handsaker, Robert E; Li, Heng; Kenny, Eimear E; McCarroll, Steven A

2013-09-01

217

On the current status of Phakopsora pachyrhizi genome sequencing.  

PubMed

Recent advances in sequencing technologies and bioinformatics allow more rapid access to the genomes of non-model organisms at falling costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

2014-01-01

218

Initial sequencing and analysis of the human genome  

Microsoft Academic Search

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

2001-01-01

219

Full Genome Sequence of Giant Panda Rotavirus Strain CH-1.  

PubMed

We report here the complete genomic sequence of the giant panda rotavirus strain CH-1. This work is the first to document the complete genomic sequence (segments 1 to 11) of the CH-1 strain, which offers an effective platform for providing authentic research experiences to novice scientists. PMID:23469354

Guo, Ling; Yan, Qigui; Yang, Shaolin; Wang, Chengdong; Chen, Shijie; Yang, Xiaonong; Hou, Rong; Quan, Zifang; Hao, Zhongxiang

2013-01-01

220

Draft Genome Sequence of Kocuria rhizophila P7-4  

PubMed Central

We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

2011-01-01

221

Draft Genome Sequence of Raoultella planticola, Isolated from River Water  

PubMed Central

We isolated Raoultella planticola from a river water sample, which was phenotypically indistinguishable from Escherichia coli on MI agar. The genome sequence of R. planticola was determined to gain information about its metabolic functions contributing to its false positive appearance of E. coli on MI agar. We report the first whole genome sequence of Raoultella planticola. PMID:25323725

Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

2014-01-01

222

Genome Sequence of the Nonpathogenic Pseudomonas aeruginosa Strain ATCC 15442  

PubMed Central

Pseudomonas aeruginosa ATCC 15442 is an environmental strain of the Pseudomonas genus. Here, we present a 6.77-Mb assembly of its genome sequence. Besides giving insights into characteristics associated with the pathogenicity of P. aeruginosa, such as virulence, drug resistance, and biofilm formation, the genome sequence may provide some information related to biotechnological utilization of the strain. PMID:24786961

Wang, Yujiao; Li, Chao; Ma, Cuiqing; Xu, Ping

2014-01-01

223

Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages  

PubMed Central

Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

Sheflo, Michael A.; Gardner, Adam V.; Merrill, Bryan D.; Fisher, Joshua N. B.; Lunt, Bryce L.; Breakwell, Donald P.; Grose, Julianne H.

2013-01-01

224

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.  

PubMed

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

225

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii  

PubMed Central

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

226

Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis  

PubMed Central

Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.

2013-01-01

227

A remarkably simple genome underlies highly malignant pediatric rhabdoid cancers  

PubMed Central

Cancer is principally considered a genetic disease, and numerous mutations are thought essential to drive its growth. However, the existence of genomically stable cancers and the emergence of mutations in genes that encode chromatin remodelers raise the possibility that perturbation of chromatin structure and epigenetic regulation are capable of driving cancer formation. Here we sequenced the exomes of 35 rhabdoid tumors, highly aggressive cancers of early childhood characterized by biallelic loss of SMARCB1, a subunit of the SWI/SNF chromatin remodeling complex. We identified an extremely low rate of mutation, with loss of SMARCB1 being essentially the sole recurrent event. Indeed, in 2 of the cancers there were no other identified mutations. Our results demonstrate that high mutation rates are dispensable for the genesis of cancers driven by mutation of a chromatin remodeling complex. Consequently, cancer can be a remarkably genetically simple disease. PMID:22797305

Lee, Ryan S.; Stewart, Chip; Carter, Scott L.; Ambrogio, Lauren; Cibulskis, Kristian; Sougnez, Carrie; Lawrence, Michael S.; Auclair, Daniel; Mora, Jaume; Golub, Todd R.; Biegel, Jaclyn A.; Getz, Gad; Roberts, Charles W.M.

2012-01-01

228

Unexpected cross-species contamination in genome sequencing projects  

PubMed Central

The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes.

Merchant, Samier; Wood, Derrick E.

2014-01-01

229

Minimum taxonomic criteria for bacterial genome sequence depositions and announcements.  

PubMed

Multiple bioinformatic methods are available to analyse the information encoded within the complete genome sequence of a bacterium and accurately assign its species status or nearest phylogenetic neighbour. However, it is clear that even now in what is the third decade of bacterial genomics, taxonomically incorrect genome sequence depositions are still being made. We outline a simple scheme of bioinformatic analysis and a set of minimum criteria that should be applied to all bacterial genomic data to ensure that they are accurately assigned to the species or genus level prior to database deposition. To illustrate the utility of the bioinformatic workflow, we analysed the recently deposited genome sequence of Lactobacillus acidophilus 30SC and demonstrated that this DNA was in fact derived from a strain of Lactobacillus amylovorus. Using these methods researchers can ensure that the taxonomic accuracy of genome sequence depositions is maintained within the ever increasing nucleic acid datasets. PMID:22366464

Bull, Matthew J; Marchesi, Julian R; Vandamme, Peter; Plummer, Sue; Mahenthiralingam, Eshwar

2012-04-01

230

The human genome sequence: impact on health care  

Microsoft Academic Search

The recent sequencing of the human genome, resulting from two independent global efforts, is poised to revolutionize all aspects of human health. This landmark achievement has also vindicated two different methodologies that can now be used to target other important large genomes. The human genome sequence has revealed several novel/surprising features, notably the probable presence of a mere 30-35,000 genes.

M. D. Bashyam; S. E. Hasnain

2003-01-01

231

Genome sequence of the human malaria parasite Plasmodium falciparum  

Microsoft Academic Search

The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

2002-01-01

232

Draft Sequences of the Radish (Raphanus sativus L.) Genome.  

PubMed

Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

2014-10-01

233

Draft Sequences of the Radish (Raphanus sativus L.) Genome  

PubMed Central

Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

2014-01-01

234

Implications of the Plastid Genome Sequence of Typha (Typhaceae, Poales) for Understanding Genome Evolution in Poaceae  

Microsoft Academic Search

Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the

Mary M. Guisinger; Timothy W. Chumley; Jennifer V. Kuehl; Jeffrey L. Boore; Robert K. Jansen

2010-01-01

235

Automated De Novo Identification of Repeat Sequence Families in Sequenced Genomes  

Microsoft Academic Search

Repetitive sequences make up a major part of eukaryotic genomes. We have developed an approach for the de novo identification and classification of repeat sequence families that is based on extensions to the usual approach of single linkage clustering of local pairwise alignments between genomic sequences. Our extensions use multiple alignment information to define the boundaries of individual copies of

Zhirong Bao; Sean R. Eddy

2002-01-01

236

Progress in Understanding and Sequencing the Genome of Brassica rapa  

PubMed Central

Brassica rapa, which is closely related to Arabidopsis thaliana, is an important crop and a model plant for studying genome evolution via polyploidization. We report the current understanding of the genome structure of B. rapa and efforts for the whole-genome sequencing of the species. The tribe Brassicaceae, which comprises ca. 240 species, descended from a common hexaploid ancestor with a basic genome similar to that of Arabidopsis. Chromosome rearrangements, including fusions and/or fissions, resulted in the present-day “diploid” Brassica species with variation in chromosome number and phenotype. Triplicated genomic segments of B. rapa are collinear to those of A. thaliana with InDels. The genome triplication has led to an approximately 1.7-fold increase in the B. rapa gene number compared to that of A. thaliana. Repetitive DNA of B. rapa has also been extensively amplified and has diverged from that of A. thaliana. For its whole-genome sequencing, the Brassica rapa Genome Sequencing Project (BrGSP) consortium has developed suitable genomic resources and constructed genetic and physical maps. Ten chromosomes of B. rapa are being allocated to BrGSP consortium participants, and each chromosome will be sequenced by a BAC-by-BAC approach. Genome sequencing of B. rapa will offer a new perspective for plant biology and evolution in the context of polyploidization. PMID:18288250

Hong, Chang Pyo; Kwon, Soo-Jin; Kim, Jung Sun; Yang, Tae-Jin; Park, Beom-Seok; Lim, Yong Pyo

2008-01-01

237

Scrutinizing Virus Genome Termini by High-Throughput Sequencing  

PubMed Central

Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS) have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell. PMID:24465717

Fan, Huahao; Jiang, Huanhuan; Chen, Yubao; Tong, Yigang

2014-01-01

238

Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi  

SciTech Connect

Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

Schutzer, S. E.; Dunn, J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

2011-02-01

239

Chapter 27 -- Breast Cancer Genomics, Section VI, Pathology and Biological Markers of Invasive Breast Cancer  

SciTech Connect

Breast cancer is predominantly a disease of the genome with cancers arising and progressing through accumulation of aberrations that alter the genome - by changing DNA sequence, copy number, and structure in ways that contribute to diverse aspects of cancer pathophysiology. Classic examples of genomic events that contribute to breast cancer pathophysiology include inherited mutations in BRCA1, BRCA2, TP53, and CHK2 that contribute to the initiation of breast cancer, amplification of ERBB2 (formerly HER2) and mutations of elements of the PI3-kinase pathway that activate aspects of epidermal growth factor receptor (EGFR) signaling and deletion of CDKN2A/B that contributes to cell cycle deregulation and genome instability. It is now apparent that accumulation of these aberrations is a time-dependent process that accelerates with age. Although American women living to an age of 85 have a 1 in 8 chance of developing breast cancer, the incidence of cancer in women younger than 30 years is uncommon. This is consistent with a multistep cancer progression model whereby mutation and selection drive the tumor's development, analogous to traditional Darwinian evolution. In the case of cancer, the driving events are changes in sequence, copy number, and structure of DNA and alterations in chromatin structure or other epigenetic marks. Our understanding of the genetic, genomic, and epigenomic events that influence the development and progression of breast cancer is increasing at a remarkable rate through application of powerful analysis tools that enable genome-wide analysis of DNA sequence and structure, copy number, allelic loss, and epigenomic modification. Application of these techniques to elucidation of the nature and timing of these events is enriching our understanding of mechanisms that increase breast cancer susceptibility, enable tumor initiation and progression to metastatic disease, and determine therapeutic response or resistance.
These studies also reveal the molecular differences between cancer and normal that may be exploited to therapeutic benefit or that provide targets for molecular assays that may enable early cancer detection, and predict individual disease progression or response to treatment. This chapter reviews current and future directions in genome analysis and summarizes studies that provide insights into breast cancer pathophysiology or that suggest strategies to improve breast cancer management.

Spellman, Paul T.; Heiser, Laura; Gray, Joe W.

2009-06-18

240

Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

SciTech Connect

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

2005-08-26

241

Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus  

PubMed Central

We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

2014-01-01

242

Understanding Cancer Series: Genome-Wide Profiling  

Cancer.gov

A Locally Focused Search. Single-gene tests focus on a specific, known location in a patient’s genome. Using this approach, scientists have looked for single genes linked to cancer. This research has revealed some important discoveries such as gene changes called mutations located within the BRCA1 or BRCA2 genes that may confer a significantly increased risk of breast and ovarian cancer.

243

Characterizing the cancer genome in lung adenocarcinoma  

Microsoft Academic Search

Somatic alterations in cellular DNA underlie almost all human cancers. The prospect of targeted therapies and the development of high-resolution, genome-wide approaches are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57

Barbara A. Weir; Michele S. Woo; Gad Getz; Sven Perner; Li Ding; Rameen Beroukhim; William M. Lin; Michael A. Province; Aldi Kraja; Laura A. Johnson; Kinjal Shah; Mitsuo Sato; Roman K. Thomas; Justine A. Barletta; Ingrid B. Borecki; Stephen Broderick; Andrew C. Chang; Derek Y. Chiang; Lucian R. Chirieac; Jeonghee Cho; Yoshitaka Fujii; Adi F. Gazdar; Thomas Giordano; Heidi Greulich; Megan Hanna; Bruce E. Johnson; Mark G. Kris; Alex Lash; Ling Lin; Neal Lindeman; Elaine R. Mardis; John D. McPherson; John D. Minna; Margaret B. Morgan; Mark Nadel; Mark B. Orringer; John R. Osborne; Brad Ozenberger; Alex H. Ramos; James Robinson; Jack A. Roth; Valerie Rusch; Hidefumi Sasaki; Frances Shepherd; Carrie Sougnez; Margaret R. Spitz; Ming-Sound Tsao; David Twomey; Roel G. W. Verhaak; George M. Weinstock; David A. Wheeler; Wendy Winckler; Akihiko Yoshizawa; Soyoung Yu; Maureen F. Zakowski; Qunyuan Zhang; David G. Beer; Ignacio I. Wistuba; Mark A. Watson; Levi A. Garraway; Marc Ladanyi; William D. Travis; William Pao; Mark A. Rubin; Stacey B. Gabriel; Richard A. Gibbs; Harold E. Varmus; Richard K. Wilson; Eric S. Lander; Matthew Meyerson

2007-01-01

244

Towards systematic functional characterization of cancer genomes  

Microsoft Academic Search

Whole-genome approaches to identify genetic and epigenetic alterations in cancer genomes have begun to provide new insights into the range of molecular events that occurs in human tumours. Although in some cases this knowledge immediately illuminates a path towards diagnostic or therapeutic implementation, the bewildering lists of mutations in each tumour make it clear that systematic functional approaches are also

Jesse S. Boehm; William C. Hahn

2011-01-01

245

Complete Genome Sequence of Mycoplasma haemofelis, a Hemotropic Mycoplasma

PubMed Central

Here, we present the genome sequence of Mycoplasma haemofelis strain Langford 1, representing the first hemotropic mycoplasma (hemoplasma) species to be completely sequenced and annotated. Originally isolated from a cat with hemolytic anemia, this strain induces severe hemolytic anemia when inoculated into specific-pathogen-free-derived cats. The genome sequence has provided insights into the biology of this uncultivatable hemoplasma and has identified potential molecular mechanisms underlying its pathogenicity. PMID:21317334

Barker, Emily N.; Helps, Chris R.; Peters, Iain R.; Darby, Alistair C.; Radford, Alan D.; Tasker, Severine

2011-01-01

246

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

Hans-Werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

2002-01-01

247

Complete Genome Sequence of Salmonella Bacteriophage SS3e  

PubMed Central

A Salmonella lytic bacteriophage, SS3e, was isolated, and its genome was sequenced completely. This phage is able to lyse not only various Salmonella serovars but also Escherichia coli, Shigella sonnei, Enterobacter cloacae, and Serratia marcescens, indicating a broad host specificity. Genomic sequence analysis of SS3e revealed a linear double-stranded DNA sequence of 40,793 bp harboring 58 open reading frames, which is highly similar to Salmonella phages SETP13 and MB78. PMID:22923809

Kim, Sung-Hun; Park, Jeong-Hyun; Lee, Bok-Kwon; Kwon, Hyuk-Joon; Shin, Ji-Hyun; Kim, Jungmin

2012-01-01

248

Sequencing approach evaluates all 24 genes implicated in breast cancer  

Cancer.gov

Since 1994, many thousands of women with breast cancer from families severely affected with the disease have been tested for inherited mutations in BRCA1 and BRCA2. The vast majority of those patients were told that their gene sequences were normal. With the development of modern genomics sequencing tools, the discovery of additional genes implicated in breast cancer and the change in the legal status of genetic testing for BRCA1 and BRCA2, it is now possible to determine how often families in these circumstances actually do carry cancer-predisposing mutations in BRCA1, BRCA2, or another gene implicated in breast cancer, despite the results of their previous genetic tests. The results were presented Oct. 24, by researchers from the University of Washington (which is affiliated with the Fred Hutchinson Cancer Research Center) at the American Society of Human Genetics 2013 meeting in Boston.

249

Transposable elements in human cancers by genome-wide EST alignment  

Microsoft Academic Search

Transposable elements may affect coding sequences, splicing patterns, and transcriptional regulation of human genes. Particles of the transposable elements have been detected in several tissues and tumors. Here, we report genome-wide analysis of gene expression regulated by transposable elements in human cancers. We adopted an analysis pipeline for screening methods to detect cancer-specific expression from expressed human sequences.

Dae-Soo Kim; Jae-Won Huh; Heui-Soo Kim

2007-01-01

250

Complete genome sequence of Anaerococcus prevotii type strain (PC1).  

PubMed

Anaerococcus prevotii (Foubert and Douglas 1948) Ezaki et al. 2001 is the type species of the genus, and is of phylogenetic interest because of its arguable assignment to the provisionally arranged family 'Peptostreptococcaceae'. A. prevotii is an obligate anaerobic coccus, usually arranged in clumps or tetrads. The strain, whose genome is described here, was originally isolated from human plasma; other strains of the species were also isolated from clinical specimen. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the genus. Next to Finegoldia magna, A. prevotii is only the second species from the family 'Peptostreptococcaceae' for which a complete genome sequence is described. The 1,998,633 bp long genome (chromosome and one plasmid) with its 1852 protein-coding and 61 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304652

Labutti, Kurt; Pukall, Rudiger; Steenblock, Katja; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Saunders, Elizabeth; Brettin, Thomas; Detter, John C; Han, Cliff; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

2009-01-01

251

Reference genome sequence of the model plant Setaria  

SciTech Connect

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

Bennetzen, Jeffrey L [ORNL; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology; Wang, Hao [University of Georgia, Athens, GA; Percifield, Ryan [University of Georgia, Athens, GA; Hawkins, Jennifer [University of Georgia, Athens, GA; Pontaroli, Ana C. [University of Georgia, Athens, GA; Estep, Matt [University of Georgia, Athens, GA; Feng, Liang [University of Georgia, Athens, GA; Vaughn, Justin N [ORNL; Grimwood, Jane [Hudson Alpha Institute of Biotechnology; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Wang, Xuewen [University of Georgia, Athens, GA; Wu, Xiaomei [University of Georgia, Athens, GA; Mitros, Therese [University of California, Berkeley; Triplett, Jimmy [University of Missouri, St. Louis; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Mauro-Herrera, Margarita [Oklahoma State University; Wang, Lin [Cornell University; Li, Pinghua [Cornell University; Sharma, Manoj [University of California, Davis; Sharma, Rita [University of California, Davis; Ronald, Pamela [University of California, Davis; Panaud, Olivier [Universite de Perpignan, Perpignan, France; Kellogg, Elizabeth A. [University of Missouri, St. Louis; Brutnell, Thomas P. [Cornell University; Doust, Andrew N. [Oklahoma State University; Tuskan, Gerald A [ORNL; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Devos, Katrien M [ORNL

2012-01-01

252

Reference genome sequence of the model plant Setaria.  

PubMed

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum). PMID:22580951

Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

2012-06-01

253

The Cancer Genome Atlas Pan-Cancer Analysis Project  

PubMed Central

Cancer can take hundreds of different forms depending on the location, cell of origin and spectrum of genomic alterations that promote oncogenesis and affect therapeutic response. Although many genomic events with direct phenotypic impact have been identified, much of the complex molecular landscape remains incompletely charted for most cancer lineages. For that reason, The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumours to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences, and emergent themes across tumour lineages. The Pan-Cancer initiative compares the first twelve tumour types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumour types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile. PMID:24071849

Weinstein, John N.; Collisson, Eric A.; Mills, Gordon B.; Shaw, Kenna M.; Ozenberger, Brad A.; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M.

2014-01-01

254

AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics  

Cancer.gov

The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

255

Toolbox for Mobile-Element Insertion Detection on Cancer Genomes  

PubMed Central

Mobile elements constitute greater than 45% of the human genome as a result of repeated insertion events during human genome evolution. Although most mobile elements are fixed within the human population, some elements (including ALU, long interspersed elements (LINE) 1 (L1), and SVA) are still actively duplicating and may result in life-threatening human diseases such as cancer, motivating the need for accurate mobile-element insertion (MEI) detection tools. We developed a software package, TANGRAM, for MEI detection in next-generation sequencing data, currently serving as the primary MEI detection tool in the 1000 Genomes Project. TANGRAM takes advantage of valuable mapping information provided by our own MOSAIK mapper, and until recently required MOSAIK mappings as its input. In this study, we report a new feature that enables TANGRAM to be used on alignments generated by any mainstream short-read mapper, making it accessible for many genomic users. To demonstrate its utility for cancer genome analysis, we have applied TANGRAM to the TCGA (The Cancer Genome Atlas) mutation calling benchmark 4 dataset. TANGRAM is fast, accurate, easy to use, and open source on https://github.com/jiantao/Tangram.

Lee, Wan-Ping; Wu, Jiantao; Marth, Gabor T

2014-01-01

256

A new approach to genome mapping and sequencing: slalom libraries  

PubMed Central

We describe here an efficient strategy for simultaneous genome mapping and sequencing. The approach is based on physically oriented, overlapping restriction fragment libraries called slalom libraries. Slalom libraries combine features of general genomic, jumping and linking libraries. Slalom libraries can be adapted to different applications and two main types of slalom libraries are described in detail. This approach was used to map and sequence (with ~46% coverage) two human P1-derived artificial chromosome (PAC) clones, each of ~100 kb. This model experiment demonstrates the feasibility of the approach and shows that the efficiency (cost-effectiveness and speed) of existing mapping/sequencing methods could be improved at least 5–10-fold. Furthermore, since the efficiency of contig assembly in the slalom approach is virtually independent of length of sequence reads, even short sequences produced by rapid, high throughput sequencing techniques would suffice to complete a physical map and a sequence scan of a small genome. PMID:11788732

Zabarovska, Veronika I.; Gizatullin, Rinat Z.; Al-Amin, Ali N.; Podowski, Raf; Protopopov, Alexei I.; Lofdahl, Sven; Wahlestedt, Claes; Winberg, Gosta; Kashuba, Vladimir I.; Ernberg, Ingemar; Zabarovsky, Eugene R.

2002-01-01

257

A new approach to genome mapping and sequencing: slalom libraries.  

PubMed

We describe here an efficient strategy for simultaneous genome mapping and sequencing. The approach is based on physically oriented, overlapping restriction fragment libraries called slalom libraries. Slalom libraries combine features of general genomic, jumping and linking libraries. Slalom libraries can be adapted to different applications and two main types of slalom libraries are described in detail. This approach was used to map and sequence (with approximately 46% coverage) two human P1-derived artificial chromosome (PAC) clones, each of approximately 100 kb. This model experiment demonstrates the feasibility of the approach and shows that the efficiency (cost-effectiveness and speed) of existing mapping/sequencing methods could be improved at least 5-10-fold. Furthermore, since the efficiency of contig assembly in the slalom approach is virtually independent of length of sequence reads, even short sequences produced by rapid, high throughput sequencing techniques would suffice to complete a physical map and a sequence scan of a small genome. PMID:11788732

Zabarovska, Veronika I; Gizatullin, Rinat Z; Al-Amin, Ali N; Podowski, Raf; Protopopov, Alexei I; Löfdahl, Sven; Wahlestedt, Claes; Winberg, Gösta; Kashuba, Vladimir I; Ernberg, Ingemar; Zabarovsky, Eugene R

2002-01-15

258

The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae.  

PubMed

The currently available yeast mitochondrial DNA (mtDNA) sequence is incomplete, contains many errors and is derived from several polymorphic strains. Here, we report that the mtDNA sequence of the strain used for nuclear genome sequencing assembles into a circular map of 85,779 bp which includes 10 kb of new sequence. We give a list of seven small hypothetical open reading frames (ORFs). Hot spots of point mutations are found in exons near the insertion sites of optional mobile group I intron-related sequences. Our data suggest that shuffling of mobile elements plays an important role in the remodelling of the yeast mitochondrial genome. PMID:9872396

Foury, F; Roganti, T; Lecrenier, N; Purnelle, B

1998-12-01

259

Complete genome sequence of Thermomonospora curvata type strain (B9)  

SciTech Connect

Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Chertkov, Olga [Los Alamos National Laboratory (LANL); Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Nolan, Matt [Joint Genome Institute, Walnut Creek, California; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California; Lucas, Susan [Joint Genome Institute, Walnut Creek, California; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California; Tice, Hope [Joint Genome Institute, Walnut Creek, California; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [Joint Genome Institute, Walnut Creek, California; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [Joint Genome Institute, Walnut Creek, California; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brettin, Thomas S [ORNL; Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [Joint Genome Institute, Walnut Creek, California; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California; Bristow, James [Joint Genome Institute, Walnut Creek, California; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California

2011-01-01

260

Mechanisms of Base Substitution Mutagenesis in Cancer Genomes  

PubMed Central

Cancer genome sequence data provide an invaluable resource for inferring the key mechanisms by which mutations arise in cancer cells, favoring their survival, proliferation and invasiveness. Here we examine recent advances in understanding the molecular mechanisms responsible for the predominant type of genetic alteration found in cancer cells, somatic single base substitutions (SBSs). Cytosine methylation, demethylation and deamination, charge transfer reactions in DNA, DNA replication timing, chromatin status and altered DNA proofreading activities are all now known to contribute to the mechanisms leading to base substitution mutagenesis. We review current hypotheses as to the major processes that give rise to SBSs and evaluate their relative relevance in the light of knowledge acquired from cancer genome sequencing projects and the study of base modifications, DNA repair and lesion bypass. Although gene expression data on APOBEC3B enzymes provide support for a role in cancer mutagenesis through U:G mismatch intermediates, the enzyme preference for single-stranded DNA may limit its activity genome-wide. For SBSs at both CG:CG and YC:GR sites, we outline evidence for a prominent role of damage by charge transfer reactions that follow interactions of the DNA with reactive oxygen species (ROS) and other endogenous or exogenous electron-abstracting molecules. PMID:24705290

Bacolla, Albino; Cooper, David N.; Vasquez, Karen M.

2014-01-01

261

Multiplex Sequencing of Seven Ocular Herpes Simplex Virus Type-1 Genomes: Phylogeny, Sequence Variability,  

E-print Network

Multiplex Sequencing of Seven Ocular Herpes Simplex Virus Type-1 Genomes: Phylogeny, Sequence-7812 Herpes simplex virus (HSV)-1 is a significant human pathogen causing diseases such as mucocutaneous

Craven, Mark

262

Research Resources for Cancer Epidemiology and Genomics  

Cancer.gov

The Epidemiology and Genomics Research Program (EGRP) has developed a list with links to a number of cancer-related research resources available through EGRP-supported cohorts, consortia, and initiatives; other research programs in the Division of Cancer Control and Population Sciences and NCI; and partners elsewhere at NIH and other research organizations.

263

Cataloging Coding Sequence Variations in Human Genome Databases  

Microsoft Academic Search

Background: With the recent growth of information on sequence variations in the human genome, predictions regarding the functional effects and relevance to disease phenotypes of coding sequence variations are becoming increasingly important. The aims of this study were to catalog protein-coding sequence variations (CVs) occurring in genetic variation databases and to use bioinformatic programs to analyze CVs. In addition, we aim

Hong-Hee Won; Hee-Jin Kim; Kyung-A. Lee; Jong-Won Kim; Cecile Fairhead

2008-01-01

264

Sequencing techniques uncover mutations in genes that can increase cancer risk  

Cancer.gov

Now that the findings from the Human Genome Project are widely available, scientists are working to put that data to work to understand the genetic causes of many diseases, including cancer, by using the latest sequencing techniques.

265

Genome sequencing: a systematic review of health economic evidence  

PubMed Central

Recently the sequencing of the human genome has become a major biological and clinical research field. However, the public health impact of this new technology with focus on the financial effect is not yet to be foreseen. To provide an overview of the current health economic evidence for genome sequencing, we conducted a thorough systematic review of the literature from 17 databases. In addition, we conducted a hand search. Starting with 5 520 records we ultimately included five full-text publications and one internet source, all focused on cost calculations. The results were very heterogeneous and, therefore, difficult to compare. Furthermore, because the methodology of the publications was quite poor, the reliability and validity of the results were questionable. The real costs for the whole sequencing workflow, including data management and analysis, remain unknown. Overall, our review indicates that the current health economic evidence for genome sequencing is quite poor. Therefore, we listed aspects that needed to be considered when conducting health economic analyses of genome sequencing. Thereby, specifics regarding the overall aim, technology, population, indication, comparator, alternatives after sequencing, outcomes, probabilities, and costs with respect to genome sequencing are discussed. For further research, at the outset, a comprehensive cost calculation of genome sequencing is needed, because all further health economic studies rely on valid cost data. The results will serve as an input parameter for budget-impact analyses or cost-effectiveness analyses. PMID:24330507

2013-01-01

266

Management of Incidental Findings in Clinical Genomic Sequencing  

PubMed Central

Genomic sequencing is becoming accurate, fast, and inexpensive, and is rapidly being incorporated into clinical practice. Incidental findings, which result in large numbers from genomic sequencing, are a potential barrier to the utility of this new technology due to their high prevalence and the lack of evidence or guidelines available to guide their clinical interpretation. This unit reviews the definition, classification, and management of incidental findings from genomic sequencing. The unit focuses on the clinical aspects of handling incidental findings, with an emphasis on the key role of clinical context in defining incidental findings and determining their clinical relevance and utility. PMID:23595601

Krier, Joel B.; Green, Robert C.

2013-01-01

267

Complete genome sequence of Ferroglobus placidus AEDII12DO  

PubMed Central

Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project. PMID:22180810

Anderson, Iain; Risso, Carla; Holmes, Dawn; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Samuel; Saunders, Elizabeth; Brettin, Thomas; Detter, John C.; Han, Cliff; Tapia, Roxanne; Larimer, Frank; Land, Miriam; Hauser, Loren; Woyke, Tanja; Lovley, Derek; Kyrpides, Nikos; Ivanova, Natalia

2011-01-01

268

Complete genome sequence of Staphylothermus hellenicus P8T  

SciTech Connect

Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Davenport, Karen W. [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

2011-01-01

269

Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.  

PubMed

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

2013-01-01

270

Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens  

PubMed Central

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, Jozsef; Richardson, James E.; Bakker, Freek T.

2013-01-01

271

Genome-Wide Association Studies of Cancer  

PubMed Central

Knowledge of the inherited risk for cancer is an important component of preventive oncology. In addition to well-established syndromes of cancer predisposition, much remains to be discovered about the genetic variation underlying susceptibility to common malignancies. Increased knowledge about the human genome and advances in genotyping technology have made possible genome-wide association studies (GWAS) of human diseases. These studies have identified many important regions of genetic variation associated with an increased risk for human traits and diseases including cancer. Understanding the principles, major findings, and limitations of GWAS is becoming increasingly important for oncologists as dissemination of genomic risk tests directly to consumers is already occurring through commercial companies. GWAS have contributed to our understanding of the genetic basis of cancer and will shed light on biologic pathways and possible new strategies for targeted prevention. To date, however, the clinical utility of GWAS-derived risk markers remains limited. PMID:20585100

Stadler, Zsofia K.; Thom, Peter; Robson, Mark E.; Weitzel, Jeffrey N.; Kauff, Noah D.; Hurley, Karen E.; Devlin, Vincent; Gold, Bert; Klein, Robert J.; Offit, Kenneth

2010-01-01

272

Genome Science and Personalized Cancer Treatment  

ScienceCinema

August 4, 2009 Berkeley Lab lecture: Results from the Human Genome Project are enabling scientists to understand how individual cancers form and progress. This information, when combined with newly developed drugs, can optimize the treatment of individual cancers. Joe Gray, director of Berkeley Lab's Life Sciences Division and Associate Laboratory Director for Life and Environmental Sciences, will focus on this approach, its promise, and its current roadblocks, particularly with regard to breast cancer.

Joe Gray

2010-01-08

273

Recurrent insertion and duplication generate networks of transposable element sequences in the Drosophila melanogaster genome  

Microsoft Academic Search

BACKGROUND: The recent availability of genome sequences has provided unparalleled insights into the broad-scale patterns of transposable element (TE) sequences in eukaryotic genomes. Nevertheless, the difficulties that TEs pose for genome assembly and annotation have prevented detailed, quantitative inferences about the contribution of TEs to genome sequences. RESULTS: Using a high-resolution annotation of TEs in Release 4 genome sequence, we

Casey M Bergman; Hadi Quesneville; Dominique Anxolabéhère; Michael Ashburner

2006-01-01

274

Genome Announcement: Draft genome sequence of the electricity producing  

E-print Network

-positive dissimilatory metal reducing bacteria (DMRB) for which there is a draft genome sequence. Consistent with knowledge of extracellular respiration by Gram-positive bacteria. By comparing these mechanisms to Gram phylogenetic neighbors with sequenced genomes (5, 7, 8). C-type cytochromes are essential for the reduction

Hazen, Terry

275

A Genome-Wide Analysis of FRT-Like Sequences in the Human Genome  

PubMed Central

Efficient and precise genome manipulations can be achieved by the Flp/FRT system of site-specific DNA recombination. Applications of this system are limited, however, to cases when target sites for Flp recombinase, FRT sites, are pre-introduced into a genome locale of interest. To expand use of the Flp/FRT system in genome engineering, variants of Flp recombinase can be evolved to recognize pre-existing genomic sequences that resemble FRT and thus can serve as recombination sites. To understand the distribution and sequence properties of genomic FRT-like sites, we performed a genome-wide analysis of FRT-like sites in the human genome using the experimentally-derived parameters. Out of 642,151 identified FRT-like sequences, 581,157 sequences were unique and 12,452 sequences had at least one exact duplicate. Duplicated FRT-like sequences are located mostly within LINE1, but also within LTRs of endogenous retroviruses, Alu repeats and other repetitive DNA sequences. The unique FRT-like sequences were classified based on the number of matches to FRT within the first four proximal base pairs of the Flp binding elements of FRT and the nature of mismatched base pairs in the same region. The data obtained will be useful for the emerging field of genome engineering. PMID:21448289

Shultz, Jeffry L.; Voziyanova, Eugenia; Konieczka, Jay H.; Voziyanov, Yuri

2011-01-01

276

De Novo Whole-Genome Sequence and Genome Annotation of Lichtheimia ramosa  

PubMed Central

We report the annotated draft genome sequence of Lichtheimia ramosa (JMRC FSU:6197). It has been reported to be a causative organism of mucormycosis, a rare but rapidly progressive infection in immunocompromised humans. The functionally annotated genomic sequence consists of 74 scaffolds with a total number of 11,510 genes. PMID:25212617

Linde, Jorg; Schwartze, Volker; Binder, Ulrike; Lass-Florl, Cornelia

2014-01-01

277

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change  

SciTech Connect

In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

2011-04-29

278

Assembly of large genomes using second-generation sequencing  

PubMed Central

Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we now are seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-gen sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly. PMID:20508146

Schatz, Michael C.; Delcher, Arthur L.; Salzberg, Steven L.

2010-01-01

279

Sequence-tagged connectors: A sequence approach to mapping and scanning the human genome  

PubMed Central

The sequence-tagged connector (STC) strategy proposes to generate sequence tags densely scattered (every 3.3 kilobases) across the human genome by arraying 450,000 bacterial artificial chromosomes (BACs) with randomly cleaved inserts, sequencing both ends of each, and preparing a restriction enzyme fingerprint of each. The STC resource, containing end sequences, fingerprints, and arrayed BACs, creates a map where the interrelationships of the individual BAC clones are resolved through their STCs as overlapping BAC clones are sequenced. Once a seed or initiation BAC clone is sequenced, the minimum overlapping 5′ and 3′ BAC clones can be identified computationally and sequenced. By reiterating this “sequence-then-map by computer analysis against the STC database” strategy, a minimum tiling path of clones can be sequenced at a rate that is primarily limited by the sequencing throughput of individual genome centers. As of February 1999, we had deposited, together with The Institute for Genomic Research (TIGR), into GenBank 314,000 STCs (~135 megabases), or 4.5% of human genomic DNA. This genome survey reveals numerous genes, genome-wide repeats, simple sequence repeats (potential genetic markers), and CpG islands (potential gene initiation sites). It also illustrates the power of the STC strategy for creating minimum tiling paths of BAC clones for large-scale genomic sequencing. Because the STC resource permits the easy integration of genetic, physical, gene, and sequence maps for chromosomes, it will be a powerful tool for the initial analysis of the human genome and other complex genomes. PMID:10449764

Mahairas, Gregory G.; Wallace, James C.; Smith, Kim; Swartzell, Steven; Holzman, Ted; Keller, Andrew; Shaker, Ron; Furlong, Jeff; Young, Janet; Zhao, Shaying; Adams, Mark D.; Hood, Leroy

1999-01-01

280

The Cancer Genome Atlas Pan-Cancer analysis project  

E-print Network

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a ...

Lander, Eric S.

281

Complete genome sequence of Haloterrigena turkmenica type strain (4k).  

PubMed

Haloterrigena turkmenica (Zvyagintseva and Tarasov 1987) Ventosa et al. 1999, comb. nov. is the type species of the genus Haloterrigena in the euryarchaeal family Halobacteriaceae. It is of phylogenetic interest because of the yet unclear position of the genera Haloterrigena and Natrinema within the Halobacteriaceae, which created some taxonomic problems historically. H. turkmenica was isolated from sulfate saline soil in Turkmenistan and is a relatively fast growing, chemoorganotrophic, carotenoid-containing, extreme halophile, requiring at least 2 M NaCl for growth. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the genus Haloterrigena, but the eighth genome sequence from a member of the family Halobacteriaceae. The 5,440,782 bp genome (including six plasmids) with its 5,287 protein-coding and 63 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304683

Saunders, Elisabeth; Tindall, Brian J; Fähnrich, Regine; Lapidus, Alla; Copeland, Alex; Del Rio, Tijana Glavina; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Detter, John C; Bruce, David; Goodwin, Lynne; Chain, Patrick; Pitluck, Sam; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

2010-01-01

282

Genome sequencing and analysis of the model grass Brachypodium distachyon  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

Yang, Xiaohan [ORNL; Kalluri, Udaya C [ORNL; Tuskan, Gerald A [ORNL

2010-01-01

283

Draft Genome Sequences of Five Multilocus Sequence Types of Nonencapsulated Streptococcus pneumoniae  

PubMed Central

Nonencapsulated Streptococcus pneumoniae can colonize the human nasopharynx and cause conjunctivitis and otitis media. Different deletions in the capsular polysaccharide biosynthesis locus and different multilocus sequence types have been described for nonencapsulated strains. Draft genome sequences were generated to provide insight into the genomic diversity of these strains. PMID:23887920

Keller, Lance E.; Thomas, Jonathan C.; Luo, Xiao; Nahm, Moon H.; McDaniel, Larry S.

2013-01-01

284

Sequencing of Chloroplast Genome Using Whole Cellular DNA and Solexa Sequencing Technology  

PubMed Central

Sequencing of the chloroplast (cp) genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the cp genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246, 362, and 361 Mb sequence data were generated for the three accessions Chiifu-401-42, Z16, and FT, respectively. Micro-reads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8 or 95.5–99.7% of the B. rapa cp genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the cp genome. PMID:23162558

Wu, Jian; Liu, Bo; Cheng, Feng; Ramchiary, Nirala; Choi, Su Ryun; Lim, Yong Pyo; Wang, Xiao-Wu

2012-01-01

285

Complete genome sequence of Allochromatium vinosum DSM 180T  

PubMed Central

Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium currently contains five species. All members were isolated from freshwater, brackish water or marine habitats and are predominantly obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program. PMID:22675582

Weissgerber, Thomas; Zigann, Renate; Bruce, David; Chang, Yun-juan; Detter, John C.; Han, Cliff; Hauser, Loren; Jeffries, Cynthia D.; Land, Miriam; Munk, A. Christine; Tapia, Roxanne; Dahl, Christiane

2011-01-01

286

Genome Sequence of Bacillus thuringiensis subsp. kurstaki Strain HD-1  

PubMed Central

We report here the complete genome sequence of Bacillus thuringiensis subsp. kurstaki strain HD-1, which serves as the primary U.S. reference standard for all commercial insecticidal formulations of B. thuringiensis manufactured around the world. PMID:25035322

Day, Michael; Ibrahim, Mohamed; Dyer, David

2014-01-01

287

Draft Genome Sequence of Lactobacillus animalis 381-IL-28  

PubMed Central

Lactobacillus animalis 381-IL-28 is an integral component of a multistrain commercial culture with food biopreservative and pathogen biocontrol functionality. A draft sequence of the L. animalis 381-IL-28 genome is described in this paper. PMID:24874675

Rajendran, Mahitha; Altermann, Eric

2014-01-01

288

Commentary on patents: Full bacterial DNA sequences boost genomics  

SciTech Connect

Together with recent U.S. federal court decisions on DNA patenting, the sequencing achievement indicates that efforts on the broader genomics front may be moving more rapidly than had been previously thought.

Fox, J.L.

1995-07-01

289

Complete genome sequences of six strains of the genus Methylobacterium  

SciTech Connect

The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

Marx, Christopher J [Harvard University]; Bringel, Francoise O. [University of Strasbourg]; Christoserdova, Ludmila [University of Washington, Seattle]; Moulin, Lionel [UMR, France]; Farhan Ul Haque, Muhammad [CNRS, Strasbourg, France]; Fleischman, Darrell E. [Wright State University, Dayton, OH]; Gruffaz, Christelle [CNRS, Strasbourg, France]; Jourand, Philippe [UMR, France]; Knief, Claudia [ETH Zurich, Switzerland]; Lee, Ming-Chun [Harvard University]; Muller, Emilie E. L. [CNRS, Strasbourg, France]; Nadalig, Thierry [CNRS, Strasbourg, France]; Peyraud, Remi [ETH Zurich, Switzerland]; Roselli, Sandro [CNRS, Strasbourg, France]; Russ, Lina [ETH Zurich, Switzerland]; Aguero, Fernan [Universidad Nacional de General San Martin]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche]; Land, Miriam L [ORNL]; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Stolyar, Sergey [University of Washington]; Vorholt, Julia A. [ETH Zurich, Switzerland]; Vuilleumier, Stephane [University of Strasbourg]

2012-01-01

290

Complete Genome Sequence of Rahnella aquatilis CIP 78.65  

SciTech Connect

Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

Martinez, Robert J [University of Alabama, Tuscaloosa]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Han, James [U.S. Department of Energy, Joint Genome Institute]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Held, Brittany [Los Alamos National Laboratory (LANL)]; Land, Miriam L [ORNL]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Sobeckya, Patricia A. [University of Alabama, Tuscaloosa]

2012-01-01

291

Complete Genome Sequences of Six Strains of the Genus Methylobacterium  

SciTech Connect

The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints.

Marx, Christopher J [Harvard University]; Bringel, Francoise O. [University of Strasbourg]; Christoserdova, Ludmila [University of Washington, Seattle]; Moulin, Lionel [UMR, France]; UI Hague, Muhammad Farhan [University of Strasbourg]; Fleischman, Darrell E. [Wright State University, Dayton, OH]; Gruffaz, Christelle [CNRS, Strasbourg, France]; Jourand, Philippe [UMR, France]; Knief, Claudia [ETH Zurich, Switzerland]; Lee, Ming-Chun [Harvard University]; Muller, Emilie E. L. [CNRS, Strasbourg, France]; Nadalig, Thierry [CNRS, Strasbourg, France]; Peyraud, Remi [ETH Zurich, Switzerland]; Roselli, Sandro [CNRS, Strasbourg, France]; Russ, Lina [ETH Zurich, Switzerland]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Ivanov, Pavel S. [University of Wyoming, Laramie]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Lajus, Aurelie [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche]; Land, Miriam L [ORNL]; Medigue, Claudine [Genoscope/Centre National de la Recherche Scientifique-Unite Mixte de Recherche]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Stolyar, Sergey [University of Washington]; Vorholt, Julia A. [ETH Zurich, Switzerland]; Vuilleumier, Stephane [University of Strasbourg]

2012-01-01

292

Initial genome sequencing and analysis of multiple myeloma  

E-print Network

Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

Lander, Eric S.

293

Fulfilling the Promise of a Sequenced Human Genome – Part II  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

Green, Eric [National Human Genome Research Institute]

2009-05-27

294

Fulfilling the Promise of a Sequenced Human Genome – Part I  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

Green, Eric [National Human Genome Research Institute]

2009-05-27

295

Bacterial epidemiology and biology - lessons from genome sequencing  

PubMed Central

Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution. PMID:22027015

2011-01-01

296

Genome Sequence of the Fish Pathogen Flavobacterium columnare ATCC 49512  

PubMed Central

Flavobacterium columnare is a Gram-negative, rod-shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. PMID:22535941

Tekedar, Hasan C.; Karsi, Attila; Gillaspy, Allison F.; Dyer, David W.; Benton, Nicole R.; Zaitshik, Jeremy; Vamenta, Stefanie; Banes, Michelle M.; Gulsoy, Nagihan; Aboko-Cole, Mary; Waldbieser, Geoffrey C.

2012-01-01

297

Sequence Imputation of HPV16 Genomes for Genetic Association Studies  

E-print Network

... Laura Reimers, Koenraad van Doorslaer, Mark Schiffman, Rob DeSalle, Rolando Herrero, Kai Yu ... Reimers L, van Doorslaer K, Schiffman M, et al. (2011) Sequence Imputation of HPV16 Genomes for Genetic ...

DeSalle, Rob

298

Operational streamlining in a high-throughput genome sequencing center  

E-print Network

Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

Person, Kerry P. (Kerry Patrick)

2006-01-01

299

Melanoma genome sequencing reveals frequent PREX2 mutations  

E-print Network

Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the ...

Lander, Eric S.

300

Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050  

PubMed Central

Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

2014-01-01

301

Genomics through the lens of next-generation sequencing  

PubMed Central

A report on the 23rd annual meeting on 'The Biology of Genomes', 11-15 May 2010, Cold Spring Harbor, USA. Meeting report Recent advances in high-throughput sequencing technologies have greatly increased the scale and scope of genomics research, and this was evident throughout the recent Biology of Genomes meeting at the Cold Spring Harbor Laboratory. Here we describe some highlights of the meeting. PMID:20587080

2010-01-01

302

Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species  

PubMed Central

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

2014-01-01

303

Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.  

PubMed

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

2014-01-01

304

Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.  

PubMed

The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus. PMID:25252813

Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

2014-12-01

305

Complete genome sequence of Treponema pallidum strain DAL-1  

PubMed Central

Treponema pallidum strain DAL-1 is a human uncultivable pathogen causing the sexually transmitted disease syphilis. Strain DAL-1 was isolated from the amniotic fluid of a pregnant woman in the secondary stage of syphilis. Here we describe the 1,139,971 bp long genome of T. pallidum strain DAL-1 which was sequenced using two independent sequencing methods (454 pyrosequencing and Illumina). In rabbits, strain DAL-1 replicated better than the T. pallidum strain Nichols. The comparison of the complete DAL-1 genome sequence with the Nichols sequence revealed a list of genetic differences that are potentially responsible for the increased rabbit virulence of the DAL-1 strain. PMID:23449808

Zobanikova, Marie; Mikolka, Pavol; Cejkova, Darina; Pospisilova, Petra; Chen, Lei; Strouhal, Michal; Qin, Xiang; Weinstock, George M.; Smajs, David

2012-01-01

306

Intra-species sequence comparisons for annotating genomes  

SciTech Connect

Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom. It also raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

2004-07-15

307

The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae  

Microsoft Academic Search

The currently available yeast mitochondrial DNA (mtDNA) sequence is incomplete, contains many errors and is derived from several polymorphic strains. Here, we report that the mtDNA sequence of the strain used for nuclear genome sequencing assembles into a circular map of 85,779 bp which includes 10 kb of new sequence. We give a list of seven small hypothetical open reading

Françoise Foury; Tiziana Roganti; Nicolas Lecrenier; Bénédicte Purnelle

1998-01-01

308

Genome Sequence of Fusarium graminearum Isolate CS3005.  

PubMed

Fusarium graminearum is one of the most important fungal pathogens of wheat, barley, and maize worldwide. This announcement reports the genome sequence of a highly virulent Australian isolate of this species to supplement the existing genome of the North American F. graminearum isolate Ph1. PMID:24744326

Gardiner, Donald M; Stiller, Jiri; Kazan, Kemal

2014-01-01

309

Sequence Analysis of the Genome of the Neodiprion sertifer Nucleopolyhedrovirus  

Microsoft Academic Search

The genome of the Neodiprion sertifer nucleopolyhedrovirus (NeseNPV), which infects the European pine sawfly, N. sertifer (Hymenoptera: Diprionidae), was sequenced and analyzed. The genome was 86,462 bp in size. The CG content of 34% was lower than that of the majority of baculoviruses. A total of 90 methionine-initiated open reading frames (ORFs) with more than 50 amino acids and

Alejandra Garcia-Maruniak; James E. Maruniak; Paolo M. A. Zanotto; Aissa E. Doumbouya; Jaw-Ching Liu; Thomas M. Merritt; Jennifer S. Lanoie

2004-01-01

310

Combined Evidence Annotation of Transposable Elements in Genome Sequences  

E-print Network

sequences (e.g., 44.4% of the human genome; [1]), and there is no doubt that modern genomic DNA has evolved ... Dominique Anxolabehere, Laboratoire Dynamique du Génome et Evolution, Institut Jacques Monod, Paris ... .6%) are inserted into at least one other TE, forming a nest of elements. The pipeline allows rapid and thorough

Paris-Sud XI, Université de

311

Complete Genome Sequence of Marinobacter sp. BSs20148.  

PubMed

Marinobacter sp. BSs20148 was isolated from marine sediment collected from the Arctic Ocean at a water depth of 3,800 m. Here we report the complete genome sequence of Marinobacter sp. BSs20148. This genomic information will facilitate the study of the physiological metabolism, ecological roles, and evolution of the Marinobacter species. PMID:23682144

Song, Lai; Ren, Lufeng; Li, Xingang; Yu, Dan; Yu, Yong; Wang, Xumin; Liu, Guiming

2013-01-01

312

Letter to the Editor Toward Sequencing Cotton (Gossypium) Genomes  

E-print Network

$900 million. Cotton fiber is an outstanding model for the study of plant cell elongation and cell wall ... Letter to the Editor Toward Sequencing Cotton (Gossypium) Genomes Despite rapidly decreasing costs ... complex genomes de novo. The cotton (Gossypium spp.) genomes represent a challenging case. To this end

Chee, Peng W.

313

Complete Genome Sequence of Cronobacter sakazakii Strain CMCC 45402  

PubMed Central

Cronobacter sakazakii is considered to be an important pathogen involved in life-threatening neonatal infections. Here, we report the annotated complete genome sequence of C. sakazakii strain CMCC 45402, obtained from a milk sample in China. The major findings from the genomic analysis provide a better understanding of the isolates from China. PMID:24435860

Zhao, Zhijing; Wang, Lei; Wang, Bin; Liang, Haoyu; Ye, Qiang

2014-01-01

314

Complete Genome Sequence of Cronobacter sakazakii Strain CMCC 45402.  

PubMed

Cronobacter sakazakii is considered to be an important pathogen involved in life-threatening neonatal infections. Here, we report the annotated complete genome sequence of C. sakazakii strain CMCC 45402, obtained from a milk sample in China. The major findings from the genomic analysis provide a better understanding of the isolates from China. PMID:24435860

Zhao, Zhijing; Wang, Lei; Wang, Bin; Liang, Haoyu; Ye, Qiang; Zeng, Ming

2014-01-01

315

The genome sequence and structure of rice chromosome 1  

Microsoft Academic Search

The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops. Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained

Takuji Sasaki; Takashi Matsumoto; Kimiko Yamamoto; Katsumi Sakata; Tomoya Baba; Yuichi Katayose; Jianzhong Wu; Yoshihito Niimura; Zhukuan Cheng; Yoshiaki Nagamura; Baltazar A. Antonio; Hiroyuki Kanamori; Satomi Hosokawa; Masatoshi Masukawa; Koji Arikawa; Yoshino Chiden; Mika Hayashi; Masako Okamoto; Tsuyu Ando; Hiroyoshi Aoki; Kohei Arita; Masao Hamada; Chizuko Harada; Saori Hijishita; Mikiko Honda; Yoko Ichikawa; Atsuko Idonuma; Masumi Iijima; Michiko Ikeda; Maiko Ikeno; Sachie Ito; Tomoko Ito; Yuichi Ito; Yukiyo Ito; Aki Iwabuchi; Kozue Kamiya; Wataru Karasawa; Satoshi Katagiri; Ari Kikuta; Noriko Kobayashi; Izumi Kono; Kayo Machita; Tomoko Maehara; Hiroshi Mizuno; Tatsumi Mizubayashi; Yoshiyuki Mukai; Hideki Nagasaki; Marina Nakashima; Yuko Nakama; Yumi Nakamichi; Mari Nakamura; Nobukazu Namiki; Manami Negishi; Isamu Ohta; Nozomi Ono; Shoko Saji; Kumiko Sakai; Michie Shibata; Takanori Shimokawa; Ayahiko Shomura; Jianyu Song; Yuka Takazaki; Kimihiro Terasawa; Kumiko Tsuji; Kazunori Waki; Harumi Yamagata; Hiroko Yamane; Shoji Yoshiki; Rie Yoshihara; Kazuko Yukawa; Huisun Zhong; Hisakazu Iwama; Toshinori Endo; Hidetaka Ito; Jang Ho Hahn; Ho-Il Kim; Moo-Young Eun; Masahiro Yano; Jiming Jiang; Takashi Gojobori

2002-01-01

316

Draft Genome Sequence of Penicillium marneffei Strain PM1  

PubMed Central

Penicillium marneffei is the most important thermal dimorphic, pathogenic fungus endemic in China and Southeast Asia and is particularly important in HIV-positive patients. We report the 28,887,485-bp draft genome sequence of P. marneffei, which contains its complete mitochondrial genome, sexual cycle genes, a high diversity of Mp1p homologues, and polyketide synthase genes. PMID:22131218

Woo, Patrick C. Y.; Lau, Susanna K. P.; Liu, Bin; Cai, James J.; Chong, Ken T. K.; Tse, Herman; Kao, Richard Y. T.; Chan, Che-Man; Chow, Wang-Ngai; Yuen, Kwok-Yung

2011-01-01

317

Draft Genome Sequence of Necropsobacter rosorum Strain P709T  

PubMed Central

Necropsobacter is a recently described genus that contains a single species, N. rosorum, and belongs to the family Pasteurellaceae. Here, we present the draft genome of N. rosorum strain P709T, which is the first genome sequence from this species. PMID:25301642

Padmanabhan, Roshan; Robert, Catherine; Fenollar, Florence; Raoult, Didier

2014-01-01

318

Draft Genome Sequence of Mycobacterium cosmeticum DSM 44829  

PubMed Central

We announce the draft genome sequence of Mycobacterium cosmeticum strain DSM 44829, a nontuberculous species responsible for opportunistic infection. The genome described here is composed of 6,462,090 bp, with a G+C content of 68.24%. It contains 6,281 protein-coding genes and 75 predicted RNA genes. PMID:24723727

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

319

Draft Genome Sequence of Mycobacterium austroafricanum DSM 44191  

PubMed Central

We announce the draft genome sequence of Mycobacterium austroafricanum DSM 44191T (= E9789-SA12441T), a non-tuberculosis species responsible for opportunistic infection. The genome described here has a size of 6,772,357 bp with a G+C content of 66.79% and contains 6,419 protein-coding genes and 112 RNA genes. PMID:24744336

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

320

Draft Genome Sequence of Mycobacterium triplex DSM 44626  

PubMed Central

We announce the draft genome sequence of Mycobacterium triplex strain DSM 44626, a nontuberculosis species responsible for opportunistic infections. The genome described here is composed of 6,382,840 bp, with a G+C content of 66.57%, and contains 5,988 protein-coding genes and 81 RNA genes. PMID:24874681

Sassi, Mohamed; Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

321

Draft Genome Sequence of Mycobacterium vulneris DSM 45247T  

PubMed Central

We report the draft genome sequence of Mycobacterium vulneris DSM 45247T strain, an emerging, opportunistic pathogen of the Mycobacterium avium complex. The genome described here is composed of 6,981,439 bp (with a G+C content of 67.14%) and has 6,653 protein-coding genes and 84 predicted RNA genes. PMID:24812218

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

322

Draft Genome Sequence of Mycobacterium mageritense DSM 44476T  

PubMed Central

We report the draft genome sequence of Mycobacterium mageritense strain DSM 44476T (CIP 104973), a nontuberculosis species responsible for various infections. The genome described here is composed of 7,966,608 bp, with a G+C content of 66.95%, and contains 7,675 protein-coding genes and 120 predicted RNA genes. PMID:24786954

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

323

Whole-genome sequences of three symbiotic Endozoicomonas strains.  

PubMed

Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and they will aid in the assembly of genomes from uncultivated Endozoicomonas spp. PMID:25125646

Neave, Matthew J; Michell, Craig T; Apprill, Amy; Voolstra, Christian R

2014-01-01

324

Genome Sequence of Fusarium graminearum Isolate CS3005  

PubMed Central

Fusarium graminearum is one of the most important fungal pathogens of wheat, barley, and maize worldwide. This announcement reports the genome sequence of a highly virulent Australian isolate of this species to supplement the existing genome of the North American F. graminearum isolate Ph1. PMID:24744326

Stiller, Jiri; Kazan, Kemal

2014-01-01

325

Draft Genome Sequence of Amycolatopsis decaplanina Strain DSM 44594T  

PubMed Central

We report the 8.5-Mb genome sequence of Amycolatopsis decaplanina strain DSM 44594T, isolated from a soil sample from India. The draft genome of strain DSM 44594T consists of 8,533,276 bp with a 68.6% G+C content, 7,899 protein-coding genes, and 57 RNAs. PMID:23558534

Kaur, Navjot; Kumar, Shailesh; Bala, Monu; Raghava, Gajendra Pal Singh

2013-01-01

326

Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila  

Microsoft Academic Search

The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

2008-01-01

327

Draft genome sequences of 10 strains of the genus Exiguobacterium.  

PubMed

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

2014-01-01

328

A new approach to genome mapping and sequencing: slalom libraries  

Microsoft Academic Search

We describe here an efficient strategy for simultaneous genome mapping and sequencing. The approach is based on physically oriented, overlapping restriction fragment libraries called slalom libraries. Slalom libraries combine features of general genomic, jumping and linking libraries. Slalom libraries can be adapted to different applications and two main types of slalom libraries are described in detail. This approach was used

Veronika I. Zabarovska; Rinat Z. Gizatullin; Ali N. Al-Amin; Raf Podowski; Alexei I. Protopopov; Sven Löfdahl; Claes Wahlestedt; Gösta Winberg; Vladimir I. Kashuba; Ingemar Ernberg; Eugene R. Zabarovsky

2002-01-01

329

The Genomic Sequence of the Accidental Pathogen Legionella pneumophila  

Microsoft Academic Search

We present the genomic sequence of Legionella pneumophila, the bacterial agent of Legionnaires' disease, a potentially fatal pneumonia acquired from aerosolized contaminated fresh water. The genome includes a 45-kilobase pair element that can exist in chromosomal and episomal forms, selective expansions of important gene families, genes for unexpected metabolic pathways, and previously unknown candidate virulence determinants. We highlight the genes

Minchen Chien; Irina Morozova; Shundi Shi; Huitao Sheng; Jing Chen; Shawn M. Gomez; Gifty Asamani; Kendra Hill; John Nuara; Marc Feder; Justin Rineer; Joseph J. Greenberg; Valeria Steshenko; Samantha H. Park; Baohui Zhao; Elita Teplitskaya; John R. Edwards; Sergey Pampou; Anthi Georghiou; I.-Chun Chou; William Iannuccilli; Michael E. Ulz; Dae H. Kim; Alex Geringer-Sameth; Curtis Goldsberry; Pavel Morozov; Stuart G. Fischer; Gil Segal; Xiaoyan Qu; Andrey Rzhetsky; Peisen Zhang; Eftihia Cayanis; Pieter J. De Jong; Jingyue Ju; Sergey Kalachikov; Howard A. Shuman; James J. Russo

2004-01-01

330

Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis  

Microsoft Academic Search

We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated

J. M. Carlton; R. P. Hirt; J. C. Silva; A. L. Delcher; Michael Schatz; Qi Zhao; J. R. Wortman; S. L. Bidwell; U. C. M. Alsmark; Sébastien Besteiro; Thomas Sicheritz-Ponten; C. J. Noel; J. B. Dacks; P. G. Foster; Cedric Simillion; Y. Van de Peer; Diego Miranda-Saavedra; G. J. Barton; G. D. Westrop; S. Muller; Daniele Dessi; P. L. Fiori; Qinghu Ren; Ian Paulsen; Hanbang Zhang; F. D. Bastida-Corcuera; Augusto Simoes-Barbosa; M. T. Brown; R. D. Hayes; Mandira Mukherjee; C. Y. Okumura; Rachel Schneider; A. J. Smith; Stepanka Vanacova; Maria Villalvazo; B. J. Haas; Mihaela Pertea; Tamara V. Feldblyum; T. R. Utterback; Chung-Li Shu; Kazutoyo Osoegawa; P. J. de Jong; Ivan Hrdy; Lenka Horvathova; Zuzana Zubacova; Pavel Dolezal; Shehre-Banoo Malik; J. M. Logsdon; Katrin Henze; Arti Gupta; Ching C. Wang; R. L. Dunne; J. A. Upcroft; Peter Upcroft; Owen White; S. L. Salzberg; Petrus Tang; Cheng-Hsun Chiu; Ying-Shiung Lee; T. M. Embley; G. H. Coombs; J. C. Mottram; Jan Tachezy; C. M. Fraser-Liggett; P. J. Johnson

2007-01-01

331

Draft Genome Sequence of Enterobacter cloacae Strain JD6301  

PubMed Central

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

2014-01-01

332

Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium  

PubMed Central

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

2014-01-01

333

Detecting selection using a single genome sequence of  

E-print Network

strength of selection on each gene in the entire genomes of Mycobacterium tuberculosis and Plasmodium falci genome sequence of M. tuberculosis and P. falciparum Joshua B. Plotkin1 , Jonathan Dushoff2,3 & Hunter B, particularly the PE/PPE family2 of putative surface proteins in M. tuberculosis and the EMP1 family3

Plotkin, Joshua B.

334

Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii  

Microsoft Academic Search

The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

1996-01-01

335

The Genome Sequence of the SARS-Associated Coronavirus  

Microsoft Academic Search

We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously

Marco A. Marra; Steven J. M. Jones; Caroline R. Astell; Robert A. Holt; Angela Brooks-Wilson; Yaron S. N. Butterfield; Jaswinder Khattra; Jennifer K. Asano; Sarah A. Barber; Susanna Y. Chan; Alison Cloutier; Shaun M. Coughlin; Doug Freeman; Noreen Girn; Obi L. Griffith; Stephen R. Leach; Michael Mayo; Helen McDonald; Stephen B. Montgomery; Pawan K. Pandoh; Anca S. Petrescu; A. Gordon Robertson; Jacqueline E. Schein; Asim Siddiqui; Duane E. Smailus; Jeff M. Stott; George S. Yang; Francis Plummer; Anton Andonov; Harvey Artsob; Nathalie Bastien; Kathy Bernard; Timothy F. Booth; Donnie Bowness; Michael Drebot; Lisa Fernando; Ramon Flick; Michael Garbutt; Michael Garbutt; Allen Grolla; Heinz Feldmann; Adrienne Meyers; Amin Kabani; Yan Li; Susan Normand; Ute Stroher; Graham A. Tipples; Shaun Tyler; Robert Vogrig; Diane Ward; Robert C. Brunham; Mel Krajden; Martin Petric; Danuta M. Skowronski; Chris Upton; Rachel L. Roper

2003-01-01

336

Complete genome sequence of a raccoon rabies virus isolate  

Microsoft Academic Search

The entire genome of a mid-Atlantic raccoon strain rabies virus (RRV) isolated in Canada was sequenced; this is the second North American wildlife rabies virus isolate to be fully characterized. The overall organization and length of the genome was similar to that of other lyssaviruses. The nucleotide sequence identity of the raccoon strain ranged between 32.7% and 85.0% when compared

Annamaria G. Szanto; Susan A. Nadin-Davis; Bradley N. White

2008-01-01

337

Comparative Genome Analysis at the Sequence Level in the Brassicaceae  

Microsoft Academic Search

In the world of plant genome sequencing, the cultivated Brassica species have been relatively under-resourced compared with other crop species largely due to their position in the economic hierarchy of perceived importance. Thus, with the completion of the Arabidopsis thaliana genome in the year 2000, the limited sequencing efforts undertaken in the Brassica crops and other species of the Brassicaceae

Chris Town; Renate Schmidt; Ian Bancroft

338

Genomic distribution of simple sequence repeats in Brassica rapa.  

PubMed

Simple Sequence Repeats (SSRs) represent short tandem duplications found within all eukaryotic organisms. To examine the distribution of SSRs in the genome of Brassica rapa ssp. pekinensis, SSRs from different genomic regions representing 17.7 Mb of genomic sequence were surveyed. SSRs appear more abundant in non-coding regions (86.6%) than in coding regions (13.4%). Comparison of SSR densities in different genomic regions demonstrated that SSR density was greatest within the 5'-flanking regions of the predicted genes. The proportion of different repeat motifs varied between genomic regions, with trinucleotide SSRs more prevalent in predicted coding regions, reflecting the codon structure in these regions. SSRs were also preferentially associated with gene-rich regions, with peri-centromeric heterochromatin SSRs mostly associated with retrotransposons. These results indicate that the distribution of SSRs in the genome is non-random. Comparison of SSR abundance between B. rapa and the closely related species Arabidopsis thaliana suggests a greater abundance of SSRs in B. rapa, which may be due to the proposed genome triplication. Our results provide a comprehensive view of SSR genomic distribution and evolution in Brassica for comparison with the sequenced genomes of A. thaliana and Oryza sativa. PMID:17646709

Hong, Chang Pyo; Piao, Zhong Yun; Kang, Tae Wook; Batley, Jacqueline; Yang, Tae-Jin; Hur, Yoon-Kang; Bhak, Jong; Park, Beom-Seok; Edwards, David; Lim, Yong Pyo

2007-06-30

339

The sequencing of the human genome and the entire genomes of many model organisms has resulted in the identification of  

E-print Network

313 The sequencing of the human genome and the entire genomes of many model organisms has resulted TF transcription factor Introduction The sequencing of the human genome and the entire genomes genomics techniques for mapping transcription regulatory networks have evolved on the basis of advances

340

Sequencing viral genomes from a single isolated plaque  

PubMed Central

Background Whole genome sequencing of viruses and bacteriophages is often hindered because of the need for large quantities of genomic material. A method is described that combines single plaque sequencing with an optimization of Sequence Independent Single Primer Amplification (SISPA). This method can be used for de novo whole genome next-generation sequencing of any cultivable virus without the need for large-scale production of viral stocks or viral purification using centrifugal techniques. Methods A single viral plaque of a variant of the 2009 pandemic H1N1 human Influenza A virus was isolated and amplified using the optimized SISPA protocol. The sensitivity of the SISPA protocol presented here was tested with bacteriophage F_HA0480sp/Pa1651 DNA. The amplified products were sequenced with 454 and Illumina HiSeq platforms. Mapping and de novo assemblies were performed to analyze the quality of data produced from this optimized method. Results Analysis of the sequence data demonstrated that from a single viral plaque of Influenza A, a mapping assembly with 3590-fold average coverage representing 100% of the genome could be produced. The de novo assembled data produced contigs with 30-fold average sequence coverage, representing 96.5% of the genome. Using only 10 pg of starting DNA from bacteriophage F_HA0480sp/Pa1651 in the SISPA protocol resulted in sequencing data that gave a mapping assembly with 3488-fold average sequence coverage, representing 99.9% of the reference and a de novo assembly with 45-fold average sequence coverage, representing 98.1% of the genome. Conclusions The optimized SISPA protocol presented here produces amplified product that when sequenced will give high quality data that can be used for de novo assembly. The protocol requires only a single viral plaque or as little as 10 pg of DNA template, which will facilitate rapid identification of viruses during an outbreak and viruses that are difficult to propagate. PMID:23742765

2013-01-01
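
The fold-coverage and percent-of-genome figures quoted in the abstract above correspond to two standard assembly metrics: average depth and breadth of coverage. The Python sketch below shows how such numbers are commonly computed. It is an illustration under stated assumptions, not the protocol used in the study: the function names and the toy inputs are hypothetical, and every read is assumed to map to the reference.

```python
from typing import Sequence


def fold_coverage(read_lengths: Sequence[int], genome_length: int) -> float:
    """Average depth: total sequenced bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


def breadth_of_coverage(depth_per_position: Sequence[int]) -> float:
    """Percentage of reference positions covered by at least one read."""
    if not depth_per_position:
        raise ValueError("depth_per_position must not be empty")
    covered = sum(1 for depth in depth_per_position if depth > 0)
    return 100.0 * covered / len(depth_per_position)


# Hypothetical toy data: thirty 100-base reads on a 1,000-base reference.
print(fold_coverage([100] * 30, 1_000))
# Hypothetical per-position depths for a four-base reference.
print(breadth_of_coverage([0, 1, 2, 3]))
```

In practice the per-position depths would come from an alignment tool rather than a hand-written list.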

341

Genomic sequence analysis and characterization of Sneathia amnii sp. nov  

PubMed Central

Background Bacteria of the genus Sneathia are emerging as potential pathogens of the female reproductive tract. Species of Sneathia, which were formerly grouped with Leptotrichia, can be part of the normal microbiota of the genitourinary tracts of men and women, but they are also associated with a variety of clinical conditions including bacterial vaginosis, preeclampsia, preterm labor, spontaneous abortion, post-partum bacteremia and other invasive infections. Sneathia species also exhibit a significant correlation with sexually transmitted diseases and cervical cancer. Because Sneathia species are fastidious and rarely cultured successfully in vitro, and the genomes of members of the genus had until now not been characterized, very little is known about the physiology or the virulence of these organisms. Results Here, we describe a novel species, Sneathia amnii sp. nov, which closely resembles bacteria previously designated "Leptotrichia amnionii". As part of the Vaginal Human Microbiome Project at VCU, a vaginal isolate of S. amnii sp. nov. was identified, successfully cultured and bacteriologically cloned. The biochemical characteristics and virulence properties of the organism were examined in vitro, and the genome of the organism was sequenced, annotated and analyzed. The analysis revealed a reduced circular genome of ~1.34 Mbp, containing ~1,282 protein-coding genes. Metabolic reconstruction of the bacterium reflected its biochemical phenotype, and several genes potentially associated with pathogenicity were identified. Conclusions Bacteria with complex growth requirements frequently remain poorly characterized and, as a consequence, their roles in health and disease are unclear. Elucidation of the physiology and identification of genes putatively involved in the metabolism and virulence of S. amnii may lead to a better understanding of the role of this potential pathogen in bacterial vaginosis, preterm birth, and other issues associated with vaginal and reproductive health. PMID:23281612

2012-01-01

342

Large-Scale Sequencing: The Future of Genomic Sciences Colloquium  

SciTech Connect

Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and achieving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to be able to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them. 
Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

Margaret Riley; Merry Buckley

2009-01-01

343

Mitochondrial Genome Sequence of the Legume Vicia faba.  

PubMed

The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found, 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376

Negruk, Valentine

2013-01-01

344

Genome sequence of the date palm Phoenix dactylifera L  

PubMed Central

Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

2013-01-01

345

Complete Genome Sequence of Methanobacterium thermoautotrophicum DH: Functional Analysis and Comparative Genomics  

Microsoft Academic Search

The complete 1,751,377-bp sequence of the genome of the thermophilic archaeon Methanobacterium thermoautotrophicum DH has been determined by a whole-genome shotgun sequencing approach. A total of 1,855 open reading frames (ORFs) have been identified that appear to encode polypeptides, 844 (46%) of which have been assigned putative functions based on their similarities to database sequences with assigned functions. A

DOUGLAS R. SMITH; LYNN A. DOUCETTE-STAMM; CRAIG DELOUGHERY; HONGMEI LEE; JOANN DUBOIS; TYLER ALDREDGE; ROMINA BASHIRZADEH; DERRON BLAKELY; ROBIN COOK; KATIE GILBERT; DAWN HARRISON; LIEU HOANG; PAMELA KEAGLE; WENDY LUMM; BRYAN POTHIER; DAYONG QIU; ROB SPADAFORA; RITA VICAIRE; YING WANG; JAMEY WIERZBOWSKI; RENE GIBSON; NILOFER JIWANI; ANTHONY CARUSO; DAVID BUSH; HERSHEL SAFER; DONIVAN PATWELL; SHASHI PRABHAKAR; STEVE MCDOUGALL; GEORGE SHIMER; ANIL GOYAL; SHMUEL PIETROKOVSKI; GEORGE M. CHURCH; CHARLES J. DANIELS; JEN-I MAO; PHIL RICE; JORK NOLLING; JOHN N. REEVE

1997-01-01

346

Characterizing the walnut genome through analyses of BAC end sequences.  

PubMed

Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSR) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSR in the walnut genome analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome represents a coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut. PMID:22101470

Wu, Jiajie; Gu, Yong Q; Hu, Yuqin; You, Frank M; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Dvorak, Jan; Luo, Ming-Cheng

2012-01-01
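
The sampling figures in the abstract above (48,218 BESs totalling 31.2 Mb, or roughly 5.1% of the genome) imply an average read length and an approximate genome size. The short Python sketch below works through that arithmetic. The function names are hypothetical, and the derived values are back-of-the-envelope estimates from the rounded numbers in the abstract, not results reported by the authors.

```python
def mean_read_length(total_bp: float, read_count: int) -> float:
    """Average length in bp of the sampled end sequences."""
    return total_bp / read_count


def implied_genome_size_mb(sampled_mb: float, genome_fraction: float) -> float:
    """Genome size in Mb implied by a sample covering a known fraction."""
    if not 0 < genome_fraction <= 1:
        raise ValueError("genome_fraction must be in (0, 1]")
    return sampled_mb / genome_fraction


# Figures quoted in the abstract; the results are rough estimates only.
bes_count = 48_218
sampled_mb = 31.2
genome_fraction = 0.051

print(f"mean BES length: {mean_read_length(sampled_mb * 1e6, bes_count):.0f} bp")
print(f"implied genome size: {implied_genome_size_mb(sampled_mb, genome_fraction):.0f} Mb")
```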

347

Translating the cancer genome: Going beyond p values  

E-print Network

from cancer biology, cancer genetics, cancer modeling andcancers of different cell lineages can be triangulated with genomic and biological data from tumors and geneticsCancer 100, 1459- Furnari, F.B. et al. Malignant astrocytic glioma: genetics,

Chin, Lynda

2008-01-01

348

Complete genome sequence of Serratia plymuthica strain AS12  

PubMed Central

A plant-associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest because it promotes plant growth and inhibits plant pathogens. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled “Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens”. PMID:22768360

Finlay, Roger D.; Alstrom, Sadhna; Goodwin, Lynne; Kyrpides, Nikos C.; Lucas, Susan; Lapidus, Alla; Bruce, David; Pitluck, Sam; Peters, Lin; Ovchinnikova, Galina; Chertkov, Olga; Han, James; Han, Cliff; Tapia, Roxanne; Detter, John C.; Land, Miriam; Hauser, Loren; Cheng, Jan-Fang; Ivanova, Natalia; Pagani, Ioanna; Klenk, Hans-Peter; Woyke, Tanja; Hogberg, Nils

2012-01-01

349

RESTseq - Efficient Benchtop Population Genomics with RESTriction Fragment SEQuencing  

PubMed Central

We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

Stolle, Eckart; Moritz, Robin F. A.

2013-01-01

350

RESTseq--efficient benchtop population genomics with RESTriction Fragment SEQuencing.  

PubMed

We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

Stolle, Eckart; Moritz, Robin F A

2013-01-01

351

Complete genome sequence of Serratia plymuthica strain AS12  

SciTech Connect

A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

2012-01-01

352

Complete genome sequence of Ferroglobus placidus AEDII12DO  

SciTech Connect

Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute]; Risso, Carla [University of Massachusetts, Amherst]; Holmes, Dawn [University of Massachusetts, Amherst]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Brettin, Thomas S [ORNL]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Larimer, Frank W [ORNL]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Lovley, Derek [University of Massachusetts, Amherst]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]

2011-01-01

353

Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis.  

PubMed

We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the approximately 160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated in pathogenesis and phagocytosis of host proteins may exemplify adaptations of the parasite during its transition to a urogenital environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria. PMID:17218520

Carlton, Jane M; Hirt, Robert P; Silva, Joana C; Delcher, Arthur L; Schatz, Michael; Zhao, Qi; Wortman, Jennifer R; Bidwell, Shelby L; Alsmark, U Cecilia M; Besteiro, Sébastien; Sicheritz-Ponten, Thomas; Noel, Christophe J; Dacks, Joel B; Foster, Peter G; Simillion, Cedric; Van de Peer, Yves; Miranda-Saavedra, Diego; Barton, Geoffrey J; Westrop, Gareth D; Müller, Sylke; Dessi, Daniele; Fiori, Pier Luigi; Ren, Qinghu; Paulsen, Ian; Zhang, Hanbang; Bastida-Corcuera, Felix D; Simoes-Barbosa, Augusto; Brown, Mark T; Hayes, Richard D; Mukherjee, Mandira; Okumura, Cheryl Y; Schneider, Rachel; Smith, Alias J; Vanacova, Stepanka; Villalvazo, Maria; Haas, Brian J; Pertea, Mihaela; Feldblyum, Tamara V; Utterback, Terry R; Shu, Chung-Li; Osoegawa, Kazutoyo; de Jong, Pieter J; Hrdy, Ivan; Horvathova, Lenka; Zubacova, Zuzana; Dolezal, Pavel; Malik, Shehre-Banoo; Logsdon, John M; Henze, Katrin; Gupta, Arti; Wang, Ching C; Dunne, Rebecca L; Upcroft, Jacqueline A; Upcroft, Peter; White, Owen; Salzberg, Steven L; Tang, Petrus; Chiu, Cheng-Hsun; Lee, Ying-Shiung; Embley, T Martin; Coombs, Graham H; Mottram, Jeremy C; Tachezy, Jan; Fraser-Liggett, Claire M; Johnson, Patricia J

2007-01-12

354

Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis  

PubMed Central

We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated in pathogenesis and phagocytosis of host proteins may exemplify adaptations of the parasite during its transition to a urogenital environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria. PMID:17218520

Carlton, Jane M.; Hirt, Robert P.; Silva, Joana C.; Delcher, Arthur L.; Schatz, Michael; Zhao, Qi; Wortman, Jennifer R.; Bidwell, Shelby L.; Alsmark, U. Cecilia M.; Besteiro, Sébastien; Sicheritz-Ponten, Thomas; Noel, Christophe J.; Dacks, Joel B.; Foster, Peter G.; Simillion, Cedric; Van de Peer, Yves; Miranda-Saavedra, Diego; Barton, Geoffrey J.; Westrop, Gareth D.; Müller, Sylke; Dessi, Daniele; Fiori, Pier Luigi; Ren, Qinghu; Paulsen, Ian; Zhang, Hanbang; Bastida-Corcuera, Felix D.; Simoes-Barbosa, Augusto; Brown, Mark T.; Hayes, Richard D.; Mukherjee, Mandira; Okumura, Cheryl Y.; Schneider, Rachel; Smith, Alias J.; Vanacova, Stepanka; Villalvazo, Maria; Haas, Brian J.; Pertea, Mihaela; Feldblyum, Tamara V.; Utterback, Terry R.; Shu, Chung-Li; Osoegawa, Kazutoyo; de Jong, Pieter J.; Hrdy, Ivan; Horvathova, Lenka; Zubacova, Zuzana; Dolezal, Pavel; Malik, Shehre-Banoo; Logsdon, John M.; Henze, Katrin; Gupta, Arti; Wang, Ching C.; Dunne, Rebecca L.; Upcroft, Jacqueline A.; Upcroft, Peter; White, Owen; Salzberg, Steven L.; Tang, Petrus; Chiu, Cheng-Hsun; Lee, Ying-Shiung; Embley, T. Martin; Coombs, Graham H.; Mottram, Jeremy C.; Tachezy, Jan; Fraser-Liggett, Claire M.; Johnson, Patricia J.

2007-01-01

355

Comparison of Sample Sequences of the Salmonella typhi Genome to the Sequence of the Complete Escherichia coli K-12 Genome  

PubMed Central

Raw sequence data representing the majority of a bacterial genome can be obtained at a tiny fraction of the cost of a completed sequence. To demonstrate the utility of such a resource, 870 single-stranded M13 clones were sequenced from a shotgun library of the Salmonella typhi Ty2 genome. The sequence reads averaged over 400 bases and sampled the genome with an average spacing of once every 5,000 bases. A total of 339,243 bases of unique sequence was generated (approximately 7% representation). The sample of 870 sequences was compared to the complete Escherichia coli K-12 genome and to the rest of the GenBank database, which can also be considered a collection of sampled sequences. Despite the incomplete S. typhi data set, interesting categories could easily be discerned. Sixteen percent of the sequences determined from S. typhi had close homologs among known Salmonella sequences (P < 1e-40 in BlastX or BlastN), reflecting the proportion of these genomes that have been sequenced previously; 277 sequences (32%) had no apparent orthologs in the complete E. coli K-12 genome (P > 1e-20), of which 155 sequences (18%) had no close similarities to any sequence in the database (P > 1e-5). Eight of the 277 sequences had similarities to genes in other strains of E. coli or plasmids, and six sequences showed evidence of novel phage lysogens or sequence remnants of phage integrations, including a member of the lambda family (P < 1e-15). Twenty-three sample sequences had a significantly closer similarity to a sequence in the database from organisms other than the E. coli/Salmonella clade (which includes Shigella and Citrobacter). These sequences are new candidate lateral transfer events to the S. typhi lineage or deletions on the E. coli K-12 lineage. Eleven putative junctions of insertion/deletion events greater than 100 bp were observed in the sample, indicating that well over 150 such events may distinguish S. typhi from E. coli K-12. The need for automatic methods to more effectively exploit sample sequences is discussed. PMID:9712782

McClelland, Michael; Wilson, Richard K.

1998-01-01
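
As a rough check on the arithmetic in the Salmonella typhi sample-sequencing abstract above, the sketch below assumes simple linear scaling: if a sample represents about 7% of a genome, a count observed in the sample is divided by 0.07 to estimate the genome-wide total. The abstract does not state how its figures were derived, so the function names and the scaling rule are illustrative assumptions, not the authors' method.

```python
def implied_genome_size(unique_bases: int, sampled_fraction: float) -> float:
    """Genome size implied by the unique bases sampled and the fraction they represent."""
    if not 0 < sampled_fraction <= 1:
        raise ValueError("sampled_fraction must be in (0, 1]")
    return unique_bases / sampled_fraction


def extrapolate_events(observed: int, sampled_fraction: float) -> float:
    """Scale a count seen in the sample up to the whole genome (assumes uniform sampling)."""
    if not 0 < sampled_fraction <= 1:
        raise ValueError("sampled_fraction must be in (0, 1]")
    return observed / sampled_fraction


if __name__ == "__main__":
    # Figures quoted in the abstract: 339,243 unique bases, about 7% representation,
    # and 11 putative insertion/deletion junctions observed in the sample.
    print(f"implied genome size: {implied_genome_size(339_243, 0.07):,.0f} bp")
    print(f"extrapolated indel events: {extrapolate_events(11, 0.07):.0f}")
```

Under this assumption, 11 observed junctions scale to a little over 150 genome-wide, which is consistent with the abstract's "well over 150 such events".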

356

A survey of tools for variant analysis of next-generation genome sequencing data.  

PubMed

Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494

Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R; Zschocke, Johannes; Trajanoski, Zlatko

2014-03-01

357

Short reads, circular genome: skimming solid sequence to construct the bighorn sheep mitochondrial genome.  

PubMed

As sequencing technology improves, an increasing number of projects aim to generate full genome sequence, even for nonmodel taxa. These projects may be feasibly conducted at lower read depths if the alignment can be aided by previously developed genomic resources from a closely related species. We investigated the feasibility of constructing a complete mitochondrial (mt) genome without preamplification or other targeting of the sequence. Here we present a full mt genome sequence (16,463 nucleotides) for the bighorn sheep (Ovis canadensis) generated through alignment of SOLiD short-read sequences to a reference genome. Average read depth was 1240, and each base was covered by at least 36 reads. We then conducted a phylogenomic analysis with 27 other bovid mitogenomes, which placed bighorn sheep firmly in the Ovis clade. These results show that it is possible to generate a complete mitogenome by skimming a low-coverage genomic sequencing library. This technique will become increasingly applicable as the number of taxa with some level of genome sequence rises. PMID:21948953

Miller, Joshua M; Malenfant, René M; Moore, Stephen S; Coltman, David W

2012-01-01

358

Pattern discovery and cancer gene identification in integrated cancer genomic data  

PubMed Central

Large-scale integrated cancer genome characterization efforts, including The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia, have created unprecedented opportunities to study cancer biology in the context of knowing the entire catalog of genetic alterations. A clinically important challenge is to discover cancer subtypes and their molecular drivers in a comprehensive genetic context. Curtis et al. [Nature (2012) 486(7403):346–352] have recently shown that integrative clustering of copy number and gene expression in 2,000 breast tumors reveals novel subgroups beyond the classic expression subtypes that show distinct clinical outcomes. To extend the scope of integrative analysis for the inclusion of somatic mutation data by massively parallel sequencing, we propose a framework for joint modeling of discrete and continuous variables that arise from integrated genomic, epigenomic, and transcriptomic profiling. The core idea is motivated by the hypothesis that diverse molecular phenotypes can be predicted by a set of orthogonal latent variables that represent distinct molecular drivers, and thus can reveal tumor subgroups of biological and clinical importance. Using the Cancer Cell Line Encyclopedia dataset, we demonstrate our method can accurately group cell lines by their cell-of-origin for several cancer types, and precisely pinpoint their known and potential cancer driver genes. Our integrative analysis also demonstrates the power for revealing subgroups that are not lineage-dependent, but consist of different cancer types driven by a common genetic alteration. Application to The Cancer Genome Atlas colorectal cancer data reveals distinct integrated tumor subtypes, suggesting different genetic pathways in colon cancer progression. PMID:23431203

Mo, Qianxing; Wang, Sijian; Seshan, Venkatraman E.; Olshen, Adam B.; Schultz, Nikolaus; Sander, Chris; Powers, R. Scott; Ladanyi, Marc; Shen, Ronglai

2013-01-01

359

Corruption of genomic databases with anomalous sequence.  

PubMed Central

We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%. PMID:1614861

Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

1992-01-01

360

The consequences of structural genomic alterations in humans: Genomic Disorders, genomic instability and cancer  

Microsoft Academic Search

Over the last decade or so, sophisticated technological advances in array-based genomics have firmly established the contribution of structural alterations in the human genome to a variety of complex developmental disorders, and also to diseases such as cancer. In fact, multiple ‘novel’ disorders have been identified as a direct consequence of these advances. Our understanding of the molecular events leading

Rita Colnaghi; Gillian Carpenter; Marcel Volker; Mark O’Driscoll

361

Complete Genome Sequence of a Novel Pestivirus from Sheep  

PubMed Central

We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species. PMID:22997427

Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

2012-01-01

362

Triplex-forming oligonucleotide target sequences in the human genome  

Microsoft Academic Search

The existence of sequences in the human genome which can be a target for triplex formation, and accordingly are candidates for anti-gene therapies, has been studied by using bioinformatics tools. It was found that the population of triplex-forming oligonucleotide target sequences (TTS) is much more abundant than that expected from simple random models. The population of TTS is large in

J. Ramon Goni; Xavier de la Cruz; Modesto Orozco

2004-01-01

363

Genome Sequence of Fusobacterium nucleatum Subspecies Polymorphum — a Genetically Tractable  

Microsoft Academic Search

Fusobacterium nucleatum is a prominent member of the oral microbiota and is a common cause of human infection. F. nucleatum includes five subspecies: polymorphum, nucleatum, vincentii, fusiforme, and animalis. F. nucleatum subsp. polymorphum ATCC 10953 has been well characterized phenotypically and, in contrast to previously sequenced strains, is amenable to gene transfer. We sequenced and annotated the 2,429,698 bp genome

Sandor E. Karpathy; Xiang Qin; Jason Gioia; Huaiyang Jiang; Yamei Liu; Joseph F. Petrosino; Shailaja Yerrapragada; George E. Fox; Susan Kinder Haake; George M. Weinstock; Sarah K. Highlander

364

Genome Sequences of Vibrio navarrensis, a Potential Human Pathogen  

PubMed Central

Vibrio navarrensis is an aquatic bacterium recently shown to be associated with human illness. We report the first genome sequences of three V. navarrensis strains obtained from clinical and environmental sources. Preliminary analyses of the sequences reveal that V. navarrensis contains genes commonly associated with virulence in other human pathogens. PMID:25414502

Gladney, Lori M.; Katz, Lee S.; Knipe, Kristen M.; Rowe, Lori A.; Conley, Andrew B.; Rishishwar, Lavanya; Mariño-Ramírez, Leonardo

2014-01-01

365

Environmental Genome Shotgun Sequencing of the Sargasso Sea  

Microsoft Academic Search

We have applied “whole-genome shotgun sequencing” to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These

J. Craig Venter; Karin Remington; John F. Heidelberg; Aaron L. Halpern; Doug Rusch; Dongying Wu; Ian Paulsen; Karen E. Nelson; William Nelson; Derrick E. Fouts; Samuel Levy; Anthony H. Knap; Michael W. Lomas; Ken Nealson; Owen White; Jeremy Peterson; Jeff Hoffman; Rachel Parsons; Holly Baden-Tillson; Cynthia Pfannkoch; Yu-Hui Rogers; Hamilton O. Smith

2004-01-01

366

Revisiting the sequencing of the first tree genome: Populus trichocarpa.  

PubMed

Ten years ago, it was announced that the Joint Genome Institute with funds provided by the Department of Energy, Office of Science, Biological and Environmental Research would sequence the black cottonwood (Populus trichocarpa Torr. & Gray) genome. This landmark decision was the culmination of work by the forest science community to develop Populus as a model system. Since its public release in late 2006, the availability of the Populus genome has spawned research in plant biology, morphology, genetics and ecology. Here we address how the tree physiologist has used this resource. More specifically, we revisit our earlier contention that the rewards of sequencing the Populus genome would depend on how quickly scientists working with woody perennials could adopt molecular approaches to investigate the mechanistic underpinnings of basic physiological processes. Several examples illustrate the integration of functional and comparative genomics into the forest sciences, especially in areas that target improved understanding of the developmental differences between woody perennials and herbaceous annuals (e.g., phase transitions). Sequencing the Populus genome and the availability of genetic and genomic resources has also been instrumental in identifying candidate genes that underlie physiological and morphological traits of interest. Genome-enabled research has advanced our understanding of how phenotype and genotype are related and provided insights into the genetic mechanisms whereby woody perennials adapt to environmental stress. In the future, we anticipate that low-cost, high-throughput sequencing will continue to facilitate research in tree physiology and enhance our understanding at scales of individual organisms and populations. A challenge remains, however, as to how genomic resources, including the Populus genome, can be used to understand ecosystem function. Although examples are limited, progress in this area is encouraging and will undoubtedly improve as future research targets the many unique aspects of Populus as a keystone species in terrestrial ecosystems. PMID:23100257

Wullschleger, Stan D; Weston, D J; DiFazio, S P; Tuskan, G A

2013-04-01

367

Sequence Analysis of the Genome of Carnation (Dianthus caryophyllus L.)  

PubMed Central

The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172

Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

2014-01-01
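
The carnation abstract above reports N50 values and a coverage percentage without defining them. The sketch below uses the common definition of N50 (the length of the sequence at which the running total of lengths, sorted longest first, reaches half the assembly size) and checks the quoted 91% coverage, assuming 1 Mb = 1,000,000 bp. The function names and the toy scaffold lengths are hypothetical; the authors' own pipeline is not described in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def assembly_coverage(assembled_bp: int, estimated_genome_bp: int) -> float:
    """Fraction of the estimated genome size represented by the assembly."""
    return assembled_bp / estimated_genome_bp


if __name__ == "__main__":
    toy_scaffolds = [80, 70, 50, 40, 30, 20]  # illustrative lengths, not real data
    print("toy N50:", n50(toy_scaffolds))
    # Figures quoted in the abstract: 568 887 315 bp assembled, 622 Mb estimated.
    print(f"coverage: {assembly_coverage(568_887_315, 622_000_000):.1%}")
```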

368

Understanding Cancer Series: Genome-Wide Profiling  

Cancer.gov

Single-gene tests focus on a specific, known location in a patient’s genome. Using this approach, scientists have looked for single genes linked to cancer. This research has revealed some important discoveries such as gene changes called mutations located within the BRCA1 or BRCA2 genes that may confer a significantly increased risk of breast and ovarian cancer. And some single-gene tests continue to inform treatment decisions.

369

Understanding Cancer Series: Genome-Wide Profiling  

Cancer.gov

Single-gene tests focus on a specific location in a patient's genome. Using this approach, scientists have looked for single genes linked to cancer. This research has revealed some important discoveries such as gene changes called mutations located within the BRCA1 or BRCA2 genes that may confer a significantly increased risk of breast and ovarian cancer. And some single-gene tests continue to inform treatment decisions.

370

Next-generation sequencing and large genome assemblies  

PubMed Central

The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches to assembly are outlined and the various software packages available are introduced and compared. The question of whether quality assemblies can be produced using short-read NGS data alone, or whether it must be combined with more expensive sequencing techniques, is considered. Prospects for future assemblers and tests of assembly performance are also discussed. PMID:22676195

Henson, Joseph; Tischler, German; Ning, Zemin

2012-01-01

371

Complete Genome Sequence of Treponema pallidum, the  

E-print Network

spirochete, Borrelia burgdorferi, the agent of Lyme disease, identified unique and common genes agent of Lyme disease, are similar in having relatively small genomes and surviving only. The disease quickly reached epidemic proportions in Europe and spread across the world during the early 16th

Salzberg, Steven

372

Genomics and proteomics: Emerging technologies in clinical cancer research  

Microsoft Academic Search

Fueled by the complete genomic data acquired from the human genome project and the desperate clinical need of comprehensive analytical tools to study a heterogeneous disease like cancer, genomic and proteomic technologies have evolved rapidly, accelerating the rate and number of discoveries in clinical cancer research. These discoveries include mechanistic understanding of cancer biology as well as the identification of

Christine H. Chung; Shawn Levy; Pierre Chaurand; David P. Carbone

2007-01-01

373

The Diploid Genome Sequence of an Individual Human  

Microsoft Academic Search

Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison

Samuel Levy; Granger Sutton; Pauline C Ng; Lars Feuk; Aaron L Halpern; Brian P Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul A Kravitz; Dana A Busam; Karen Y Beeson; Tina C McIntosh; Karin A Remington; Josep F Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin E Frazier; Stephen W Scherer; Robert L Strausberg; J. Craig Venter

2007-01-01

374

Sequence-Based Mapping of the Polyploid Wheat Genome  

PubMed Central

The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied the next-generation sequencing of a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatical pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome we showed that mapped markers are located within 40–100 kb from genes providing a possibility for high-resolution mapping at the level of a single gene. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and linking physical and genetic maps of the wheat genome. PMID:23665877

Saintenac, Cyrille; Jiang, Dayou; Wang, Shichen; Akhunov, Eduard

2013-01-01

375

Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida  

PubMed Central

Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.

2013-01-01

376

Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida.  

PubMed

Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

Abrahante, Juan E; Johnson, Timothy J; Hunter, Samuel S; Maheswaran, Samuel K; Hauglund, Melissa J; Bayles, Darrell O; Tatum, Fred M; Briggs, Robert E

2013-01-01

377

Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner  

PubMed Central

We define a “threaded blockset,” which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for “threaded blockset aligner”) builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser. PMID:15060014

Blanchette, Mathieu; Kent, W. James; Riemer, Cathy; Elnitski, Laura; Smit, Arian F.A.; Roskin, Krishna M.; Baertsch, Robert; Rosenbloom, Kate; Clawson, Hiram; Green, Eric D.; Haussler, David; Miller, Webb

2004-01-01

378

The Diploid Genome Sequence of an Individual Human  

PubMed Central

Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information. PMID:17803354

Levy, Samuel; Sutton, Granger; Ng, Pauline C; Feuk, Lars; Halpern, Aaron L; Walenz, Brian P; Axelrod, Nelson; Huang, Jiaqi; Kirkness, Ewen F; Denisov, Gennady; Lin, Yuan; MacDonald, Jeffrey R; Pang, Andy Wing Chun; Shago, Mary; Stockwell, Timothy B; Tsiamouri, Alexia; Bafna, Vineet; Bansal, Vikas; Kravitz, Saul A; Busam, Dana A; Beeson, Karen Y; McIntosh, Tina C; Remington, Karin A; Abril, Josep F; Gill, John; Borman, Jon; Rogers, Yu-Hui; Frazier, Marvin E; Scherer, Stephen W; Strausberg, Robert L; Venter, J. Craig

2007-01-01

379

Genome sequence of the pea aphid Acyrthosiphon pisum.  

PubMed

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

2010-02-01

380

Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)  

DOE Data Explorer

Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
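The tag-and-count idea described above can be mimicked in silico. The following is a minimal sketch, not the authors' pipeline: the recognition site "CATG" and the choice to take the tag immediately downstream of the site are illustrative assumptions, and the function names are hypothetical. Only the 21-bp tag length comes from the record.

```python
from collections import Counter

def extract_tags(sequence, site="CATG", tag_length=21):
    """Return the tag_length bases that follow each occurrence of site.

    `site` is an assumed recognition sequence used only for illustration.
    Tags that would run past the end of the sequence are discarded.
    """
    tags = []
    pos = sequence.find(site)
    while pos != -1:
        tag_start = pos + len(site)
        tag = sequence[tag_start:tag_start + tag_length]
        if len(tag) == tag_length:
            tags.append(tag)
        pos = sequence.find(site, pos + 1)
    return tags

def tag_profile(sequences, site="CATG", tag_length=21):
    """Count how often each tag occurs across a collection of sequences."""
    profile = Counter()
    for seq in sequences:
        profile.update(extract_tags(seq, site, tag_length))
    return profile

if __name__ == "__main__":
    # Toy input: one full-length tag, then a site too close to the end.
    toy = "CATG" + "A" * 21 + "CATG" + "C" * 5
    for tag, count in tag_profile([toy, toy]).items():
        print(tag, count)
```

The resulting counter plays the role of the abundance profile: each key is a tag sequence and each value is how many times it was observed.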

Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

381

Adaptive seeds tame genomic sequence comparison  

PubMed Central

The main way of analyzing biological sequences is by comparing and aligning them to each other. It remains difficult, however, to compare modern multi-billionbase DNA data sets. The difficulty is caused by the nonuniform (oligo)nucleotide composition of these sequences, rather than their size per se. To solve this problem, we modified the standard seed-and-extend approach (e.g., BLAST) to use adaptive seeds. Adaptive seeds are matches that are chosen based on their rareness, instead of using fixed-length matches. This method guarantees that the number of matches, and thus the running time, increases linearly, instead of quadratically, with sequence length. LAST, our open source implementation of adaptive seeds, enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition. PMID:21209072
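To make the rareness criterion concrete, here is a minimal sketch of choosing an adaptive seed. It assumes a seed is grown from a query position until it occurs no more than a chosen number of times in the reference. The function names and the max_hits parameter are hypothetical, and the naive substring counting is for illustration only; it is not how LAST itself is implemented.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def adaptive_seed(reference, query, start, max_hits=2):
    """Shortest match beginning at query[start] that is rare in reference.

    The match is lengthened one base at a time until it occurs at most
    max_hits times in the reference. Returns None if the match disappears
    from the reference or the query ends before it becomes rare enough.
    """
    for end in range(start + 1, len(query) + 1):
        seed = query[start:end]
        hits = count_occurrences(reference, seed)
        if hits == 0:
            return None
        if hits <= max_hits:
            return seed
    return None

if __name__ == "__main__":
    reference = "ACACACGTACAC"
    # "A" and "AC" are common in the reference, so the seed grows to "ACG".
    print(adaptive_seed(reference, "ACGT", 0, max_hits=1))
```

In a repetitive region the seed grows longer, and in a unique region it stays short, which is what keeps the number of seed matches bounded regardless of composition.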

Kielbasa, Szymon M.; Wan, Raymond; Sato, Kengo; Horton, Paul; Frith, Martin C.

2011-01-01

382

Genomics of Squamous Cell Lung Cancer  

PubMed Central

Approximately 30% of patients with non-small cell lung cancer have the squamous cell carcinoma (SQCC) histological subtype. Although targeted therapies have improved outcomes in patients with adenocarcinoma, no agents are currently approved specifically for use in SQCC. The Cancer Genome Atlas (TCGA) recently published the results of comprehensive genomic analyses of tumor samples from 178 patients with SQCC of the lung. In this review, we briefly discuss key molecular aberrations reported by TCGA and other investigators and their potential therapeutic implications. Carefully designed preclinical and clinical studies based on these large-scale genomic analyses are critical to improve the outcomes of patients with SQCC of lung in the near future. PMID:23728941

Rooney, Melissa; Devarakonda, Siddhartha

2013-01-01

383

MicroRNAs, Genomic Instability and Cancer  

PubMed Central

MicroRNAs (miRNAs) are small non-coding RNA transcripts approximately 20 nucleotides in length that regulate expression of protein-coding genes via complementary binding mechanisms. The last decade has seen an exponential increase of publications on miRNAs, ranging from every aspect of basic cancer biology to diagnostic and therapeutic explorations. In this review, we summarize findings of miRNA involvement in genomic instability, an interesting but largely neglected topic to date. We discuss the potential mechanisms by which miRNAs induce genomic instability, considered to be one of the most important driving forces of cancer initiation and progression, though its precise mechanisms remain elusive. We classify genomic instability mechanisms into defects in cell cycle regulation, DNA damage response, and mitotic separation, and review the findings demonstrating the participation of specific miRNAs in such mechanisms. PMID:25141103

Vincent, Kimberly; Pichler, Martin; Lee, Gyeong-Won; Ling, Hui

2014-01-01

384

Sequences Promoting Recoding Are Singular Genomic Elements  

Microsoft Academic Search

The distribution of sequences which induce non-standard decoding, especially of shift-prone sequences, is very unusual. On one hand, since they can disrupt standard genetic readout, they are avoided within the coding regions of most genes. On the other hand, they play important regulatory roles for the expression of those genes where they do occur. As a result, they are preserved

Pavel V. Baranov; Olga Gurvich

385

Draft genome sequence of the Tibetan antelope  

PubMed Central

The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation. PMID:23673643

Ge, Ri-Li; Cai, Qingle; Shen, Yong-Yi; San, A; Ma, Lan; Zhang, Yong; Yi, Xin; Chen, Yan; Yang, Lingfeng; Huang, Ying; He, Rongjun; Hui, Yuanyuan; Hao, Meirong; Li, Yue; Wang, Bo; Ou, Xiaohua; Xu, Jiaohui; Zhang, Yongfen; Wu, Kui; Geng, Chunyu; Zhou, Weiping; Zhou, Taicheng; Irwin, David M.; Yang, Yingzhong; Ying, Liu; Bao, Haihua; Kim, Jaebum; Larkin, Denis M.; Ma, Jian; Lewin, Harris A.; Xing, Jinchuan; Platt, Roy N.; Ray, David A.; Auvil, Loretta; Capitanu, Boris; Zhang, Xiufeng; Zhang, Guojie; Murphy, Robert W.; Wang, Jun; Zhang, Ya-Ping; Wang, Jian

2013-01-01

386

Rosaceaous Genome Sequencing: Perspectives and Progress  

Microsoft Academic Search

The long-term goal of plant genomics is to identify, isolate and determine the function of plant genes that are associated with both vegetative and reproductive phenotypes. Most phenotypes require the coordinated activity and regulatory control of suites of genes over time and in precise positions within the plant. Until recently, the idea of establishing a comprehensive approach to isolate and

Bryon Sosinski; Vladimir Shulaev; Amit Dhingra; Ananth Kalyanaraman; Roger Bumgarner; Daniel Rokhsar; Ignazio Verde; Riccardo Velasco; Albert G. Abbott

387

A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)  

ScienceCinema

Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

FitzGerald, Michael [Broad Institute]

2013-02-12

388

Triticeae genomics: advances in sequence analysis of large genome cereal crops  

Microsoft Academic Search

Whole genome sequencing provides direct access to all genes of an organism and represents an essential step towards a systematic understanding of (crop) plant biology. Wheat and barley, two of the most important crop species worldwide, have two- to five-fold larger genomes than human – too large to be completely sequenced at current costs. Nevertheless, significant progress has been made

Nils Stein

2007-01-01

389

Complete genome sequence of Arcobacter nitrofigilis type strain (CIT)  

PubMed Central

Arcobacter nitrofigilis (McClung et al. 1983) Vandamme et al. 1991 is the type species of the genus Arcobacter in the family Campylobacteraceae within the Epsilonproteobacteria. The species was first described in 1983 as Campylobacter nitrofigilis [1] after its detection as a free-living, nitrogen-fixing Campylobacter species associated with Spartina alterniflora Loisel roots [2]. It is of phylogenetic interest because of its lifestyle as a symbiotic organism in a marine environment in contrast to many other Arcobacter species which are associated with warm-blooded animals and tend to be pathogenic. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a type strain of the genus Arcobacter. The 3,192,235 bp genome with its 3,154 protein-coding and 70 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304714

Pati, Amrita; Gronow, Sabine; Lapidus, Alla; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Chertkov, Olga; Bruce, David; Tapia, Roxanne; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Detter, John C.; Rohde, Manfred; Göker, Markus; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

2010-01-01

390

Complete genome sequence of Arthrobacter sp. strain FB24  

SciTech Connect

Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

2013-09-30

391

Draft genome sequence of the rubber tree Hevea brasiliensis  

PubMed Central

Background Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever increasing need for natural rubber. PMID:23375136

2013-01-01

392

Complete genome sequence of Desulfohalobium retbaense type strain (HR(100)).  

PubMed

Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H2 and a limited range of organic substrates, which are incompletely oxidized to acetate and CO2, for growth. The type strain HR(100) (T) was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304676

Spring, Stefan; Nolan, Matt; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Land, Miriam; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Munk, Christine; Kiss, Hajnalka; Chain, Patrick; Han, Cliff; Brettin, Thomas; Detter, John C; Schüler, Esther; Göker, Markus; Rohde, Manfred; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

2010-01-01

393

Complete genome sequence of a virulent isolate of Streptococcus pneumoniae.  

PubMed

The 2,160,837-base pair genome sequence of an isolate of Streptococcus pneumoniae, a Gram-positive pathogen that causes pneumonia, bacteremia, meningitis, and otitis media, contains 2236 predicted coding regions; of these, 1440 (64%) were assigned a biological role. Approximately 5% of the genome is composed of insertion sequences that may contribute to genome rearrangements through uptake of foreign DNA. Extracellular enzyme systems for the metabolism of polysaccharides and hexosamines provide a substantial source of carbon and nitrogen for S. pneumoniae and also damage host tissues and facilitate colonization. A motif identified within the signal peptide of proteins is potentially involved in targeting these proteins to the cell surface of low-guanine/cytosine (GC) Gram-positive species. Several surface-exposed proteins that may serve as potential vaccine candidates were identified. Comparative genome hybridization with DNA arrays revealed strain differences in S. pneumoniae that could contribute to differences in virulence and antigenicity. PMID:11463916

Tettelin, H; Nelson, K E; Paulsen, I T; Eisen, J A; Read, T D; Peterson, S; Heidelberg, J; DeBoy, R T; Haft, D H; Dodson, R J; Durkin, A S; Gwinn, M; Kolonay, J F; Nelson, W C; Peterson, J D; Umayam, L A; White, O; Salzberg, S L; Lewis, M R; Radune, D; Holtzapple, E; Khouri, H; Wolf, A M; Utterback, T R; Hansen, C L; McDonald, L A; Feldblyum, T V; Angiuoli, S; Dickinson, T; Hickey, E K; Holt, I E; Loftus, B J; Yang, F; Smith, H O; Venter, J C; Dougherty, B A; Morrison, D A; Hollingshead, S K; Fraser, C M

2001-07-20

394

Sequencing and analysis of an Irish human genome  

PubMed Central

Background Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge. PMID:20822512

2010-01-01

395

Whole-genome sequencing and identification of Morganella morganii KT pathogenicity-related genes  

PubMed Central

Background The opportunistic enterobacterium, Morganella morganii, which can cause bacteraemia, is the ninth most prevalent cause of clinical infections in patients at Changhua Christian Hospital, Taiwan. The KT strain of M. morganii was isolated during postoperative care of a cancer patient with a gallbladder stone who developed sepsis caused by bacteraemia. M. morganii is sometimes encountered in nosocomial settings and has been causally linked to catheter-associated bacteriuria, complex infections of the urinary and/or hepatobiliary tracts, wound infection, and septicaemia. M. morganii infection is associated with a high mortality rate, although most patients respond well to appropriate antibiotic therapy. To obtain insights into the genome biology of M. morganii and the mechanisms underlying its pathogenicity, we used Illumina technology to sequence the genome of the KT strain and compared its sequence with the genome sequences of related bacteria. Results The 3,826,919-bp sequence contained in 58 contigs has a GC content of 51.15% and includes 3,565 protein-coding sequences, 72 tRNA genes, and 10 rRNA genes. The pathogenicity-related genes encode determinants of drug resistance, fimbrial adhesins, an IgA protease, haemolysins, ureases, and insecticidal and apoptotic toxins as well as proteins found in flagellae, the iron acquisition system, a type-3 secretion system (T3SS), and several two-component systems. Comparison with 14 genome sequences from other members of Enterobacteriaceae revealed different degrees of similarity to several systems found in M. morganii. The most striking similarities were found in the IS4 family of transposases, insecticidal toxins, T3SS components, and proteins required for ethanolamine use (eut operon) and cobalamin (vitamin B12) biosynthesis. The eut operon and the gene cluster for cobalamin biosynthesis are not present in the other Proteeae genomes analysed. Moreover, organisation of the 19 genes of the eut operon differs from that found in the other non-Proteeae enterobacterial genomes. Conclusions This is the first genome sequence of M. morganii, which is a clinically relevant pathogen. Comparative genome analysis revealed several pathogenicity-related genes and novel genes not found in the genomes of other members of Proteeae. Thus, the genome sequence of M. morganii provides important information concerning virulence and determinants of fitness in this pathogen. PMID:23282187

2012-01-01

396

Sequence-Tagged Connectors: A Sequence Approach to Mapping and Scanning the Human Genome  

Microsoft Academic Search

The sequence-tagged connector (STC) strategy proposes to generate sequence tags densely scattered (every 3.3 kilobases) across the human genome by arraying 450,000 bacterial artificial chromosomes (BACs) with randomly cleaved inserts, sequencing both ends of each, and preparing a restriction enzyme fingerprint of each. The STC resource, containing end sequences, fingerprints, and arrayed BACs, creates a map where the interrelationships of
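The quoted spacing of one tag every 3.3 kilobases follows from simple arithmetic, sketched below. The clone count and the two ends per clone come from the record; the genome size of roughly 3 billion base pairs is an assumption made here for illustration, as are the function and parameter names.

```python
def average_tag_spacing(bac_clones, ends_per_clone, genome_size_bp):
    """Average distance in bp between end-sequence tags spread over a genome."""
    total_tags = bac_clones * ends_per_clone
    return genome_size_bp / total_tags

if __name__ == "__main__":
    # 450,000 BACs sequenced at both ends; genome size is an assumed ~3 Gb.
    spacing_bp = average_tag_spacing(450_000, 2, 3_000_000_000)
    print(f"{spacing_bp / 1000:.1f} kb between tags on average")
```

Under that assumed genome size, 900,000 end sequences give an average spacing close to the 3.3 kilobases stated above.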

Gregory G. Mahairas; James C. Wallace; Kim Smith; Steven Swartzell; Ted Holzman; Andrew Keller; Ron Shaker; Jeff Furlong; Janet Young; Shaying Zhao; Mark D. Adams; Leroy Hood

1999-01-01

397

Contribution to Sequencing of the Deinococcus radiodurans Genome  

SciTech Connect

The stated goal of this project was to supply The Institute for Genomic Research (TIGR) with pure DNA from the bacterium Deinococcus radiodurans R1 for purposes of complete genomic sequencing by TIGR. We subsequently decided to expand this project to include a second goal: the development of a NotI chromosomal map of D. radiodurans R1 using Pulsed Field Gel Electrophoresis (PFGE).

Minton, K.W.

1999-03-11

398

Genome Sequence of Mycoplasma columbinum Strain SF7  

PubMed Central

Mycoplasma columbinum is a member of the nonglycolytic Mycoplasma species, which can hydrolyze arginine. A growing body of research has revealed that M. columbinum is associated with respiratory disease of pigeons and that the respiratory disease symptoms could be eliminated via the use of mycoplasma treatment medicine. Here we report the genome sequence of M. columbinum strain SF7, which is the first genome report for M. columbinum. PMID:23599295

Guo, Zisheng; Xu, Xiaolong; Zheng, Qian; Li, Tingting; Kuang, Shichang; Zhang, Zongde; Chen, Yushan; Lu, Xidong; Zhou, Rui; Jin, Hui

2013-01-01

399

Ancient human genome sequence of an extinct Palaeo-Eskimo  

E-print Network

diversity and composition directly. To access such data, ancient genomic sequencing is needed. Presently no genome from an ancient human has been published, the closest being two data sets representing a few megabases (Mb) of DNA from a single Neanderthal9..., Denmark. 8Departments of Integrative Biology and Statistics, UC-Berkeley, 4098 VLSB, Berkeley, California 94720, USA. 9Research Laboratory for Archaeology and the History of Art, Dyson Perrins Building, South Parks Road, Oxford OX1 3QY, UK. 10Department...

Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus; Pedersen, Jakob Skou; Albrechtsen, Anders; Moltke, Ida; Metspalu, Mait; Metspalu, Ene; Kivisild, Toomas; Gupta, Ramneek; Bertalan, Marcelo; Nielsen, Kasper; Gilbert, M. Thomas P.; Wang, Yong; Raghavan, Maanasa; Campos, Paula F.; Kamp, Hanne Munkholm; Wilson, Andrew S.; Gledhill, Andrew; Tridico, Silvana; Bunce, Michael; Lorenzen, Eline D.; Binladen, Jonas; Guo, Xiaosen; Zhao, Jing; Zhang, Xiuqing; Zhang, Hao; Li, Zhuo; Chen, Minfeng; Orlando, Ludovic; Kristiansen, Karsten; Bak, Mads; Tommerup, Niels; Bendixen, Christian; Pierre, Tracey L.; Gronnow, Bjarne; Meldgaard, Morten; Andreasen, Claus; Fedorova, Sardana A.; Osipova, Ludmila P.; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Hansen, Thomas v. O.; Nielsen, Finn C.; Crawford, Michael H.; Brunak, Soren; Sicheritz-Ponten, Thomas; Villems, Richard; Nielsen, Rasmus; Krogh, Anders; Wang, Jun; Willerslev, Eske

2010-02-11

400

Computational Approaches for Predicting Causal Missense Mutations in Cancer Genome Projects  

Microsoft Academic Search

A central focus of cancer genetics is the study of mutations that are causally implicated in tumorigenesis. Although missense variants are commonly identified in genomic sequence, only a small fraction directly contributes to oncogenesis. The ability to distinguish those somatic missense changes that contribute to cancer progression from those that do not is a difficult problem usually accomplished

Zemin Zhang; Lawrence S. Hon; Joshua S. Kaminker

2008-01-01

401

Nanopore Sequencing of the phi X 174 genome  

E-print Network

Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost, and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length that can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. All methods and data are made fully available.
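The quadromer idea can be illustrated with a short sketch: there are 4^4 = 256 possible four-nucleotide combinations, and a sequence is read as a series of overlapping four-base windows, each looked up in a map of current levels. The function names are hypothetical, and the map used in the demonstration holds placeholder index values, not measured ion currents.

```python
from itertools import product

def all_quadromers():
    """Every four-nucleotide combination over A, C, G, T (4**4 = 256)."""
    return ["".join(bases) for bases in product("ACGT", repeat=4)]

def quadromers_in(sequence, k=4):
    """Overlapping k-base windows of sequence, advancing one base at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def predict_levels(sequence, quadromer_map):
    """Look up one level per quadromer as the strand advances by one base."""
    return [quadromer_map[q] for q in quadromers_in(sequence)]

if __name__ == "__main__":
    print(len(all_quadromers()))  # 256 combinations
    # Placeholder map: each quadromer's list index stands in for a
    # measured current level. These are NOT real measurements.
    placeholder_map = {q: i for i, q in enumerate(all_quadromers())}
    print(quadromers_in("ACGTA"))
    print(predict_levels("ACGTA", placeholder_map))
```

With a measured map in place of the placeholder, the same lookup would yield a predicted sequence of current levels for any unmeasured DNA sequence.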

Laszlo, Andrew H; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

2014-01-01

402

Genome sequence of the model medicinal mushroom Ganoderma lucidum  

PubMed Central

Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine that creates a diverse set of bioactive compounds. Here we report its 43.3-Mb genome, encoding 16,113 predicted genes, obtained using next-generation sequencing and optical mapping approaches. The sequence analysis reveals an impressive array of genes encoding cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary metabolism. The genome also encodes one of the richest sets of wood degradation enzymes among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism a potential model system for the study of secondary metabolic pathways and their regulation in medicinal fungi. PMID:22735441

Chen, Shilin; Xu, Jiang; Liu, Chang; Zhu, Yingjie; Nelson, David R.; Zhou, Shiguo; Li, Chunfang; Wang, Lizhi; Guo, Xu; Sun, Yongzhen; Luo, Hongmei; Li, Ying; Song, Jingyuan; Henrissat, Bernard; Levasseur, Anthony; Qian, Jun; Li, Jianqin; Luo, Xiang; Shi, Linchun; He, Liu; Xiang, Li; Xu, Xiaolan; Niu, Yunyun; Li, Qiushi; Han, Mira V.; Yan, Haixia; Zhang, Jin; Chen, Haimei; Lv, Aiping; Wang, Zhen; Liu, Mingzhu; Schwartz, David C.; Sun, Chao

2012-01-01

403

Exploring genome characteristics and sequence quality without a reference  

PubMed Central

Motivation: The de novo assembly of large, complex genomes is a significant challenge with currently available DNA sequencing technology. While many de novo assembly software packages are available, comparatively little attention has been paid to assisting the user with the assembly. Results: This article addresses the practical aspects of de novo assembly by introducing new ways to perform quality assessment on a collection of sequence reads. The software implementation calculates per-base error rates, paired-end fragment-size distributions and coverage metrics in the absence of a reference genome. Additionally, the software will estimate characteristics of the sequenced genome, such as repeat content and heterozygosity that are key determinants of assembly difficulty. Availability: The software described is freely available online (https://github.com/jts/sga) and open source under the GNU Public License. Contact: jared.simpson@oicr.on.ca Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:24443382

2014-01-01

404

Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes  

PubMed Central

The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary time. In the laboratory and in nature, numts enter the nuclear DNA via non-homologous end joining (NHEJ) at double-strand breaks (DSBs). The frequency of numt insertions among 85 sequenced eukaryotic genomes reveals that numt content is strongly correlated with genome size, suggesting that the numt insertion rate might be limited by DSB frequency. Polymorphic numts in humans link maternally inherited mitochondrial genotypes to nuclear DNA haplotypes during the past, offering new opportunities to associate nuclear markers with mitochondrial markers back in time. PMID:20168995

Hazkani-Covo, Einat; Zeller, Raymond M.; Martin, William

2010-01-01

405

Complete Genome Sequence of Rickettsia typhi and Comparison with Sequences of Other Rickettsiae  

Microsoft Academic Search

Rickettsia typhi, the causative agent of murine typhus, is an obligate intracellular bacterium with a life cycle involving both vertebrate and invertebrate hosts. Here we present the complete genome sequence of R. typhi (1,111,496 bp) and compare it to the two published rickettsial genome sequences: R. prowazekii and R. conorii. We identified 877 genes in R. typhi encoding 3 rRNAs,

Michael P. McLeod; Xiang Qin; Sandor E. Karpathy; Jason Gioia; Sarah K. Highlander; George E. Fox; Thomas Z. McNeill; Huaiyang Jiang; Donna Muzny; Leni S. Jacob; Alicia C. Hawes; Erica Sodergren; Rachel Gill; Jennifer Hume; Maggie Morgan; Guangwei Fan; Anita G. Amin; Richard A. Gibbs; Chao Hong; Xue-jie Yu; David H. Walker; George M. Weinstock

406

Study reveals genomic similarities between breast and ovarian cancers  

Cancer.gov

A new study from The Cancer Genome Atlas captured a complete view of genomic alterations in breast cancer and classified them into four intrinsic subtypes, one of which shares many genetic features with high-grade serous ovarian cancer. Depicted are breast cancer cells with the HER2 protein, which can trigger cell growth responses, lit up in bright red. (Photo credit: NIST)

407

Supplementary Materials for A high coverage genome sequence from an archaic Denisovan individual  

E-print Network

Note 5: Sequencing and processing of 11 present-day human genomes. Supplementary Materials for A high coverage genome sequence from an archaic Denisovan individual. Note 4: Processing and mapping raw sequence data from Denisova

Reich, David

408

The genome sequence of the colonial chordate, Botryllus schlosseri  

PubMed Central

Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly is comprised of nearly 14,000 intron-containing predicted genes, and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001 PMID:23840927

Voskoboynik, Ayelet; Neff, Norma F; Sahoo, Debashis; Newman, Aaron M; Pushkarev, Dmitry; Koh, Winston; Passarelli, Benedetto; Fan, H Christina; Mantalas, Gary L; Palmeri, Karla J; Ishizuka, Katherine J; Gissi, Carmela; Griggio, Francesca; Ben-Shlomo, Rachel; Corey, Daniel M; Penland, Lolita; White, Richard A; Weissman, Irving L; Quake, Stephen R

2013-01-01

409

Complete genome sequence of Haliscomenobacter hydrossis type strain (OT)  

SciTech Connect

Haliscomenobacter hydrossis van Veen et al. 1973 is the type species of the genus Haliscomenobacter, which belongs to order 'Sphingobacteriales'. The species is of interest because of its isolated phylogenetic location in the tree of life, especially the so far genomically uncharted part of it, and because the organism grows in a thin, hardly visible hyaline sheath. Members of the species were isolated from fresh water of lakes and from ditch water. The genome of H. hydrossis is the first completed genome sequence reported from a member of the family 'Saprospiraceae'. The 8,771,651 bp long genome with its three plasmids of 92 kbp, 144 kbp and 164 kbp length contains 6,848 protein-coding and 60 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Daligault, Hajnalka E. [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Zeytun, Ahmet [Los Alamos National Laboratory (LANL)]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Verbarg, Susanne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]

2011-01-01

410

Standardized Metadata for Human Pathogen/Vector Genomic Sequences  

PubMed Central

High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. PMID:24936976

Dugan, Vivien G.; Emrich, Scott J.; Giraldo-Calderon, Gloria I.; Harb, Omar S.; Newman, Ruchi M.; Pickett, Brett E.; Schriml, Lynn M.; Stockwell, Timothy B.; Stoeckert, Christian J.; Sullivan, Dan E.; Singh, Indresh; Ward, Doyle V.; Yao, Alison; Zheng, Jie; Barrett, Tanya; Birren, Bruce; Brinkac, Lauren; Bruno, Vincent M.; Caler, Elizabet; Chapman, Sinead; Collins, Frank H.; Cuomo, Christina A.; Di Francesco, Valentina; Durkin, Scott; Eppinger, Mark; Feldgarden, Michael; Fraser, Claire; Fricke, W. Florian; Giovanni, Maria; Henn, Matthew R.; Hine, Erin; Hotopp, Julie Dunning; Karsch-Mizrachi, Ilene; Kissinger, Jessica C.; Lee, Eun Mi; Mathur, Punam; Mongodin, Emmanuel F.; Murphy, Cheryl I.; Myers, Garry; Neafsey, Daniel E.; Nelson, Karen E.; Nierman, William C.; Puzak, Julia; Rasko, David; Roos, David S.; Sadzewicz, Lisa; Silva, Joana C.; Sobral, Bruno; Squires, R. Burke; Stevens, Rick L.; Tallon, Luke; Tettelin, Herve; Wentworth, David; White, Owen; Will, Rebecca; Wortman, Jennifer; Zhang, Yun; Scheuermann, Richard H.

2014-01-01

411

Draft Genome Sequence of Bacillus endophyticus 2102  

PubMed Central

Bacillus endophyticus 2102 is an endospore-forming, plant growth-promoting rhizobacterium isolated from a hypersaline pond in South Korea. Here we present the draft sequence of B. endophyticus 2102, which is of interest because of its potential use in the industrial production of algaecides and bioplastics and for the treatment of industrial textile effluents. PMID:23012284

Lee, Yong-Jik; Lee, Sang-Jae; Kim, Sun Hong; Lee, Sang Jun; Kim, Byoung-Chan; Lee, Han-Seung

2012-01-01

412

A high coverage genome sequence from an archaic Denisovan individual  

PubMed Central

We present a DNA library preparation method that has allowed us to reconstruct a high coverage (30X) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of “missing evolution” in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans. PMID:22936568

Meyer, Matthias; Kircher, Martin; Gansauge, Marie-Theres; Li, Heng; Racimo, Fernando; Mallick, Swapan; Schraiber, Joshua G.; Jay, Flora; Prufer, Kay; de Filippo, Cesare; Sudmant, Peter H.; Alkan, Can; Fu, Qiaomei; Do, Ron; Rohland, Nadin; Tandon, Arti; Siebauer, Michael; Green, Richard E.; Bryc, Katarzyna; Briggs, Adrian W.; Stenzel, Udo; Dabney, Jesse; Shendure, Jay; Kitzman, Jacob; Hammer, Michael F.; Shunkov, Michael V.; Derevianko, Anatoli P.; Patterson, Nick; Andres, Aida M.; Eichler, Evan E.; Slatkin, Montgomery; Reich, David; Kelso, Janet; Paabo, Svante

2013-01-01

413

Mapping and sequencing of structural variation from eight human genomes  

PubMed Central

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale—particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation—a standard for genotyping platforms and a prelude to future individual genome sequencing projects. PMID:18451855

Kidd, Jeffrey M.; Cooper, Gregory M.; Donahue, William F.; Hayden, Hillary S.; Sampas, Nick; Graves, Tina; Hansen, Nancy; Teague, Brian; Alkan, Can; Antonacci, Francesca; Haugen, Eric; Zerr, Troy; Yamada, N. Alice; Tsang, Peter; Newman, Tera L.; Tüzün, Eray; Cheng, Ze; Ebling, Heather M.; Tusneem, Nadeem; David, Robert; Gillett, Will; Phelps, Karen A.; Weaver, Molly; Saranga, David; Brand, Adrianne; Tao, Wei; Gustafson, Erik; McKernan, Kevin; Chen, Lin; Malig, Maika; Smith, Joshua D.; Korn, Joshua M.; McCarroll, Steven A.; Altshuler, David A.; Peiffer, Daniel A.; Dorschner, Michael; Stamatoyannopoulos, John; Schwartz, David; Nickerson, Deborah A.; Mullikin, James C.; Wilson, Richard K.; Bruhn, Laurakay; Olson, Maynard V.; Kaul, Rajinder; Smith, Douglas R.; Eichler, Evan E.

2008-01-01