Sample records for genome sequencing centers

  1. The Genome Sequencing Center at NCGR

    SciTech Connect

    Schilkey, Faye [National Center for Genome Resources]

    2010-06-02

    Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  2. The Genome Database Organism-centered listing of available genomic sequence records and projects

    E-print Network

    Levin, Judith G.

    The Genome Database Organism-centered listing of available genomic sequence records and projects http://www.ncbi.nlm.nih.gov/genome National Center for Biotechnology Information · National Library | NCBI Genome | Last Update August 19, 2013 Contact: info@ncbi.nlm.nih.gov Scope Since 2011, the Genome

  3. Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students

    ERIC Educational Resources Information Center

    Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

    2005-01-01

    Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

  4. Operational streamlining in a high-throughput genome sequencing center

    E-print Network

    Person, Kerry P. (Kerry Patrick)

    2006-01-01

    Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

  5. Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing

    E-print Network

    Hemmers, Oliver

    Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing samples. We here at the Nevada Genomics Center feel that dnaTools is user friendly and fairly intuitive-784-1657) or email us (Genomics@unr.nevada.edu) and we will assist you. How to use dnaTools Table of Contents

  6. Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Crow, John [National Center for Genome Resources]

    2012-06-01

    John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  7. Porcine Genomic Sequencing Initiative

    Microsoft Academic Search

    Gary Rohrer; Jonathan E. Beever; Max F. Rothschild; Lawrence Schook; Richard Gibbs; George Weinstock; W. Gregory

    A. Specific biological rationales for the utility of the porcine sequence information Rationale and Objectives. Completion of the human genome sequence provides the starting point for understanding the genetic complexity of humans and how genetic variation contributes to diverse phenotypes and disease. It is clear that model organisms have played an invaluable role in the synthesis of this understanding. It

  8. Microbial genome sequencing

    Microsoft Academic Search

    Claire M. Fraser; Jonathan A. Eisen; Steven L. Salzberg

    2000-01-01

    Complete genome sequences of 30 microbial species have been determined during the past five years, and work in progress indicates that the complete sequences of more than 100 further microbial species will be available in the next two to four years. These results have revealed a tremendous amount of information on the physiology and evolution of microbial species, and should

  9. Genome Data Analysis Centers

    Cancer.gov

    The use of novel technologies, the need to integrate different data types and the immense quantity of data generated by The Cancer Genome Atlas (TCGA) Research Network have led to an expansion of the TCGA Research Network to include new centers devoted to data analysis. The Genome Data Analysis Centers (GDACs) work hand-in-hand with the Genome Characterization Centers (GCCs) to develop state-of-the-art tools that assist researchers with processing and integrating data analyses across the entire genome.

  10. Human Genome Center

    NSDL National Science Digital Library

    Human Genome Center At Lawrence Berkeley Lab (LBL), Berkeley, California: offering information about projects in Biology, Informatics and Instrumentation, photos of LBL robotic instruments, software, and online access to one LBL genomic database.

  11. Genomes and evolution From sequence to organism

    E-print Network

    Patel, Nipam H.

    Genomes and evolution From sequence to organism Editorial overview Evan E Eichler and Nipam H Patel, Center for Computational Genomics, Case Western Reserve University School of Medicine and University research is to understand the evolution, pathology and mechanisms of recent genome duplication in human

  12. Wheat and Barley Genome Sequencing

    Microsoft Academic Search

    Kellye Eversole; Andreas Graner; Nils Stein

    A high quality reference genome sequence is a prerequisite resource for accessing any gene, driving genomics-based approaches to systems biology, and for efficient exploitation of natural and induced genetic diversity of an organism. Wheat and barley possess genomes of a size that was long presumed to be not amenable for whole genome sequencing. So far, only limited genomic sequencing of

  13. The Genome Center at Washington University

    SciTech Connect

    Fulton, Bob [Washington University]

    2010-06-02

    Bob Fulton of Washington University discusses the sequencing platforms in use at this large scale genome center on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  14. Operations capability improvement of a molecular biology laboratory in a high throughput genome sequencing center

    E-print Network

    Vokoun, Matthew R. (Matthew Richard)

    2005-01-01

    The Broad Institute is a research collaboration of MIT, Harvard University and affiliated hospitals, and the Whitehead Institute for Biomedical Research. Its scientific mission is to "(1) create tools for genomic medicine ...

  15. Pig genome sequence - analysis and publication strategy

    Microsoft Academic Search

    Alan L Archibald; Lars Bolund; Carol Churcher; Merete Fredholm; Martien AM Groenen; Barbara Harlizius; Kyung-Tai Lee; Denis Milan; Jane Rogers; Max F Rothschild; Hirohide Uenishi; Jun Wang; Lawrence B Schook

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through

  16. Using the Potato Genome Sequence! Robin Buell!

    E-print Network

    Douches, David S.

    Using the Potato Genome Sequence! Robin Buell! Michigan State University! Department of Plant Biology! August 15, 2010! buell@msu.edu! 1 Whole Genome Shotgun Sequencing 2 New genomics & post-genomic biology genomes genera 2002 2010 3 So, you say you can sequence-Now what

  17. The Genome Sequence DataBase (GSDB): meeting the challenge of genomic sequencing

    Microsoft Academic Search

    Gifford Keen; Jillian Burton; David Crowley; Emily Dickinson; Ada Espinosa-lujan; Ed Franks; Carol Harger; Mo Manning; Shelley March; Mia Mcleod; John O'neill; Alicia Power; Maria Pumilia; Rhonda Reinert; David Rider; John Rohrlich; Jolene Schwertfeger; Linda Smyth; Nina Thayer; Charles Troup; Chris A. Fields

    1996-01-01

    The genome sequence database (GSDB) is a complete, publicly available relational database of DNA sequences and annotation maintained by the National Center for Genome Resources (NCGR) under a Cooperative Agreement with the US Department of Energy (DOE). GSDB provides direct, client-server access to the database for data contributions, community annotation and SQL queries. The GSDB Annotator, a

  18. Integrating sequence, evolution and functional genomics in regulatory genomics

    PubMed Central

    Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

    2009-01-01

    With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

  19. The Center for integrative genomics

    E-print Network

    Kaessmann, Henrik

    The Center for integrative genomics Report 2005-2006 Presentation Director's message 4 Scientific advisory committee 6 Organigram of the CIG 7 research The structure and function of genomes and their evolution Alexandre Reymond - Genome structure and expression 10 Henrik Kaessmann - Evolutionary genomics 12

  20. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

    2000-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

  1. Whole genome sequencing in pharmacogenomics.

    PubMed

    Katsila, Theodora; Patrinos, George P

    2015-01-01

    Pharmacogenomics aims to shed light on the role of genes and genomic variants in clinical treatment response. Although several drug-gene relationships have been characterized to date, many challenges still remain toward the application of pharmacogenomics in the clinic; clinical guidelines for pharmacogenomic testing are still in their infancy, whereas the emerging high throughput genotyping technologies produce a tsunami of new findings. Herein, the potential of whole genome sequencing in pharmacogenomics research and clinical application is highlighted. PMID:25859217

  2. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

  3. Sequencing Complex Genomic Regions

    SciTech Connect

    Eichler, Evan [University of Washington]

    2009-05-28

    Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

  4. Improved Yield and Diverse Finished Bacterial Genomes using Pacific Biosciences RS II SMRT Sequencing

    E-print Network

    Weber, David J.

    genome sequencing in our center. Further, using comparative Illumina sequencing, we found a median of one Evaluation As one measure of genome consensus sequence quality, we used Illumina MiSeq 250bp PE data to align to complete genomes sequenced using PacBio data alone and assembled using one of three genome assemblers. We

  5. Genome Sequence of Salmonella Phage χ

    PubMed Central

    Ko, Ching-Chung; Jacobs-Sera, Deborah; Hatfull, Graham F.; Erhardt, Marc; Hughes, Kelly T.; Casjens, Sherwood R.

    2015-01-01

    Salmonella bacteriophage χ is a member of the Siphoviridae family that gains entry into its host cells by adsorbing to their flagella. We report the complete 59,578-bp sequence of the genome of phage χ, which together with its relatives, exemplifies a largely unexplored type of tailed bacteriophage. PMID:25720684

  6. Genome Sequence of Mycobacteriophage Momo

    PubMed Central

    Bina, Elizabeth A.; Brahme, Indraneel S.; Hill, Amy B.; Himmelstein, Philip H.; Hunsicker, Sara M.; Ish, Amanda R.; Le, Tinh S.; Martin, Mary M.; Moscinski, Catherine N.; Shetty, Sameer A.; Swierzewski, Tomasz; Iyengar, Varun B.; Kim, Hannah; Schafer, Claire E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Momo is a newly discovered phage of Mycobacterium smegmatis mc2155. Momo has a double-stranded DNA genome 154,553 bp in length, with 233 predicted protein-encoding genes, 34 tRNA genes, and one transfer-messenger RNA (tmRNA) gene. Momo has a myoviral morphology and shares extensive nucleotide sequence similarity with subcluster C1 mycobacteriophages.

  7. Genome Sequence of Mycobacteriophage Phayonce

    PubMed Central

    Jacobetz, Emily; Johnson, Courtney A.; Kihle, Brooke L.; Sobeski, Margaret A.; Werner, Madison B.; Adkins, Nancy L.; Kramer, Zachary J.; Montgomery, Matthew T.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophage Phayonce is a newly isolated phage recovered from a soil sample in Pittsburgh, PA, using Mycobacterium smegmatis mc2155 as a host. Phayonce’s genome is 49,203 bp long and contains 77 protein-coding genes, 23 of them having predicted functions. Phayonce shares a strong similarity in nucleotide sequence with phages of cluster P.

  8. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission.

    PubMed

    Giongo, Adriana; Tyler, Heather L; Zipperer, Ursula N; Triplett, Eric W

    2010-01-01

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures. PMID:21304715

  9. GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT

    E-print Network

    Wurtele, Eve Syrkin

    GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR ON-CAMPUS USERS Please fill out completely, and email, fax or mail to: Genomic Technologies Facility Manager 2025 Roy J. Carver Co-Laboratory Center for Plant Genomics Iowa State University Ames, Iowa 50011-3650 515

  10. GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT

    E-print Network

    Wurtele, Eve Syrkin

    GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR OFF-CAMPUS USERS Please fill out completely, and email, fax or mail to: Genomic Technologies Facility Manager 2025 Roy J. Carver Co-Laboratory Center for Plant Genomics Iowa State University Ames, Iowa 50011-3650 515

  11. Genome Sequences of Eight Morphologically Diverse Alphaproteobacteria

    PubMed Central

    Brown, Pamela J. B.; Kysela, David T.; Buechlein, Aaron; Hemmerich, Chris; Brun, Yves V.

    2011-01-01

    The Alphaproteobacteria comprise morphologically diverse bacteria, including many species of stalked bacteria. Here we announce the genome sequences of eight alphaproteobacteria, including the first genome sequences of species belonging to the genera Asticcacaulis, Hirschia, Hyphomicrobium, and Rhodomicrobium. PMID:21705585

  12. Almost finished: the complete genome sequence of Mycosphaerella graminicola

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

  13. The Sequence of the Human Genome

    Microsoft Academic Search

    J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

    2001-01-01

    A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome
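
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the usual definition of fold coverage (total sequenced bases divided by genome length); the variable names are illustrative and do not come from the paper.

```python
# Figures taken from the abstract above; variable names are illustrative.
total_bases = 14.8e9        # bp of sequence generated
num_reads = 27_271_853      # high-quality sequence reads
consensus_length = 2.91e9   # bp in the euchromatic consensus sequence

# Mean read length = total bases / number of reads.
mean_read_length = total_bases / num_reads

# Fold coverage under the assumed definition: total bases / genome length.
fold_coverage = total_bases / consensus_length

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"approximate coverage: {fold_coverage:.2f}x")
```

    The computed coverage lands close to, but not exactly on, the 5.11-fold figure quoted above, which is expected if the abstract rounds its inputs or uses a slightly different genome length.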

  14. Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project

    E-print Network

    Brendel, Volker

    Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project Vicki L. Chandler Genome Sequencing Project. The momentum for this endeavor has been building within the maize (Zea mays and human genomes (Gregory et al., 2002). Our current picture of the maize genome is largely derived from

  15. Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant,

    E-print Network

    Purugganan, Michael D.

    COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant, Arabidopsis thaliana, was published ~6 years ago (Arabidopsis Genome Initiative, 2000). Since Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog

  16. Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project

    Microsoft Academic Search

    Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

    1991-01-01

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

  17. Genome sequencing and functional genomics approaches in tomato

    Microsoft Academic Search

    Daisuke Shibata

    2005-01-01

    Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

  18. Genomic Sequence Analysis Using Gap Sequences and Pattern Filtering

    Microsoft Academic Search

    Shih-chieh Su; Chia H. Yeh; C.-C. Jay Kuo

    2003-01-01

    A new pattern filtering technique is developed to analyze the genomic sequence in this research based on gap sequences, in which the distance of the same symbol is recorded consecutively as a sequence of integers. Sequence alignment and similarity testing can be performed on a family of gap sequences over selected patterns. The gap sequence offers a new
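
    A gap sequence as described above can be illustrated in a few lines. This is a minimal sketch that assumes the "distance" is the difference between the positions of consecutive occurrences of a single symbol; the function name and the example string are hypothetical, and the paper's handling of longer patterns is not reproduced here.

```python
def gap_sequence(sequence: str, symbol: str) -> list[int]:
    """Return distances between consecutive occurrences of `symbol`."""
    # Positions at which the chosen symbol occurs.
    positions = [i for i, base in enumerate(sequence) if base == symbol]
    # Difference between each occurrence and the one before it.
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


dna = "ACGATTACA"  # made-up example sequence
gaps_a = gap_sequence(dna, "A")
print(gaps_a)
```

    Each symbol of the alphabet yields its own integer sequence, so a DNA string gives rise to a small family of gap sequences that can then be compared.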

  19. Fuzzy Genome Sequence Assembly for Single and Environmental Genomes

    E-print Network

    Nicolescu, Monica

    Fuzzy Genome Sequence Assembly for Single and Environmental Genomes Sara Nasser, Adrienne Breland. Traditional methods obtain a microorganism's DNA by culturing it individually. Recent advances in genomics microbial communities are often very complex with tens and hundreds of species. Assembling these genomes

  20. MIPS: a database for genomes and protein sequences

    Microsoft Academic Search

    Hans-werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

    2002-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

  1. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley [ORNL]; Brown, Steven D [ORNL]; Podar, Mircea [ORNL]; Palumbo, Anthony Vito [ORNL]; Elias, Dwayne A [ORNL]

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  2. Sequencing and analysis of bacterial genomes

    Microsoft Academic Search

    Eugene V. Koonin; Arcady R. Mushegian; Kenneth E. Rudd

    1996-01-01

    The complete sequences of two small bacterial genomes have recently become available, and those of several more species should follow within the next two years. Sequence comparisons show that most bacterial proteins are highly conserved in evolution, allowing predictions to be made about the functions of most products of an uncharacterized genome. Bacterial genomes differ vastly in their gene

  3. The Human Genome Project: Sequencing the Future

    E-print Network

    The Human Genome Project: Sequencing the Future In 1986, the U.S. Department of Energy (DOE and unilateral step by announcing its Human Genome Initiative--forerunner of the Human Genome Project critical areas, including those important to DOE missions. The Human Genome Project and DOE's complementary

  4. Sequencing a Genome by Walking With Clone-end Sequences

    E-print Network

    Batzoglou, Serafim

    genome is (i) to sequence a collection of non-overlapping 'seeds' chosen from a genomic library of large of seed clones and the depth of the genomic library used for walking, affect the cost and time, Massachusetts Institute of Technology, Cambridge MA 02139. * To whom correspondence should be addressed.

  5. Progress in Arabidopsis genome sequencing and functional genomics

    Microsoft Academic Search

    R. Wambutt; G. Murphy; G. Volckaert; T. Pohl; A Düsterhöft; W Stiekema; K.-D Entian; N Terryn; B Harris; W Ansorge; P Brandt; L Grivell; M Rieger; M Weichselgartner; V de Simone; B Obermaier; R Mache; M Müller; M Kreis; M Delseny; P Puigdomenech; M Watson; T Schmidtheini; B Reichert; D Portatelle; M Perez-Alonso; M Boutry; I Bancroft; P Vos; J Hoheisel; W Zimmermann; H Wedler; P Ridley; S.-A Langham; B McCullagh; L Bilham; J Robben; J Van der Schueren; B Grymonprez; Y.-J Chuang; F Vandenbussche; M Braeken; I Weltjens; M Voet; I Bastiaens; R Aert; E Defoor; T Weitzenegger; G Bothe; U Ramsperger; H Hilbert; M Braun; E Holzer; A Brandt; S Peters; M van Staveren; W Dirkse; P Mooijman; R Klein Lankhorst; M Rose; J Hauf; P Kötter; S Berneiser; S Hempel; M Feldpausch; S Lamberth; H Van den Daele; A De Keyser; C Buysshaert; J Gielen; R Villarroel; R De Clercq; M Van Montagu; J Rogers; A Cronin; M Quail; S Bray-Allen; L Clark; J Doggett; S Hall; M Kay; N Lennard; K McLay; R Mayes; A Pettett; M.-A Rajandream; M Lyne; V Benes; S Rechmann; D Borkova; H Blöcker; M Scharfe; M Grimm; T.-H Löhnert; S Dose; M de Haan; A Maarse; M Schäfer; S Müller-Auer; C Gabel; M Fuchs; B Fartmann; K Granderath; D Dauner; A Herzl; S Neumann; A Argiriou; D Vitale; R Liguori; E Piravandi; O Massenet; F Quigley; G Clabauld; A Mündlein; R Felber; S Schnabl; R Hiller; W Schmidt; A Lecharny; S Aubourg; I Gy; R Cooke; C Berger; A Monfort; E Casacuberta; T Gibbons; N Weber; M Vandenbol; M Bargues; J Terol; A Torres; A Perez-Perez; B Purnelle; E Bent; S Johnson; D Tacon; T Jesse; L Heijnen; S Schwarz; P Scholler; S Heber; C Bielke; D Frishmann; D Haase; K Lemcke; H. W Mewes; S Stocker; P Zaccaria; K Mayer; C Schüller; M Bevan

    2000-01-01

    Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25,000 genes spaced on average every 4.5 kb. Between 10 and 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant

  6. Expressed sequence tags: alternative or complement to whole genome sequences?

    Microsoft Academic Search

    Stephen Rudd

    2003-01-01

    Over three million sequences from approximately 200 plant species have been deposited in the publicly available plant expressed sequence tag (EST) sequence databases. Many of the ESTs have been sequenced as an alternative to complete genome sequencing or as a substrate for cDNA array-based expression analyses. This creates a formidable resource from both biodiversity and gene-discovery standpoints. Bioinformatics-based sequence analysis

  7. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    PubMed Central

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences. PMID:22247527

  8. BSMAP: whole genome bisulfite sequence MAPping program

    Microsoft Academic Search

    Yuanxin Xi; Wei Li

    2009-01-01

    BACKGROUND: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the
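The conversion rule stated in this abstract (unmethylated cytosines read as thymine after bisulfite treatment and PCR, methylated cytosines stay cytosine) can be illustrated with a short sketch. This is not BSMAP's mapping algorithm; the function names `bisulfite_convert` and `call_methylation` are hypothetical, and the sketch assumes a forward-strand read that is already aligned without gaps.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment plus PCR on the forward strand:
    an unmethylated C is read as T, a methylated C stays C."""
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylation(reference, read, offset=0):
    """For every reference C covered by an ungapped read, report
    True (read shows C, methylated) or False (read shows T, unmethylated)."""
    calls = {}
    for i, base in enumerate(read):
        if reference[offset + i] == "C" and base in ("C", "T"):
            calls[offset + i] = (base == "C")
    return calls


reference = "ACGTCCGA"
# Hypothetical example: only the C at index 1 is methylated.
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

Because a T in the read may correspond to either T or C in the reference, the read no longer matches the reference exactly, which is one reason mapping such reads is harder than mapping ordinary reads.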

  9. BAC as tools for genome sequencing

    Microsoft Academic Search

    Hong-Bin Zhang; Chengcang Wu

    2001-01-01

    Genome sequencing represents the state-of-the-art technology for large-scale gene discovery, cloning and decoding. Bacteria-based large-insert clones, including bacterial artificial chromosome (BAC), bacteriophage P1-derived artificial chromosome (PAC) and large-insert conventional plasmid-based clone (PBC), are desirable resources and have offered numerous potentials for accelerated sequencing of large, complex genomes. They are not only capable of cloning large DNA fragments of complex genomes

  10. Comparison of 61 Sequenced Escherichia coli Genomes

    Microsoft Academic Search

    Oksana Lukjancenko; Trudy M. Wassenaar; David W. Ussery

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical

  11. Next generation sequencing of viral RNA genomes

    PubMed Central

    2013-01-01

    Background With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

  12. Human Genome Sequencing in Health and Disease

    PubMed Central

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  13. Genome sequence of the palaeopolyploid Jeremy Schmutz1,2

    E-print Network

    Bhattacharyya, Madan Kumar

    The soybean genome is the largest whole-genome shotgun-sequenced plant genome so far and compares favourably to all other high-quality draft whole-genome shotgun-sequenced plant genomes (Supplementary Table 4 ... Genome sequence of the palaeopolyploid soybean. Jeremy Schmutz, Steven B. Cannon

  14. Genomic sequencing of Pleistocene cave bears

    SciTech Connect

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  15. Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes

    PubMed Central

    Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

    2011-01-01

    Background Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807
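The abstract mentions "statistics of the assembly sequences" without naming them. As an illustration only, here is a sketch of N50, a widely used contig-length statistic; treating it as one of Plantagora's metrics is an assumption, and the function name `n50` is hypothetical.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 requires at least one contig")


# Hypothetical contig lengths from two assemblies of the same reads.
assembly_a = [8, 8, 4, 3, 3, 2, 2, 2]
assembly_b = [5, 4, 3, 2, 1]
```

A higher N50 means longer contigs carry most of the assembly, but it says nothing about correctness, which is why the abstract pairs such statistics with fidelity measures obtained by aligning assemblies back to the source genome.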

  16. Reconstruction of Ancestral Genomic Sequences Using Likelihood

    Microsoft Academic Search

    Isaac Elias; Tamir Tuller

    2007-01-01

    A challenging task in computational biology is the reconstruction of genomic sequences of extinct ancestors, given the phylogenetic tree and the sequences at the leafs. This task is best solved by calculating the most likely estimate of the ancestral sequences, along with the most likely edge lengths. We deal with this problem and also the variant in which the phylogenetic

  17. Solvable Sequence Evolution Models and Genomic Correlations

    NASA Astrophysics Data System (ADS)

    Messer, Philipp W.; Arndt, Peter F.; Lässig, Michael

    2005-04-01

    We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions, and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing; i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.
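The four elementary processes named in this abstract can be sketched as a toy simulation. This is not the authors' analytical model: the tandem placement of duplicated segments, the uniform segment lengths, the `rates` weights, and the function name `evolve` are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def evolve(seq, steps, rates, max_segment=5, rng=None):
    """Apply `steps` random events to `seq`.

    `rates` maps event names ('mutation', 'duplication', 'deletion',
    'insertion') to relative weights; one event is drawn per step.
    """
    rng = rng or random.Random()
    seq = list(seq)
    kinds = list(rates)
    weights = [rates[k] for k in kinds]
    for _ in range(steps):
        kind = rng.choices(kinds, weights=weights)[0]
        seg = rng.randint(1, max_segment)  # segment length for this event
        if kind == "insertion":
            pos = rng.randint(0, len(seq))
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(seg)]
        elif kind not in ("mutation", "duplication", "deletion"):
            raise ValueError(f"unknown event: {kind}")
        elif not seq:
            continue  # the remaining events need existing sequence
        elif kind == "mutation":
            seq[rng.randrange(len(seq))] = rng.choice(ALPHABET)
        elif kind == "duplication":
            start = rng.randrange(len(seq))
            copy = seq[start:start + seg]
            end = start + len(copy)
            seq[end:end] = copy  # assumed: copy placed right after the original
        else:  # deletion
            start = rng.randrange(len(seq))
            del seq[start:start + seg]
    return "".join(seq)


start = "ACGT" * 5
growing = evolve(start, 200, {"mutation": 1, "duplication": 2, "insertion": 1, "deletion": 1},
                 rng=random.Random(0))
```

In this sketch the sequence tends to grow when the combined duplication and insertion weights exceed the deletion weight, which is the regime in which the abstract reports long-range correlations.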

  18. Draft Genome Sequences of the Onion Center Rot Pathogen Pantoea ananatis PA4 and Maize Brown Stalk Rot Pathogen P. ananatis BD442

    PubMed Central

    Weller-Stuart, Tania; Chan, Wai Yin; Venter, Stephanus N.; Smits, Theo H. M.; Duffy, Brion; Goszczynska, Teresa; Cowan, Don A.; de Maayer, Pieter

    2014-01-01

    Pantoea ananatis is an emerging phytopathogen that infects a broad spectrum of plant hosts. Here, we present the genomes of two South African isolates, P. ananatis PA4, which causes center rot of onion, and BD442, isolated from brown stalk rot of maize. PMID:25103759

  19. Center for Eukaryotic Structural Genomics

    NSDL National Science Digital Library

    A collaboration between the Department of Biochemistry at the University of Wisconsin-Madison, the Medical College of Wisconsin, Molecular Kinetics, Inc., and Hebrew University, the Center for Eukaryotic Structural Genomics (CESG) intends to "develop critical technologies for determining three-dimensional structures of proteins rapidly and economically." The site gives an overview of CESG, including the goals and mission of the center, biographies of people involved, and the methodology and results of the program. The results section is the most substantial part of the site, giving information on how target proteins were selected, protocols and technology used, publications based on CESG research, and more.

  20. POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes

    E-print Network

    Rennes, Université de

    POSTDOCTORAL POSITION IN BIOINFORMATICS AND EVOLUTIONARY GENOMICS: Next generation sequencing and analysis of complex polyploid genomes The research group Genome Evolution and Speciation (Team) to work on the analysis of genome and transcriptome sequence data (generated using 454 Roche

  1. Genome sequence of Coxiella burnetii strain Namibia

    PubMed Central

    2014-01-01

    We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

  2. Pairwise Comparison Between Genomic Sequences and

    E-print Network

    Mohri, Mehryar

    similar translated genomic sequences using the stable-marriage algorithm (SM) as an alignment filter learned from him how to ask questions and express my ideas. He showed me different ways to approach

  3. First Complete Sequence of the Human Genome

    NSDL National Science Digital Library

    de Nie, Michael Willem.

    On April 6, Celera Genomics announced that it had completed the sequencing phase of one person's genome. It will now begin the process of assembling the sequenced fragments into their proper order with the aid of powerful computers. Work on this project began in September 1999 using a method called "whole genome shotgun sequencing," a quicker method than that used by the international Human Genome Project, which has completed about two-thirds of its own, more thorough, sequence of the human genome. Although talks between Celera and the Human Genome Project over the sharing of data broke down earlier this year, they have since resumed and the company has stated that it will cooperate. This is just the first step towards understanding the human genome, since it reveals only the order of the nucleotides, not what the genes do, but it is certainly an important milestone, with broad implications for biology and medicine. Users can begin with the company's press release and then read reports from the BBC, the New York Times (free registration required), CNN, National Public Radio's All Things Considered, and the Times of India. Additional related resources are available from the Human Genome Project site and Doubletwist.com.

  4. Genome sequence and analysis of Lactobacillus helveticus

    PubMed Central

    Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

    2013-01-01

    The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

  5. Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

    Microsoft Academic Search

    Aaron C. E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna

    2004-01-01

    As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments

  6. Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships

    PubMed Central

    2014-01-01

    Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to numerous hybridization and polyploidization, it is taxonomically and phylogenetically ranked as one of the most challengingly difficult taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest to provide robust evidence for taxonomic studies, species identification and understanding mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences of repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

  7. A Workshop Report on Wheat Genome Sequencing

    PubMed Central

    Gill, Bikram S.; Appels, Rudi; Botha-Oberholster, Anna-Maria; Buell, C. Robin; Bennetzen, Jeffrey L.; Chalhoub, Boulos; Chumley, Forrest; Dvořák, Jan; Iwanaga, Masaru; Keller, Beat; Li, Wanlong; McCombie, W. Richard; Ogihara, Yasunari; Quetier, Francis; Sasaki, Takuji

    2004-01-01

    Sponsored by the National Science Foundation and the U.S. Department of Agriculture, a wheat genome sequencing workshop was held November 10–11, 2003, in Washington, DC. It brought together 63 scientists of diverse research interests and institutions, including 45 from the United States and 18 from a dozen foreign countries (see list of participants at http://www.ksu.edu/igrow). The objectives of the workshop were to discuss the status of wheat genomics, obtain feedback from ongoing genome sequencing projects, and develop strategies for sequencing the wheat genome. The purpose of this report is to convey the information discussed at the workshop and provide the basis for an ongoing dialogue, bringing forth comments and suggestions from the genetics community. PMID:15514080

  8. Complete Genome Sequences of 63 Mycobacteriophages

    PubMed Central

    2013-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. The current collection of sequenced mycobacteriophages, all isolated on a single host strain, Mycobacterium smegmatis mc²155, reveals substantial genetic diversity. The complete genome sequences of 63 newly isolated mycobacteriophages expand the resolution of our understanding of phage diversity. PMID:24285655

  9. Draft Genome Sequence of Tombunodavirus UC1

    PubMed Central

    DeRisi, Joseph L.

    2015-01-01

    We report here the draft genome sequence of tombunodavirus UC1 assembled from metagenomic sequencing of organisms in San Francisco wastewater. This virus shares hallmarks of members of the Tombusviridae and the nodavirus-like Plasmopara halstedii and Sclerophthora macrospora viruses. PMID:26139709

  10. Genomic sequencing of single microbial cells from environmental samples.

    PubMed

    Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S

    2008-06-01

    Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification. Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing. PMID:18550420

  11. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  12. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    E-print Network

    2011-01-01

    plants have large and complex genomes with a great abundance of repeated sequences. Sequence composition, organization, and evolution of the core Triticeae genome. Plant

  13. Genome sequence and comparative analysis of the model rodent malaria

    E-print Network

    Salzberg, Steven

    Genome sequence and comparative analysis of the model rodent malaria parasite Plasmodium yoelii Medical Centre, PO Box 9600, 2300 RC Leiden, The Netherlands § Naval Medical Research Center, Malaria ... Species of malaria parasite that infect rodents have long been used as models for malaria disease research

  14. Using comparative genomics to reorder the human genome sequence into a virtual sheep genome

    Microsoft Academic Search

    Brian P Dalrymple; Ewen F Kirkness; Mikhail Nefedov; Sean McWilliam; Abhirami Ratnakumar; Wes Barris; Shaying Zhao; Jyoti Shetty; Jillian F Maddox; Margaret O'Grady; Frank Nicholas; Allan M Crawford; Tim Smith; Pieter J de Jong; John McEwan; V Hutton Oddy; Noelle E Cockett

    2007-01-01

    BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the

  15. Mining for single nucleotide polymorphisms in pig genome sequence data

    PubMed Central

    Kerstens, Hindrik HD; Kollers, Sonja; Kommadath, Arun; del Rosario, Marisol; Dibbits, Bert; Kinders, Sylvia M; Crooijmans, Richard P; Groenen, Martien AM

    2009-01-01

    Background Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace repository with center name "SDJVP" and project name "Sino-Danish Pig Genome Project" were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from the Sanger Institute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represent an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel. Conclusion This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequence repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation. PMID:19126189
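The distinction this abstract draws between SNPs seen in one read and SNPs seen in two reads, together with the reported benchmark, can be sketched as follows. This is not the authors' pipeline; `classify_column` is a hypothetical helper that looks at a single alignment column, and only the counts 134 and 163 come from the abstract.

```python
from collections import Counter


def classify_column(reference_base, read_bases):
    """Return (alt_base, support) for the best-supported non-reference
    base in one alignment column, or None if every read matches."""
    counts = Counter(b for b in read_bases
                     if b in "ACGT" and b != reference_base)
    if not counts:
        return None
    return counts.most_common(1)[0]


# Hypothetical columns: reference base plus the bases of overlapping reads.
single_support = classify_column("A", "AAGA")   # alternative allele seen once
double_support = classify_column("A", "AGGA")   # same alternative allele seen twice
no_variant = classify_column("A", "AAA")

# Benchmark reported in the abstract: 134 of 163 tested SNPs were polymorphic.
validation_rate = 134 / 163
```

The validation rate works out to roughly 0.82, consistent with the abstract's statement that more than 80% of the predicted SNPs represented true genetic variation.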

  16. Genome Sequence of Mycobacteriophage Mindy

    PubMed Central

    Bernstein, Nicholas I.; Fasolas, Christina S.; Mezghani, Nadia; Pressimone, Catherine A.; Selvakumar, Priyanga; Stanton, Ann-Catherine J.; Lapin, Jonathan S.; Prout, Ashley K.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Mycobacteriophage Mindy is a newly isolated phage of Mycobacterium smegmatis, recovered from a soil sample in Pittsburgh, Pennsylvania, USA. Mindy has a genome length of 75,796 bp, encodes 147 predicted proteins and two tRNAs, and is closely related to mycobacteriophages in cluster E.

  17. Complete genome sequence of Borrelia crocidurae.

    PubMed

    Elbir, Haitham; Gimenez, Grégory; Robert, Catherine; Bergström, Sven; Cutler, Sally; Raoult, Didier; Drancourt, Michel

    2012-07-01

    We announce the draft genome sequence of Borrelia crocidurae (strain Achema). The 1,557,560-bp genome (27% GC content) comprises one 919,477-bp linear chromosome and 638,083-bp plasmids that together carry 1,472 open reading frames, 32 tRNAs, and three complete rRNAs, with almost complete colinearity between B. crocidurae and Borrelia duttonii chromosomes. PMID:22740657

  18. Accelerating Genome Sequencing 100X with FPGAs

    SciTech Connect

    Storaasli, Olaf O [ORNL; Strenski, Dave [Cray, Inc.

    2007-01-01

    The performance of two Cray XD1 systems with Virtex-II Pro 50 and Virtex-4 LX160 FPGAs was evaluated using the FASTA computational biology program for human genome (DNA and protein) sequence comparisons. FPGA speedups of 50X (Virtex-II Pro 50) and 100X (Virtex-4 LX160) over a 2.2 GHz Opteron were obtained. FPGA coding issues for human genome data are described.

  19. Noninvasive fetal genome sequencing: a primer.

    PubMed

    Snyder, Matthew W; Simmons, LaVone E; Kitzman, Jacob O; Santillan, Donna A; Santillan, Mark K; Gammill, Hilary S; Shendure, Jay

    2013-06-01

    We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path toward clinical adoption of noninvasive fetal whole genome sequencing. PMID:23553552

  20. Noninvasive fetal genome sequencing: a primer

    PubMed Central

    Snyder, Matthew W.; Simmons, LaVone E.; Kitzman, Jacob O.; Santillan, Donna A.; Santillan, Mark K.; Gammill, Hilary S.; Shendure, Jay

    2013-01-01

    We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path towards clinical adoption of noninvasive fetal whole genome sequencing (NIFWGS). PMID:23553552

  1. Controlling Size When Aligning Multiple Genomic Sequences with Duplications

    E-print Network

    Miller, Webb

    - ments in 1% of the human genome. As part of the project, genomic sequence data from a number of mammals ... relationship among aligned sequences to be the same as the phylogenetic tree relating the species for those sequences. A main (and probably the main

  2. Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus

    E-print Network

    Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus Contact: Steven D. africanus genome sequence to allow us to gain insights into the physiological states genomics using the sequence information for D. africanus and the previously sequenced mercury methylator D

  3. Human Genome Project Sequencing, 3D animation with basic narration. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    DNAi Location: Genome>Project>putting it together>Mapping the genome. As represented by this huge stack of paper, the human genome contains more than three billion nucleotides or DNA letters. The first stage of the public Human Genome Project focused on identifying marker sequences or unique tags (shown here in yellow) at regular intervals throughout this "book of life." Once enough sequences were tagged, various blocks of the genome were allocated to different academic centers for sequencing.

  4. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    E-print Network

    Timme, Ruth E.

    2009-01-01

    genomes are crop plants, their complete genome sequence will ... chloroplast genome sequence for any plant within the larger ... sequence of Glycine max and comparative analyses with other legume genomes. Plant

  5. Genome Sequence Assembly Using Trace Signals and Additional Sequence Information

    Microsoft Academic Search

    Bastien Chevreux; Thomas Wetter; Sándor Suhai

    1999-01-01

    Motivation: This article presents a method for assembling shotgun sequences which primarily uses high confidence regions whilst taking advantage of additional available information such as low confidence regions, quality values or repetitive region tags. Conflict situations are resolved with routines for analysing trace signals. Results: Initial tests with different human and mouse genome projects showed promising results but

  6. The first Korean genome sequence and analysis: Full genome sequencing for a socio-ethnic group

    PubMed Central

    Ahn, Sung-Min; Kim, Tae-Hyung; Lee, Sunghoon; Kim, Deokhoon; Ghang, Ho; Kim, Dae-Soo; Kim, Byoung-Chul; Kim, Sang-Yoon; Kim, Woo-Yeon; Kim, Chulhong; Park, Daeui; Lee, Yong Seok; Kim, Sangsoo; Reja, Rohit; Jho, Sungwoong; Kim, Chang Geun; Cha, Ji-Young; Kim, Kyung-Hee; Lee, Bonghee; Bhak, Jong; Kim, Seong-Jin

    2009-01-01

    We present the first Korean individual genome sequence (SJK) and analysis results. The diploid genome of a Korean male was sequenced to 28.95-fold redundancy using the Illumina paired-end sequencing method. SJK covered 99.9% of the NCBI human reference genome. We identified 420,083 novel single nucleotide polymorphisms (SNPs) that are not in the dbSNP database. Despite a close similarity, significant differences were observed between the Chinese genome (YH), the only other Asian genome available, and SJK: (1) 39.87% (1,371,239 out of 3,439,107) SNPs were SJK-specific (49.51% against Venter's, 46.94% against Watson's, and 44.17% against the Yoruba genomes); (2) 99.5% (22,495 out of 22,605) of short indels (< 4 bp) discovered on the same loci had the same size and type as YH; and (3) 11.3% (331 out of 2920) deletion structural variants were SJK-specific. Even after attempting to map unmapped reads of SJK to unanchored NCBI scaffolds, HGSV, and available personal genomes, there were still 5.77% SJK reads that could not be mapped. All these findings indicate that the overall genetic differences among individuals from closely related ethnic groups may be significant. Hence, constructing reference genomes for minor socio-ethnic groups will be useful for massive individual genome sequencing. PMID:19470904
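    The percentages quoted in this abstract can be recomputed from the counts given alongside them. The short Python sketch below does only that arithmetic; the helper name `percent` and the labels in the table are illustrative and do not come from the paper.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# (label, part, whole, decimal places used in the abstract)
# Counts are copied from the abstract; the labels are our own shorthand.
COUNTS = [
    ("SJK-specific SNPs", 1_371_239, 3_439_107, 2),
    ("short indels matching YH in size and type", 22_495, 22_605, 1),
    ("SJK-specific deletion structural variants", 331, 2_920, 1),
]

for label, part, whole, places in COUNTS:
    print(f"{label}: {percent(part, whole):.{places}f}%")
```

    Each recomputed value agrees with the figure reported in the abstract at the stated precision.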

  7. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    PubMed Central

    Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442
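    The final step described in this abstract, deriving a sequence type from the combination of best-matching alleles, might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the locus names, the profile table and the ranking key (alignment coverage first, then percent identity) are all hypothetical, since the abstract says only that the ranking is BLAST-based.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical three-locus scheme; real schemes and profile tables come
# from the MLST databases mentioned in the abstract.
LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")
PROFILES: Dict[Tuple[int, ...], int] = {
    (10, 11, 4): 1,
    (6, 4, 12): 2,
}

# One hit = (allele number, percent identity, fraction of allele covered).
Hit = Tuple[int, float, float]


def best_allele(hits: List[Hit]) -> Optional[int]:
    """Return the top-ranked allele for one locus, or None if no hits.

    Assumed ranking: higher coverage wins, ties broken by identity.
    """
    if not hits:
        return None
    allele, _identity, _coverage = max(hits, key=lambda h: (h[2], h[1]))
    return allele


def sequence_type(hits_by_locus: Dict[str, List[Hit]]) -> Optional[int]:
    """Combine the best allele per locus and look up the sequence type."""
    alleles = tuple(best_allele(hits_by_locus.get(locus, [])) for locus in LOCI)
    if None in alleles:
        return None  # at least one locus had no match
    return PROFILES.get(alleles)  # None if the combination is unknown


example_hits = {
    "locusA": [(10, 100.0, 1.0), (6, 98.5, 1.0)],
    "locusB": [(11, 100.0, 1.0)],
    "locusC": [(4, 99.9, 1.0), (12, 100.0, 0.8)],
}
print(sequence_type(example_hits))
```

    Returning None for a missing locus or an unknown allele combination keeps the two failure cases explicit instead of guessing a nearest profile.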

  8. Comparison of Sample Sequences of the Salmonella typhi Genome to the Sequence of the Complete Escherichia coli K-12 Genome

    Microsoft Academic Search

    Michael McClelland; Richard K. Wilson

    1998-01-01

    Raw sequence data representing the majority of a bacterial genome can be obtained at a tiny fraction of the cost of a completed sequence. To demonstrate the utility of such a resource, 870 single-stranded M13 clones were sequenced from a shotgun library of the Salmonella typhi Ty2 genome. The sequence reads averaged over 400 bases and sampled the genome with

  9. Draft Genome Sequence of Virgibacillus halodenitrificans 1806

    PubMed Central

    Lee, Sang-Jae; Lee, Yong-Jik; Jeong, Haeyoung; Lee, Sang Jun; Lee, Han-Seung; Pan, Jae-Gu

    2012-01-01

    Virgibacillus halodenitrificans 1806 is an endospore-forming halophilic bacterium isolated from salterns in Korea. Here, we report the draft genome sequence of V. halodenitrificans 1806, which may reveal the molecular basis of osmoadaptation and insights into carbon and anaerobic metabolism in moderate halophiles. PMID:23105070

  10. Feature Opinion From complete genome sequence to

    E-print Network

    Levin, Judith G.

    bacteria, 61 archaea, and 23 eukaryotes) were completely sequenced, deposited in the public nucleotide prokaryotic lineage (the Genomic Encyclopedia of Bacteria and Archaea: www.jgi.doe.gov/programs/GEBA/, [4]. Similarly, in structural genomics projects, the chances of discovering a new protein fold or even a new

  11. Hidden ribozymes in eukaryotic genome sequence

    PubMed Central

    2010-01-01

    The small self-cleaving ribozymes fold into complex tertiary structures to promote autocatalytic cleavage or ligation at a precise position within their sequence. Until recently, relatively few examples had been identified. Two papers now reveal that self-cleaving ribozymes are prevalent in eukaryotic genomes and, in some cases, might play a role in regulating gene expression. PMID:20948783

  12. Genome Sequence of Lactobacillus amylovorus GRL1112

    PubMed Central

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M.; Palva, Airi

    2011-01-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  13. Genome sequence of Lactobacillus amylovorus GRL1112.

    PubMed

    Kant, Ravi; Paulin, Lars; Alatalo, Edward; de Vos, Willem M; Palva, Airi

    2011-02-01

    Lactobacillus amylovorus is a common member of the normal gastrointestinal tract (GIT) microbiota in pigs. Here, we report the genome sequence of L. amylovorus GRL1112, a porcine feces isolate displaying strong adherence to the pig intestinal epithelial cells. The strain is of interest, as it is a potential probiotic bacterium. PMID:21131492

  14. Assigning genomic sequences to CATH

    Microsoft Academic Search

    Frances M. G. Pearl; David Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

    2000-01-01

    We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and

  15. Defining Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    Chain, Patrick [DOE-JGI

    2009-05-27

    Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished"

  16. Dominant short repeated sequences in bacterial genomes.

    PubMed

    Avershina, Ekaterina; Rudi, Knut

    2015-03-01

    We use a novel multidimensional searching approach to present the first exhaustive search for all possible repeated sequences in 166 genomes selected to cover the bacterial domain. We found an overrepresentation of repeated sequences in all but one of the genomes. The most prevalent repeats by far were related to interspaced short palindromic repeats (CRISPRs)—conferring bacterial adaptive immunity. We identified a deep branching clade of thermophilic Firmicutes containing the highest number of CRISPR repeats. We also identified a high prevalence of tandem repeated heptamers. In addition, we identified GC-rich repeats that could potentially be involved in recombination events. Finally, we identified repeats in a 16322 amino acid mega protein (involved in biofilm formation) and inverted repeats flanking miniature transposable elements (MITEs). In conclusion, the exhaustive search for repeated sequences identified new elements and distribution of these, which has implications for understanding both the ecology and evolution of bacteria. PMID:25561351

  17. Draft Genome Sequence of Mycobacterium elephantis Strain Lipa

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Yu, Joanna M.; Hsu, Elaine D.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium elephantis strain Lipa from a sputum sample of a patient with pulmonary disease. This is the first draft genome sequence of M. elephantis, a rapidly growing mycobacterium. PMID:26112791

  18. Draft Genome Sequence of Mycobacterium arupense Strain GUC1

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Yu, Joanna M.; Hsu, Elaine D.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium arupense strain GUC1 from a sputum sample of a patient with bronchiectasis. This is the first draft genome sequence of Mycobacterium arupense, a rapidly growing nonchromogenic mycobacterium. PMID:26067970

  19. Overview of PSB track on gene structure identification in large-scale genomic sequence

    Microsoft Academic Search

    E. C. Uberbacher; Y. Xu

    1998-01-01

    The recent funding of more than a dozen major genome centers to begin community-wide high-throughput sequencing of the human genome has created a significant new challenge for the computational analysis of DNA sequence and the prediction of gene structure and function. It has been estimated that on average from 1996 to 2003, approximately 2 million bases of newly finished DNA

  20. Genlight: Interactive high-throughput sequence analysis and comparative genomics

    Microsoft Academic Search

    Michael Beckstette; Jens T. Mailänder; Richard J. Marhöfer; Alexander Sczyrba; Enno Ohlebusch; Robert Giegerich; Paul M. Selzer

    2004-01-01

    With rising numbers of fully sequenced genomes the importance of comparative genomics is constantly increasing. Although several software systems for genome comparison analyses do exist, their functionality and flexibility is still limited, compared to the manifold possible applications. Therefore, we developed Genlight, a Client/Server based program suite for large scale sequence analysis and comparative genomics. Genlight uses

  1. Genome sequencing and analysis of the model grass Brachypodium distachyon

    E-print Network

    Green, Pamela

    ARTICLES Genome sequencing and analysis of the model grass Brachypodium distachyon) and contains three independent genomes8 . This has prohibited genome-scale comparisons spanning the three most describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our

  2. Initial sequencing and comparative analysis of the mouse genome

    Microsoft Academic Search

    Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

    2002-01-01

    The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

  3. The diploid genome sequence of an Asian individual

    Microsoft Academic Search

    Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A. San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the

  4. The Genome Sequence of Drosophila melanogaster

    NSDL National Science Digital Library

    Ramanujan, Krishna.

    On Thursday March 23, 2000, a historic milestone was marked as researchers announced they have completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing super-computers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

  5. Comparative Analysis of Genome Sequences with VISTA

    DOE Data Explorer

    Dubchak, Inna

    VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

  6. PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly

    E-print Network

    Bader, David A.

    An organism's genome consists of base pairs (bp) from two strands of complementary bases. Reading a sequence ... of genomes has been revolutionized by sequencing machines that output many short overlapping substrings

  7. Data structures and compression algorithms for genomic sequence data

    Microsoft Academic Search

    Marty C. Brandon; Douglas C. Wallace; Pierre Baldi

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function, and evolution, but also for the storage, navigation, and privacy of genomic data. Here we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and

  8. The Norway spruce genome sequence and conifer genome evolution.

    PubMed

    Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

    2013-05-30

    Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

  9. Functional genomics of tomato in a post-genome-sequencing phase

    PubMed Central

    Aoki, Koh; Ogata, Yoshiyuki; Igarashi, Kaori; Yano, Kentaro; Nagasaki, Hideki; Kaminuma, Eli; Toyoda, Atsushi

    2013-01-01

    Completion of the tomato genome sequencing project has broad impacts on genetic and genomic studies of tomato and Solanaceae plants. The reference genome sequence derived from Solanum lycopersicum cv ‘Heinz 1706’ serves as the firm basis for sequencing-based approaches to tomato genomics. In this article, we first present a brief summary of the genome sequencing project and a summary of the reference genome sequence. We then focus on recent progress in transcriptome sequencing and small RNA sequencing and show how the reference genome sequence makes these analyses more comprehensive than before. We discuss the potential of in-depth analysis that is based on DNA methylome sequencing and transcription start-site detection. Finally, we describe the current status of efforts to resequence S. lycopersicum cultivars to demonstrate how resequencing can allow the use of intraspecific genomic diversity for detailed phenotyping and breeding. PMID:23641177

  10. Complete genome sequence of Candidatus Ruthia magnifica.

    PubMed

    Roeselers, Guus; Newton, Irene L G; Woyke, Tanja; Auchtung, Thomas A; Dilly, Geoffrey F; Dutton, Rachel J; Fisher, Meredith C; Fontanez, Kristina M; Lau, Evan; Stewart, Frank J; Richardson, Paul M; Barry, Kerrie W; Saunders, Elizabeth; Detter, John C; Wu, Dongying; Eisen, Jonathan A; Cavanaugh, Colleen M

    2010-01-01

    The hydrothermal vent clam Calyptogena magnifica (Bivalvia: Mollusca) is a member of the Vesicomyidae. Species within this family form symbioses with chemosynthetic Gammaproteobacteria. They exist in environments such as hydrothermal vents and cold seeps and have a rudimentary gut and feeding groove, indicating a large dependence on their endosymbionts for nutrition. The C. magnifica symbiont, Candidatus Ruthia magnifica, was the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced (Newton et al. 2007). Here we expand upon the original report and provide additional details complying with the emerging MIGS/MIMS standards. The complete genome exposed the genetic blueprint of the metabolic capabilities of the symbiont. Genes which were predicted to encode the proteins required for all the metabolic pathways typical of free-living chemoautotrophs were detected in the symbiont genome. These include major pathways including carbon fixation, sulfur oxidation, nitrogen assimilation, as well as amino acid and cofactor/vitamin biosynthesis. This genome sequence is invaluable in the study of these enigmatic associations and provides insights into the origin and evolution of autotrophic endosymbiosis. PMID:21304746

  11. Genome sequence of Leuconostoc pseudomesenteroides KCTC 3652.

    PubMed

    Kim, Dong-Wook; Choi, Sang-Haeng; Kang, Aram; Nam, Seong-Hyeuk; Kim, Ryong Nam; Kim, Aeri; Kim, Dae-Soo; Park, Hong-Seog

    2011-08-01

    We announce the genome sequence of one of the most prevalent lactic acid bacteria present during the manufacturing process of cane juice, the type strain Leuconostoc pseudomesenteroides KCTC 3652 (3,244,985 bp, with a G+C content of 38.3%), which consists of 1,160 large contigs (>100 bp in size). All of the contigs were assembled by the Newbler Assembler 2.3 software program (454 Life Sciences). PMID:21705609
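    A G+C content such as the 38.3% quoted above is the share of G and C bases among all called bases in the assembly. The Python sketch below shows one way to compute it across a set of contigs; the function name `gc_content` and the toy contigs are illustrative assumptions, not data from this genome, and ambiguous bases such as N are simply left out of the denominator.

```python
from typing import Iterable


def gc_content(contigs: Iterable[str]) -> float:
    """Return the G+C percentage over all A/C/G/T bases in the contigs."""
    gc = 0
    total = 0
    for contig in contigs:
        seq = contig.upper()
        g_and_c = seq.count("G") + seq.count("C")
        a_and_t = seq.count("A") + seq.count("T")
        gc += g_and_c
        total += g_and_c + a_and_t  # ambiguous bases (e.g. N) are skipped
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / total


# Toy contigs for illustration only.
toy_contigs = ["ATGCATTA", "ggccNNat"]
print(f"G+C content: {gc_content(toy_contigs):.1f}%")
```

    Pooling counts over all contigs, instead of averaging per-contig percentages, weights each contig by its length.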

  12. The genome sequence of Schizosaccharomyces pombe

    Microsoft Academic Search

    R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

    2002-01-01

    We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

  13. Why Assembling Plant Genome Sequences Is So Challenging

    PubMed Central

    Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

    2012-01-01

    In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

  14. Genome sequence of Halobacterium species NRC-1

    PubMed Central

    Ng, Wailap Victor; Kennedy, Sean P.; Mahairas, Gregory G.; Berquist, Brian; Pan, Min; Shukla, Hem Dutt; Lasky, Stephen R.; Baliga, Nitin S.; Thorsson, Vesteinn; Sbrogna, Jennifer; Swartzell, Steven; Weir, Douglas; Hall, John; Dahl, Timothy A.; Welti, Russell; Goo, Young Ah; Leithauser, Brent; Keller, Kim; Cruz, Randy; Danson, Michael J.; Hough, David W.; Maddocks, Deborah G.; Jablonski, Peter E.; Krebs, Mark P.; Angevine, Christine M.; Dale, Heather; Isenbarger, Thomas A.; Peck, Ronald F.; Pohlschroder, Mechthild; Spudich, John L.; Jung, Kwang-Hwan; Alam, Maqsudul; Freitas, Tracey; Hou, Shaobin; Daniels, Charles J.; Dennis, Patrick P.; Omer, Arina D.; Ebhardt, Holger; Lowe, Todd M.; Liang, Ping; Riley, Monica; Hood, Leroy; DasSarma, Shiladitya

    2000-01-01

    We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea. PMID:11016950

  15. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Microsoft Academic Search

    Frank M You; Naxin Huo; Karin R Deal; Yong Q Gu; Ming-Cheng Luo; Patrick E McGuire; Jan Dvorak; Olin D Anderson

    2011-01-01

    BACKGROUND: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS)

  16. Whole genome sequence (WGS) analysis for exploring plant relationships

    Microsoft Academic Search

    Nicole F Rice; Giovanni M Cordeiro; Catherine J Nock; Daniel LE Waters; Stirling Bowen; Robert J Henry

    2010-01-01

    Shotgun sequencing plant genomic DNA preparations generates large quantities of sequence data in a single run. Using the Illumina GAII, whole genome shot-gun sequence (WGS) data was generated for Oryza sativa cv Nipponbare, and the rice wild relatives Oryza meridionalis and Oryza australiensis. Two other grass species were also sequenced, Potamophila parviflora, from the Oryzeae tribe and Microlaena stipoides from

  17. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    E-print Network

    Paris-Sud XI, Université de

    Genome Sequence of the Pea Aphid Acyrthosiphon pisum The International Aphid Genomics Consortium we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple

  18. Ten years of bacterial genome sequencing: comparative-genomics-based discoveries

    Microsoft Academic Search

    Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

    2006-01-01

    It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

  19. Genome, Epigenome and RNA sequences of Monozygotic Twins Discordant for Multiple Sclerosis

    SciTech Connect

    Miller, Neil [National Center for Genome Resources

    2010-06-02

    Neil Miller, Deputy Director of Software Engineering at the National Center for Genome Resources, discusses a monozygotic twin study on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  20. The International Rice Genome Sequencing Project: progress and prospects

    Microsoft Academic Search

    T. Sasaki; T. Matsumoto; T. Baba; K. Yamamoto; J. Wu; Y. Katayose; K. Sakata

    The rice genome sequencing project has been pursued as a national project in Japan since 1998. At the same time, a desire to accelerate the sequencing of the entire rice genome led to the formation of the International Rice Genome Sequencing Project (IRGSP), initially comprising five countries. The sequencing strategy is the conventional clone-by-clone shotgun method using P1-derived

  1. The Jackson Laboratory: The Mouse Genome Sequence Project

    NSDL National Science Digital Library

    Part of the Mouse Genome Informatics program (last reported on in the NSDL Scout Report for the Life Sciences on March 19, 2004) at the Jackson Laboratory, this website presents The Mouse Genome Sequence (MGS) project. MGS is designed "to integrate emerging mouse genomic sequence data with the genetic and biological data available in MGD and GXD." The site links to Eukaryotic Genome Annotation Projects, as well as Sequence Analysis Tools including MouseBlast and Genome Analysis. The site also offers basic background information about the Mouse Genome Sequencing Initiative, and provides site users with access to groups involved in mouse genome sequencing, the BAC clone library, request forms for targeted sequencing, and more.

  2. Genome sequence of the Brown Norway rat yields insights into

    E-print Network

    Pachter, Lior

    Genome sequence of the Brown Norway rat yields insights into mammalian evolution Rat Genome ... The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development ... Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome

  3. Statistical Properties of Open Reading Frames in Complete Genome Sequences

    Microsoft Academic Search

    Wentian Li

    1999-01-01

    Some statistical properties of open reading frames in all currently available complete genome sequences are analyzed (seventeen prokaryotic genomes, and 16 chromosome sequences from the yeast genome). The size distribution of open reading frames is characterized by various techniques, such as quantile tables, QQ-plots, rank-size plots (Zipf's plots), and spatial densities. The issue of the influence of CG% on

  4. Analysis of Singleton ORFans in Fully Sequenced Microbial Genomes

    E-print Network

    Fischer, Daniel

    Analysis of Singleton ORFans in Fully Sequenced Microbial Genomes Naomi Siew1,2 and Daniel Fischer2 analysis of singleton ORFans in the first 60 fully sequenced microbial genomes. We show that although as more genomes of closely related organisms become available. To better address the questions about

  5. Combined Evidence Annotation of Transposable Elements in Genome Sequences

    Microsoft Academic Search

    Hadi Quesneville; Olivier Andrieu; Delphine Autard; Danielle Nouaud; Michael Ashburner; Dominique Anxolabehere

    2005-01-01

    Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from

  6. Sequencing by Hybridization - A Simulation Study of Performance on Genomic Sequences

    E-print Network

    Shamir, Ron

    were randomly generated or randomly selected from the genomic databases of: a) S. cerevisiae, b) E. coli ... SBH Performance on Genomic Sequences 1 Sequencing by Hybridization - A Simulation Study of Performance on Genomic Sequences Doron Lipson1, Ziv Nevo, Ari Frank, Dolev Dotan, Zohar Yakhini2 Computer

  7. Simple sequence repeats in bryophyte mitochondrial genomes.

    PubMed

    Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

    2014-02-01

    Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution: one is present in both liverworts and mosses, and all others are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts with high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene functions. PMID:24491104
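
    The abstract above does not say how the SSRs were detected. As a minimal sketch of the general idea, a regular expression with a backreference can locate tandem repeats of a short unit. The function name `find_ssrs` and the thresholds (unit length up to 6 bp, at least 3 copies, at least 10 bp in total) are illustrative assumptions, not the criteria used in the study.

```python
import re

def find_ssrs(seq, max_unit=6, min_repeats=3, min_length=10):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    Thresholds are hypothetical defaults chosen for illustration only.
    """
    seq = seq.upper()
    # ([ACGT]{1,max_unit}?) captures the shortest repeat unit that works;
    # \1{min_repeats-1,} requires that unit to recur immediately after itself.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        tract = match.group(0)
        unit = match.group(1)
        if len(tract) >= min_length:
            hits.append((match.start(), unit, len(tract) // len(unit)))
    return hits

if __name__ == "__main__":
    # Toy sequence containing a dinucleotide (AT) repeat.
    print(find_ssrs("GGCATATATATATATCC"))
```

    Dedicated SSR tools usually apply different minimum copy numbers per unit length; the single threshold here keeps the sketch short.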

  8. Porcine parvovirus: DNA sequence and genome organization.

    PubMed

    Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

    1989-10-01

    We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV. PMID:2794971

  9. Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

    Microsoft Academic Search

    Robert K. Jansen; Linda A. Raubeson; Jeffrey L. Boore; Claude W. dePamphilis; Timothy W. Chumley; Rosemarie C. Haberle; Stacia K. Wyman; Andrew J. Alverson; Rhiannon Peery; Sallie J. Herman; H. Matthew Fourcade; Jennifer V. Kuehl; Joel R. McNeal; James Leebens-Mack; Liying Cui

    2005-01-01

    During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have

  10. Initial sequencing and comparative analysis of the mouse genome

    E-print Network

    Eddy, Sean

    and knockin techniques17-22. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

  11. Genomic Sequence Comparisons, 1987-2003 Final Report

    SciTech Connect

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  12. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895

    PubMed Central

    Melnikov, Vyacheslav G.; Chistyakov, Vladimir A.

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

  13. Draft Genome Sequence of Bacillus amyloliquefaciens B-1895.

    PubMed

    Karlyshev, Andrey V; Melnikov, Vyacheslav G; Chistyakov, Vladimir A

    2014-01-01

    In this report, we present a draft genome sequence of Bacillus amyloliquefaciens strain B-1895. Comparison with the genome of a reference strain demonstrated similar overall organization, as well as differences involving large gene clusters. PMID:24948774

  14. Genome sequencing of the important oilseed crop Sesamum indicum L

    PubMed Central

    2013-01-01

    The Sesame Genome Working Group (SGWG) has been formed to sequence and assemble the sesame (Sesamum indicum L.) genome. The status of this project and our planned analyses are described. PMID:23369264

  15. Initial impact of the sequencing of the human genome

    E-print Network

    Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.; Lander, Eric S.

    The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

  16. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter [University of Chicago

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  17. Validation of rice genome sequence by optical mapping

    Microsoft Academic Search

    Shiguo Zhou; Michael C Bechner; Chris P Churas; Louise Pape; Sally A Leong; Rod Runnheim; Dan K Forrest; Steve Goldstein; Miron Livny; David C Schwartz

    2007-01-01

    BACKGROUND: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. RESULTS: To facilitate ongoing sequencing finishing and validation

  18. Draft Genome Sequence of the Archiascomycetous Yeast Saitoella complicata.

    PubMed

    Yamauchi, Kenta; Kondo, Shinji; Hamamoto, Makiko; Takahashi, Yurika; Ogura, Yoshitoshi; Hayashi, Tetsuya; Nishida, Hiromi

    2015-01-01

    The draft genome sequence of the archiascomycetous yeast Saitoella complicata was determined. The assembly of newly and previously sequenced data sets resulted in 104 contigs (total of 14.1 Mbp; N50, 239 kbp). On the newly assembled genome, a total of 6,933 protein-coding sequences (7,119 transcripts, including alternative splicing forms) were identified. PMID:26021914
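
    The N50 value quoted above is a standard contiguity statistic: the contig length L such that contigs of length L or longer together contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the toy contig lengths are illustrative and are not data from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the cumulative length
        # reaches half of the total assembly size.
        if 2 * running >= total:
            return length
    return 0  # empty input

if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths
    print(n50(toy_contigs))
```

    Comparing `2 * running` with `total` keeps the arithmetic in integers and avoids rounding at the halfway point.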

  19. Draft Genome Sequence of the Archiascomycetous Yeast Saitoella complicata

    PubMed Central

    Yamauchi, Kenta; Hamamoto, Makiko; Takahashi, Yurika; Ogura, Yoshitoshi; Hayashi, Tetsuya

    2015-01-01

    The draft genome sequence of the archiascomycetous yeast Saitoella complicata was determined. The assembly of newly and previously sequenced data sets resulted in 104 contigs (total of 14.1 Mbp; N50, 239 kbp). On the newly assembled genome, a total of 6,933 protein-coding sequences (7,119 transcripts, including alternative splicing forms) were identified. PMID:26021914

  20. Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)

    SciTech Connect

    Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Teshima, Hazuki [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. 
Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

    2010-01-01

    Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  1. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

  2. Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

    E-print Network

    2007-01-01

    genome sequences and make comparisons (within angiosperms, seed plants, ... genome sequence from Korean Ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. ... genomes of all other vascular plant taxa examined, a similar sequence

  3. Comparative DNA Sequence Analysis of Wheat and Rice Genomes

    PubMed Central

    Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

    2003-01-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  4. Comparative DNA sequence analysis of wheat and rice genomes.

    PubMed

    Sorrells, Mark E; La Rota, Mauricio; Bermudez-Kandianis, Catherine E; Greene, Robert A; Kantety, Ramesh; Munkvold, Jesse D; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J; Qi, Lili L; Echalier, Benjamin; Gill, Bikram S; Matthews, David E; Lazo, Gerard R; Chao, Shiaoman; Anderson, Olin D; Edwards, Hugh; Linkiewicz, Anna M; Dubcovsky, Jorge; Akhunov, Eduard D; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T; Peng, Junhua; Lapitan, Nora L V; Gonzalez-Hernandez, Jose L; Anderson, James A; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F; Choi, Dong-Woog; Close, Timothy J; Dilbirligi, Muharrem; Gill, Kulvinder S; Steber, Camille; Walker-Simmons, Mary K; McGuire, Patrick E; Qualset, Calvin O

    2003-08-01

    The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

  5. Sequencing and Assembly of the 22-Gb Loblolly Pine Genome

    PubMed Central

    Zimin, Aleksey; Stevens, Kristian A.; Crepeau, Marc W.; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L.; de Jong, Pieter J.; Neale, David B.; Salzberg, Steven L.; Yorke, James A.; Langley, Charles H.

    2014-01-01

    Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210

  6. The reference genome sequence of Saccharomyces cerevisiae: then and now.

    PubMed

    Engel, Stacia R; Dietrich, Fred S; Fisk, Dianna G; Binkley, Gail; Balakrishnan, Rama; Costanzo, Maria C; Dwight, Selina S; Hitz, Benjamin C; Karra, Kalpana; Nash, Robert S; Weng, Shuai; Wong, Edith D; Lloyd, Paul; Skrzypek, Marek S; Miyasato, Stuart R; Simison, Matt; Cherry, J Michael

    2014-03-01

    The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called "S288C 2010," was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science. PMID:24374639

  7. University of Tokyo-Institute of Medical Science: Human Genome Center

    NSDL National Science Digital Library

    The Human Genome Center was established in 1991 at the University of Tokyo's Institute of Medical Science. In pursuit of progress in the areas of human disease diagnosis, care, and prevention, the Center conducts genome research in Japan and participates in "international activities in database construction, mapping, and sequencing of the human genome." The Genome Center website contains links to its nine Laboratories which conduct research in the following areas: Genome Structure, Sequence Analysis, Molecular Medicine, and DNA Information Analysis, to name a few. Laboratory pages contain information about research, publications, staff, and services. The Center site also links to a number of databases and software tools including a database of Japanese Single Nucleotide Polymorphisms (JSNP), Microbial Genome Database for Comparative Analysis (MBGD), PSI-BLAST, TFBIND (software for searching transcription factor binding sites), and more.

  8. Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes

    PubMed Central

    Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

    2013-01-01

    A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

  9. APPLIED GENOMICS TECHNOLOGY CENTER www.agtc.med.wayne.edu

    E-print Network

    Berdichevsky, Victor

    APPLIED GENOMICS TECHNOLOGY CENTER www.agtc.med.wayne.edu CURRENT SERVICES CONTACT INFORMATION Dr. Susan J. Land, Ph.D. Laboratory Director ABOUT THE FACILITY The Applied Genomics Technology Center (AGTC-of-the-art, fee-for-service genomics center that provides a wide range of genomic technologies to the medical

  10. Complete genome sequences of cellular life forms: glimpses of theoretical evolutionary genomics

    Microsoft Academic Search

    Eugene V Koonin; Arcady R Mushegian

    1996-01-01

    The availability of complete genome sequences of cellular life forms creates the opportunity to explore the functional content of the genomes and evolutionary relationships between them at a new qualitative level. With the advent of these sequences, the construction of a minimal gene set sufficient for sustaining cellular life and reconstruction of the genome of the last common ancestor of

  11. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    Microsoft Academic Search

    Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

    2002-01-01

    Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D.

  12. Insights from twenty years of bacterial genome sequencing

    SciTech Connect

    Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jun, Se Ran [ORNL; Nookaew, Intawat [ORNL; Leuze, Michael Rex [ORNL; Ahn, Tae-Hyuk [ORNL; Karpinets, Tatiana V [ORNL; Lund, Ole [Technical University of Denmark; Kora, Guruprasad H [ORNL; Wassenaar, Trudy [Molecular Microbiology & Genomics Consultants, Zotzenheim, Germany; Poudel, Suresh [ORNL; Ussery, David W [ORNL

    2015-01-01

    Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. 
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
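
    The pan- and core-genome calculation mentioned in this abstract can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers; the function name, strain names, and family labels below are hypothetical and are not taken from the review. The pan-genome is taken as the union of all families, the core genome as their intersection.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan, core) gene-family sets for a collection of genomes.

    pan  = families found in at least one genome (union)
    core = families found in every genome (intersection)
    """
    if not genomes:
        return set(), set()
    families = list(genomes.values())
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core


# Hypothetical toy data: three strains, each a set of gene-family IDs.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5"},
}

pan, core = pan_and_core(toy)
print(len(pan), len(core))  # 5 2
```

    In practice the hard part is the upstream clustering of genes into families, which this sketch takes as given.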

  13. First complete genome sequence of infectious laryngotracheitis virus

    Microsoft Academic Search

    Sang-Won Lee; Philip F Markham; John F Markham; Ivonne Petermann; Amir H Noormohammadi; Glenn F Browning; Nino P Ficorilli; Carol A Hartley; Joanne M Devlin

    2011-01-01

    Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed

  14. Compressing Genomic Sequence Fragments Using SlimGene

    Microsoft Academic Search

    Christos Kozanitis; Chris Saunders; Semyon Kruglyak; Vineet Bafna; George Varghese

    2010-01-01

    With the advent of next generation sequencing technologies, the cost of sequencing whole genomes is poised to go below $1000 per human individual in a few years. As more and more genomes are sequenced, analysis methods are undergoing rapid development, making it tempting to store sequencing data for long periods of time so that the data can be re-analyzed with

  15. Sequencing the Human Genome: A Historical Perspective on Challenges for Systems Integration

    Microsoft Academic Search

    Lee Rowen

    The sequence of the human genome was declared finished on April 14, 2003. Analyses have been published in the journal Nature for chromosomes 6, 7, 14, 20, 21, 22 and Y, with the other chromosomes to follow in 2004. Although the Human Genome Project officially began in 1990, most of the publicly accessible sequence data were produced by 20 genome centers in six

  16. Genome Project Standards in a New Era of Sequencing

    SciTech Connect

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft', however these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences.
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

  17. Whole genome sequencing in support of wellness and health maintenance

    PubMed Central

    2013-01-01

    Background Whole genome sequencing is poised to revolutionize personalized medicine, providing the capacity to classify individuals into risk categories for a wide range of diseases. Here we begin to explore how whole genome sequencing (WGS) might be incorporated alongside traditional clinical evaluation as a part of preventive medicine. The present study illustrates novel approaches for integrating genotypic and clinical information for assessment of generalized health risks and to assist individuals in the promotion of wellness and maintenance of good health. Methods Whole genome sequences and longitudinal clinical profiles are described for eight middle-aged Caucasian participants (four men and four women) from the Center for Health Discovery and Well Being (CHDWB) at Emory University in Atlanta. We report multivariate genotypic risk assessments derived from common variants reported by genome-wide association studies (GWAS), as well as clinical measures in the domains of immune, metabolic, cardiovascular, musculoskeletal, respiratory, and mental health. Results Polygenic risk is assessed for each participant for over 100 diseases and reported relative to baseline population prevalence. Two approaches for combining clinical and genetic profiles for the purposes of health assessment are then presented. First we propose conditioning individual disease risk assessments on observed clinical status for type 2 diabetes, coronary artery disease, hypertriglyceridemia and hypertension, and obesity. An approximate 2:1 ratio of concordance between genetic prediction and observed sub-clinical disease is observed. Subsequently, we show how more holistic combination of genetic, clinical and family history data can be achieved by visualizing risk in eight sub-classes of disease. 
Having identified where their profiles are broadly concordant or discordant, an individual can focus on individual clinical results or genotypes as they develop personalized health action plans in consultation with a health partner or coach. Conclusion The CHDWB will facilitate longitudinal evaluation of wellness-focused medical care based on comprehensive self-knowledge of medical risks. PMID:23806097

  18. Finishing The Euchromatic Sequence Of The Human Genome

    SciTech Connect

    Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

    2004-09-07

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
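
    The figures quoted in this abstract can be turned into back-of-the-envelope numbers. The sketch below is only illustrative arithmetic on the stated values (2.85 billion nucleotides, 341 gaps, about 1 error event per 100,000 bases); the function names are hypothetical, and the per-gap average ignores chromosome boundaries.

```python
def expected_error_events(n_bases: int, bases_per_error: int) -> float:
    """Expected number of error events at a rate of 1 per `bases_per_error` bases."""
    return n_bases / bases_per_error


def mean_bases_per_gap(n_bases: int, n_gaps: int) -> float:
    """Average amount of sequence per gap (a crude contiguity measure)."""
    return n_bases / n_gaps


BUILD35_BASES = 2_850_000_000  # 2.85 billion nucleotides, as stated in the abstract
BUILD35_GAPS = 341             # gaps, as stated in the abstract
BASES_PER_ERROR = 100_000      # about 1 error event per 100,000 bases

errors = expected_error_events(BUILD35_BASES, BASES_PER_ERROR)
per_gap_mb = mean_bases_per_gap(BUILD35_BASES, BUILD35_GAPS) / 1e6

print(errors)                # 28500.0
print(round(per_gap_mb, 2))  # 8.36
```

    On these assumptions the finished sequence would carry on the order of 28,500 error events genome-wide, with roughly 8.4 megabases of sequence per remaining gap.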

  19. Accurate whole human genome sequencing using reversible terminator chemistry

    Microsoft Academic Search

    David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

    2008-01-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

  20. Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains

    PubMed Central

    Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon

    2013-01-01

    We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. Clarithromycin (CLR)-resistant strains were obtained under the exposure of H. pylori strain 26695 on agar plates with low clarithromycin concentrations. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro. PMID:24233587

  1. Complete Genome Sequences of Helicobacter pylori Clarithromycin-Resistant Strains.

    PubMed

    Binh, Tran Thanh; Suzuki, Rumiko; Shiota, Seiji; Kwon, Dong Hyeon; Yamaoka, Yoshio

    2013-01-01

    We report the complete genome sequences of two Helicobacter pylori clarithromycin-resistant strains. Clarithromycin (CLR)-resistant strains were obtained under the exposure of H. pylori strain 26695 on agar plates with low clarithromycin concentrations. The genome data provide insights into the genomic changes of H. pylori under selection by clarithromycin in vitro. PMID:24233587

  2. Complete Genome Sequences of Helicobacter pylori Rifampin-Resistant Strains

    PubMed Central

    Chelysheva, Vera; Selezneva, Oksana; Akopian, Tatyana; Alexeev, Dmitry; Govorun, Vadim

    2013-01-01

    Here we present the complete genome sequences of two Helicobacter pylori rifampin-resistant (Rifr) strains (Rif1 and Rif2). Rifr strains were obtained by in vitro selection of H. pylori 26695 on agar plates with 20 µg/ml rifampin. The genome data provide insights on the genomic diversity of H. pylori under selection by rifampin. PMID:23833139

  3. On the sequencing of the human genome Robert H. Waterston*

    E-print Network

    Batzoglou, Serafim

    . The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics. One was the product of the international Human Genome Project (HGP), and the other was the product On the sequencing of the human genome Robert H. Waterston*, Eric S. Lander, and John E. Sulston

  4. SEQUENCING THE PIG GENOME USING A BAC BY BAC APPROACH

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have generated a highly contiguous physical map covering >98% of the pig genome in just 176 contigs. The map is localized to the genome through integration with the UIVC RH map as well as BAC end sequence alignments to the human genome. Over 265k HindIII restriction digest fingerprints totaling 16.2...

  5. Genome-level homology and phylogeny of Vibrionaceae (Gammaproteobacteria: Vibrionales) with three new complete genome sequences

    E-print Network

    Dikow, R. B.; Smith, William Leo

    2013-04-11

    Background Phylogenetic hypotheses based on complete genome data are presented for the Gammaproteobacteria family Vibrionaceae. Two taxon samplings are presented: one including all those taxa for which the genome sequences are complete in terms...

  6. On the current status of Phakopsora pachyrhizi genome sequencing

    PubMed Central

    Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

    2014-01-01

    Recent advances in the field of sequencing technologies and bioinformatics allow a more rapid access to genomes of non-model organisms at sinking costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

  7. Draft Genome Sequence of Mycobacterium heraklionense Strain Davo.

    PubMed

    Greninger, Alexander L; Cunningham, Gail; Chiu, Charles Y; Miller, Steve

    2015-01-01

    We report the draft genome sequence of Mycobacterium heraklionense strain Davo, isolated from a fine-needle aspirate of a right-ankle soft-tissue mass. This is the first draft genome sequence of Mycobacterium heraklionense, a nonpigmented rapidly growing mycobacterium. PMID:26205863

  8. Draft Genome Sequence of Tannerella forsythia Type Strain ATCC 43037.

    PubMed

    Friedrich, Valentin; Pabinger, Stephan; Chen, Tsute; Messner, Paul; Dewhirst, Floyd E; Schäffer, Christina

    2015-01-01

    Tannerella forsythia is an oral pathogen implicated in the development of periodontitis. Here, we report the draft genome sequence of the Tannerella forsythia strain ATCC 43037. The previously available genome of this designation (NCBI reference sequence NC_016610.1) was discovered to be derived from a different strain, FDC 92A2 (= ATCC BAA-2717). PMID:26067981

  9. Complete genome sequence of chinese strain of ‘Candidatus Liberibacter asiaticus’

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...

  10. Draft Genome Sequence of Neurospora crassa Strain FGSC 73.

    PubMed

    Baker, Scott E; Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V; Simmons, Blake A; McCluskey, Kevin

    2015-01-01

    We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models. PMID:25838471

  11. Draft Genome Sequence of Xanthomonas sacchari Strain LMG 476.

    PubMed

    Pieretti, Isabelle; Bolot, Stéphanie; Carrère, Sébastien; Barbe, Valérie; Cociancich, Stéphane; Rott, Philippe; Royer, Monique

    2015-01-01

    We report the high-quality draft genome sequence of Xanthomonas sacchari strain LMG 476, isolated from sugarcane. The genome comparison of this strain with a previously sequenced X. sacchari strain isolated from a distinct environmental source should provide further insights into the adaptation of this species to different habitats and its evolution. PMID:25792064

  12. Draft Genome Sequence of Aspergillus oryzae Strain 3.042

    PubMed Central

    Zhao, Guozhong; Yao, Yunping; Qi, Wei; Wang, Chunling; Hou, Lihua; Zeng, Bin

    2012-01-01

    Aspergillus oryzae is the most important fungus for the traditional fermentation in China and is particularly important in soy sauce fermentation. We report the 36,547,279-bp draft genome sequence of A. oryzae 3.042 and compared it to the published genome sequence of A. oryzae RIB40. PMID:22933657

  13. Sequence and comparative analysis of the chicken genome provide unique

    E-print Network

    Edwards, Scott

    evolution International Chicken Genome Sequencing Consortium* *Lists of participants and affiliations appear is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced the Aves, their Mesozoic dinosaur predecessors, and Crocodilia; the Lepidosauria (lizards, snakes

  14. Draft Genome Sequence of Tannerella forsythia Type Strain ATCC 43037

    PubMed Central

    Friedrich, Valentin; Pabinger, Stephan; Chen, Tsute; Messner, Paul; Dewhirst, Floyd E.

    2015-01-01

    Tannerella forsythia is an oral pathogen implicated in the development of periodontitis. Here, we report the draft genome sequence of the Tannerella forsythia strain ATCC 43037. The previously available genome of this designation (NCBI reference sequence NC_016610.1) was discovered to be derived from a different strain, FDC 92A2 (= ATCC BAA-2717). PMID:26067981

  15. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii

    PubMed Central

    Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  16. Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.

    PubMed

    Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

    2013-01-01

    Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

  17. Complete Genome Sequence of Melissococcus plutonius ATCC 35311

    PubMed Central

    Okumura, Kayo; Arai, Rie; Okura, Masatoshi; Kirikae, Teruo; Takamatsu, Daisuke; Osaki, Makoto; Miyoshi-Akiyama, Tohru

    2011-01-01

    We report the first completely annotated genome sequence of Melissococcus plutonius ATCC 35311. M. plutonius is a one-genus, one-species bacterium and the etiological agent of European foulbrood of the honeybee. The genome sequence will provide new insights into the molecular mechanisms underlying its pathogenicity. PMID:21622755

  18. Complete Genome Sequence of Burkholderia cepacia Strain LO6.

    PubMed

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai; Hoang, Tung T

    2015-01-01

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections. PMID:26067955

  19. Complete Genome Sequence of Burkholderia cepacia Strain LO6

    PubMed Central

    Belcaid, Mahdi; Kang, Yun; Tuanyok, Apichai

    2015-01-01

    Burkholderia cepacia strain LO6 is a betaproteobacterium that was isolated from a cystic fibrosis patient. Here we report the 6.4 Mb draft genome sequence assembled into 2 contigs. This genome sequence will aid the transcriptomic profiling of this bacterium and help us to better understand the mechanisms specific to pulmonary infections. PMID:26067955

  20. Use of Whole Genome Sequence Data To Infer Baculovirus Phylogeny

    Microsoft Academic Search

    ELISABETH A. HERNIOU; TERESA LUQUE; XINWEN CHEN; JUST M. VLAK; DOREEN WINSTANLEY; JENNIFER S. CORY; D. R. O'Reilly

    2001-01-01

    Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance

  1. Initial sequencing and analysis of the human genome

    Microsoft Academic Search

    Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

    2001-01-01

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  2. Draft Genome Sequence of Tolypothrix boutellei Strain VB521301

    PubMed Central

    Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

    2015-01-01

    We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes. PMID:25700407

  3. De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.

    PubMed

    Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

    2015-01-01

    The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

  4. Enhancing genome assemblies by integrating non-sequence based data

    Microsoft Academic Search

    Thomas N Heider; James Lindsay; Chenwei Wang; Rachel J O’Neill; Andrew J Pask

    2011-01-01

    Introduction Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables

  5. The human genome sequence: impact on health care

    Microsoft Academic Search

    M. D. Bashyam; S. E. Hasnain

    2003-01-01

    The recent sequencing of the human genome, resulting from two independent global efforts, is poised to revolutionize all aspects of human health. This landmark achievement has also vindicated two different methodologies that can now be used to target other important large genomes. The human genome sequence has revealed several novel/surprising features, notably the probable presence of a mere 30-35,000 genes.

  6. Genome sequence of the human malaria parasite Plasmodium falciparum

    Microsoft Academic Search

    Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

    2002-01-01

    The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

  7. Single-molecule DNA sequencing technologies for future genomics research.

    PubMed

    Gupta, Pushpendra K

    2008-11-01

    During the current genomics revolution, the genomes of a large number of living organisms have been fully sequenced. However, with the advent of new sequencing technologies, genomics research is now at the threshold of a second revolution. Several second-generation sequencing platforms became available in 2007, but a further revolution in DNA resequencing technologies is being witnessed in 2008, with the launch of the first single-molecule DNA sequencer (Helicos Biosciences), which has already been used to resequence the genome of the M13 virus. This review discusses several single-molecule sequencing technologies that are expected to become available during the next few years and explains how they might impact on genomics research. PMID:18722683

  8. Whole-genome sequencing in outbreak analysis.

    PubMed

    Gilchrist, Carol A; Turner, Stephen D; Riley, Margaret F; Petri, William A; Hewlett, Erik L

    2015-07-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  9. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Microsoft Academic Search

    Tina T. Hu; Pedro Pattyn; Erica G. Bakker; Jun Cao; Jan-Fang Cheng; Richard M. Clark; Noah Fahlgren; Jeffrey A. Fawcett; Jane Grimwood; Heidrun Gundlach; Georg Haberer; Jesse D. Hollister; Stephan Ossowski; Robert P. Ottilar; Asaf A. Salamov; Korbinian Schneeberger; Manuel Spannagl; Xi Wang; Liang Yang; Mikhail E. Nasrallah; Joy Bergelson; James C. Carrington; Brandon S. Gaut; Jeremy Schmutz; Klaus F. X. Mayer; Yves Van de Peer; Igor V. Grigoriev; Magnus Nordborg; Detlef Weigel; Ya-Long Guo

    2011-01-01

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how

  10. Standards for Sequencing Viral Genomes in the Era of High-Throughput Sequencing

    PubMed Central

    Beitzel, Brett; Chain, Patrick S. G.; Davenport, Matthew G.; Donaldson, Eric; Frieman, Matthew; Kugelman, Jeffrey; Kuhn, Jens H.; O’Rear, Jules; Sabeti, Pardis C.; Wentworth, David E.; Wiley, Michael R.; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher

    2014-01-01

    ABSTRACT Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  11. Standards for sequencing viral genomes in the era of high-throughput sequencing.

    PubMed

    Ladner, Jason T; Beitzel, Brett; Chain, Patrick S G; Davenport, Matthew G; Donaldson, Eric F; Frieman, Matthew; Kugelman, Jeffrey R; Kuhn, Jens H; O'Rear, Jules; Sabeti, Pardis C; Wentworth, David E; Wiley, Michael R; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher; Palacios, Gustavo

    2014-01-01

    Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

  12. Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

    SciTech Connect

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  13. Emerging Knowledge from Genome Sequencing of Crop Species

    Microsoft Academic Search

    Delfina Barabaschi; Davide Guerra; Katia Lacrima; Paolo Laino; Vania Michelotti; Simona Urso; Giampiero Valè; Luigi Cattivelli

    Extensive insights into genome composition, organization, and evolution have been gained from the ongoing plant genome sequencing and annotation projects. The analysis of crop genomes provided surprising evidence with important implications for plant origin and evolution: genome duplication, ancestral re-arrangements and unexpected polyploidization events opened new doors to address fundamental questions related to species proliferation, adaptation, and functional modulations.

  14. Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus

    PubMed Central

    Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

    2014-01-01

    We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

  15. Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo electric organ

    E-print Network

    Vertes, Akos

    Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo discovery Open reading frame (ORF) Uncharacterized open reading frames (ORFs) in human genomic sequence Elsevier B.V. All rights reserved. 1. Introduction The availability of complete genomic sequences

  16. Basics of Genome Sequence Analysis in Bioinformatics -- its Fundamental Ideas and Problems

    Microsoft Academic Search

    Tomonori Suzuki; Satoru Miyazaki

    2009-01-01

    Genome sequences are among the most fundamental data in the various omics analyses. So far, basic bioinformatics tools have been developed to handle genome sequences. The first step of genome sequence analysis is to predict or assign

  17. Community-wide analysis of microbial genome sequence signatures

    PubMed Central

    Dick, Gregory J; Andersson, Anders F; Baker, Brett J; Simmons, Sheri L; Thomas, Brian C; Yelton, A Pepper; Banfield, Jillian F

    2009-01-01

    Background Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them. Results We analyzed metagenomic sequence data from two acidophilic biofilm communities, including composite genomes reconstructed for nine archaea, three bacteria, and numerous associated viruses, as well as thousands of unassigned fragments from strain variants and low-abundance organisms. Genome signatures, in the form of tetranucleotide frequencies analyzed by emergent self-organizing maps, segregated sequences from all known populations sharing < 50 to 60% average amino acid identity and revealed previously unknown genomic clusters corresponding to low-abundance organisms and a putative plasmid. Signatures were pervasive genome-wide. Clusters were resolved because intra-genome differences resulting from translational selection or protein adaptation to the intracellular (pH ~5) versus extracellular (pH ~1) environment were small relative to inter-genome differences. We found that these genome signatures stem from multiple influences but are primarily manifested through codon composition, which we propose is the result of genome-specific mutational biases. Conclusions An important conclusion is that shared environmental pressures and interactions among coevolving organisms do not obscure genome signatures in acid mine drainage communities. Thus, genome signatures can be used to assign sequence fragments to populations, an essential prerequisite if metagenomics is to provide ecological and biochemical insights into the functioning of microbial communities. PMID:19698104
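
    The tetranucleotide-frequency signature mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `tetranucleotide_frequencies` and the toy input are assumptions made for illustration, windows containing ambiguous bases are simply skipped, and the emergent self-organizing map step is not shown.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(sequence, k=4):
    """Return the relative frequency of every k-mer over A/C/G/T (k=4 by default).

    Hypothetical helper for illustration; windows containing characters other
    than A, C, G or T (e.g. N) are ignored.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    # Slide a window of width k along the sequence and count each k-mer.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Report all 4**k possible k-mers so that vectors from different
    # fragments have the same length and ordering.
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


# Toy fragment, not real data.
freqs = tetranucleotide_frequencies("ACGTACGTACGT")
```

    Each fragment is reduced to a fixed-length vector of 256 frequencies, which is the kind of input a clustering method could then compare across fragments.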

  18. Multiple alignment of genomic sequences using CHAOS, DIALIGN and ABC

    Microsoft Academic Search

    Dirk Pöhler; Nadine Werner; Rasmus Steinkamp; Burkhard Morgenstern

    2005-01-01

    Comparative analysis of genomic sequences is a powerful approach to discover functional sites in these sequences. Herein, we present a WWW-based software system for multiple alignment of genomic sequences. We use the local alignment tool CHAOS to rapidly identify chains of pairwise similarities. These similarities are used as anchor points to speed up the DIALIGN multiple-alignment program. Finally, the visualization tool ABC is used for interactive graphical

  19. Strain-specific and pooled genome sequences for populations of Drosophila melanogaster from three continents.

    PubMed

    Bergman, Casey M; Haddrill, Penelope R

    2015-01-01

    To contribute to our general understanding of the evolutionary forces that shape variation in genome sequences in nature, we have sequenced genomes from 50 isofemale lines and six pooled samples from populations of Drosophila melanogaster on three continents. Analysis of raw and reference-mapped reads indicates the quality of these genomic sequence data is very high. Comparison of the predicted and experimentally-determined Wolbachia infection status of these samples suggests that strain or sample swaps are unlikely to have occurred in the generation of these data. Genome sequences are freely available in the European Nucleotide Archive under accession ERP009059. Isofemale lines can be obtained from the Drosophila Species Stock Center. PMID:25717372

  20. Identification of Optimum Sequencing Depth Especially for De Novo Genome Assembly of Small Genomes Using Next Generation Sequencing Data

    PubMed Central

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms, however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources. PMID:23593174
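
    To make the depth figures in the abstract above concrete, here is a small sketch of the usual depth relation, depth = (number of reads × read length) / genome size, solved for the number of reads. The function name `reads_for_depth` and the 100 bp read length are assumptions for illustration (the abstract does not state a read length), and the genome sizes quoted as "MB" are treated as megabases.

```python
def reads_for_depth(genome_size_bp, target_depth, read_length_bp):
    """Number of reads needed so that reads * read_length / genome_size >= target_depth.

    Hypothetical helper; uses integer ceiling division to avoid rounding issues.
    """
    total_bases_needed = genome_size_bp * target_depth
    return (total_bases_needed + read_length_bp - 1) // read_length_bp


# Genome sizes as quoted in the abstract, interpreted as base pairs.
GENOME_SIZES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}
READ_LENGTH_BP = 100  # assumption: not given in the abstract

for organism, size_bp in GENOME_SIZES_BP.items():
    n_reads = reads_for_depth(size_bp, target_depth=50, read_length_bp=READ_LENGTH_BP)
    print(f"{organism}: {n_reads:,} reads of {READ_LENGTH_BP} bp for 50X depth")
```

    Under these assumptions the required read count scales linearly with genome size and target depth, which is why the choice of depth matters for multiplexing decisions.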

  1. Data structures and compression algorithms for genomic sequence data

    PubMed Central

    Brandon, Marty C.; Wallace, Douglas C.; Baldi, Pierre

    2009-01-01

    Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data. Results: The general idea is to encode only the differences between a genome sequence and a reference sequence, using absolute or relative coordinates for the location of the differences. These locations and the corresponding differential variants can be encoded into binary strings using various entropy coding methods, from fixed codes such as Golomb and Elias codes, to variable codes, such as Huffman codes. We demonstrate the approach and various tradeoffs using highly variable human mitochondrial genome sequences as a testbed. With only a partial level of optimization, 3615 genome sequences occupying 56 MB in GenBank are compressed down to only 167 KB, achieving a 345-fold compression rate, using the revised Cambridge Reference Sequence as the reference sequence. Using the consensus sequence as the reference sequence, the data can be stored using only 133 KB, corresponding to a 433-fold level of compression, roughly a 23% improvement. Extensions to nuclear genomes and high-throughput sequencing data are discussed. Availability: Data are publicly available from GenBank, the HapMap web site, and the MITOMAP database. Supplementary materials with additional results, statistics, and software implementations are available from http://mammag.web.uci.edu/bin/view/Mitowiki/ProjectDNACompression. Contact: pfbaldi@ics.uci.edu PMID:19447783
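
    The core idea in the abstract above, storing only the differences from a reference at absolute coordinates, can be sketched as follows. This is a simplified illustration and not the authors' software: the function names are hypothetical, only substitutions between equal-length sequences are handled (no insertions or deletions), and the entropy-coding stage (Golomb, Elias, Huffman) is omitted.

```python
def encode_differences(reference, sequence):
    """Return (position, base) pairs where `sequence` differs from `reference`.

    Simplifying assumption: both strings have the same length, so only
    substitutions need to be recorded, at absolute 0-based coordinates.
    """
    if len(reference) != len(sequence):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        (pos, base)
        for pos, (ref_base, base) in enumerate(zip(reference, sequence))
        if ref_base != base
    ]


def decode_differences(reference, differences):
    """Rebuild the original sequence from the reference and the stored differences."""
    bases = list(reference)
    for pos, base in differences:
        bases[pos] = base
    return "".join(bases)


# Toy sequences, not real data.
reference = "ACGTACGT"
sample = "ACGAACGT"
diffs = encode_differences(reference, sample)
restored = decode_differences(reference, diffs)
```

    When sequences are highly similar to the reference, the difference list is much shorter than the sequence itself, which is what makes a subsequent entropy-coding step effective.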

  2. The Brachypodium genome sequence: a resource for oat genomics research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

  3. Reference genome sequence of the model plant Setaria

    SciTech Connect

    Bennetzen, Jeffrey L [ORNL; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Tuskan, Gerald A [ORNL

    2012-01-01

    We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

  4. Complete genome sequence of Gordonia bronchialis type strain (3410T)

    SciTech Connect

    Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Brettin, Thomas S [ORNL; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  5. Community-wide analysis of microbial genome sequence signatures

    Microsoft Academic Search

    Gregory J Dick; Anders F Andersson; Brett J Baker; Sheri L Simmons; Brian C Thomas; A Pepper Yelton; Jillian F Banfield

    2009-01-01

    Background: Analyses of DNA sequences from cultivated microorganisms have revealed genome-wide, taxa-specific nucleotide compositional characteristics, referred to as genome signatures. These signatures have far-reaching implications for understanding genome evolution and potential application in classification of metagenomic sequence fragments. However, little is known regarding the distribution of genome signatures in natural microbial communities or the extent to which environmental factors shape them.

  6. Complete genome sequence of Spirosoma linguale type strain (1T)

    SciTech Connect

    Lail, Kathleen [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Schutze, Andrea [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chen, Feng [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Spirosoma linguale Migula 1894 is the type species of the genus. S. linguale is a free-living and non-pathogenic organism, known for its peculiar ringlike and horseshoe-shaped cell morphology. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the third completed genome sequence of a member of the family Cytophagaceae. The 8,491,258 bp long genome with its eight plasmids, 7,069 protein-coding and 60 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  7. STATUS OF THE RB51 GENOME SEQUENCING PROJECT

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The shotgun sequencing of the B. abortus vaccine strain, RB51 genome is nearly complete. Thus far, approximately 49,000 recombinant clones have been sequenced, generating approximately 34,300,000-bp of raw DNA sequence data. The resulting data has been compiled and aligned using the B. abortus st...

  8. Discrete-Length Repeated Sequences in Eukaryotic Genomes

    Microsoft Academic Search

    William R. Pearson; John F. Morrow

    1981-01-01

    Two of the four repeated DNA sequences near the 5' end of the silk fibroin gene hybridize with discrete-length families of repeated DNA. These two families comprise 0.5% of the animal's genome. A repeated sequence with a conserved length has also been found in the short class of moderately repeated sequences in the sea urchin. The discrete length, interspersion, and

  9. prot4EST: Translating Expressed Sequence Tags from neglected genomes

    Microsoft Academic Search

    James D Wasmuth; Mark L Blaxter

    2004-01-01

    Background: The genomes of an increasing number of species are being investigated through generation of expressed sequence tags (ESTs). However, ESTs are prone to sequencing errors and typically define incomplete transcripts, making downstream annotation difficult. Annotation would be greatly improved with robust polypeptide translations. Many current solutions for EST translation require a large number of full-length gene sequences for training

  10. Lacunarity Analysis of Genomic Sequences: A Potential Bio-Sequence Analysis Method

    Microsoft Academic Search

    Gopakumar G; Achuthsankar S. Nair

    2011-01-01

    This paper proposes the use of lacunarity analysis of genomic sequences as a potential bio-sequence analysis method. In the present work the fractal property of DNA sequences is confirmed using the lacunarity analysis of their Chaos Game Representation matrices. In another study, the distribution of various n-mers in a genomic sequence is investigated based on the lacunarity analysis of one-dimensional

  11. Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens

    PubMed Central

    Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

  12. Complete genome sequence of Ferroglobus placidus AEDII12DO

    PubMed Central

    Anderson, Iain; Risso, Carla; Holmes, Dawn; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Samuel; Saunders, Elizabeth; Brettin, Thomas; Detter, John C.; Han, Cliff; Tapia, Roxanne; Larimer, Frank; Land, Miriam; Hauser, Loren; Woyke, Tanja; Lovley, Derek; Kyrpides, Nikos; Ivanova, Natalia

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project. PMID:22180810

  13. Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome

    Microsoft Academic Search

    Andreia J Amaral; Hendrik-Jan Megens; Hindrik HD Kerstens; Henri CM Heuven; Bert Dibbits; Richard PMA Crooijmans; Johan T den Dunnen; Martien AM Groenen

    2009-01-01

    BACKGROUND: Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of

  14. Draft Genome Sequence of Stenotrophomonas maltophilia Strain UV74 Reveals Extensive Variability within Its Genomic Group

    PubMed Central

    Conchillo-Solé, Oscar; Yero, Daniel; Coves, Xavier; Huedo, Pol; Martínez-Servat, Sònia

    2015-01-01

    We report the draft genome sequence of Stenotrophomonas maltophilia UV74, isolated from a vascular ulcer. This draft genome sequence shall contribute to the understanding of the evolution and pathogenicity of this species, particularly regarding isolates of clinical origin. PMID:26067959

  15. A physical map of the papaya genome with integrated genetic map and genome sequence

    Microsoft Academic Search

    Qingyi Yu; Eric Tong; Rachel L Skelton; John E Bowers; Meghan R Jones; Jan E Murray; Shaobin Hou; Peizhu Guan; Ricelle A Acob; Ming-Cheng Luo; Paul H Moore; Maqsudul Alam; Andrew H Paterson; Ray Ming

    2009-01-01

    BACKGROUND: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we

  16. Draft Genome Sequence of Stenotrophomonas maltophilia Strain UV74 Reveals Extensive Variability within Its Genomic Group.

    PubMed

    Conchillo-Solé, Oscar; Yero, Daniel; Coves, Xavier; Huedo, Pol; Martínez-Servat, Sònia; Daura, Xavier; Gibert, Isidre

    2015-01-01

    We report the draft genome sequence of Stenotrophomonas maltophilia UV74, isolated from a vascular ulcer. This draft genome sequence shall contribute to the understanding of the evolution and pathogenicity of this species, particularly regarding isolates of clinical origin. PMID:26067959

  17. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    SciTech Connect

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  18. Accurate whole human genome sequencing using reversible terminator chemistry.

    PubMed

    Bentley, David R; Balasubramanian, Shankar; Swerdlow, Harold P; Smith, Geoffrey P; Milton, John; Brown, Clive G; Hall, Kevin P; Evers, Dirk J; Barnes, Colin L; Bignell, Helen R; Boutell, Jonathan M; Bryant, Jason; Carter, Richard J; Keira Cheetham, R; Cox, Anthony J; Ellis, Darren J; Flatbush, Michael R; Gormley, Niall A; Humphray, Sean J; Irving, Leslie J; Karbelashvili, Mirian S; Kirk, Scott M; Li, Heng; Liu, Xiaohai; Maisinger, Klaus S; Murray, Lisa J; Obradovic, Bojan; Ost, Tobias; Parkinson, Michael L; Pratt, Mark R; Rasolonjatovo, Isabelle M J; Reed, Mark T; Rigatti, Roberto; Rodighiero, Chiara; Ross, Mark T; Sabot, Andrea; Sankar, Subramanian V; Scally, Aylwyn; Schroth, Gary P; Smith, Mark E; Smith, Vincent P; Spiridou, Anastassia; Torrance, Peta E; Tzonev, Svilen S; Vermaas, Eric H; Walter, Klaudia; Wu, Xiaolin; Zhang, Lu; Alam, Mohammed D; Anastasi, Carole; Aniebo, Ify C; Bailey, David M D; Bancarz, Iain R; Banerjee, Saibal; Barbour, Selena G; Baybayan, Primo A; Benoit, Vincent A; Benson, Kevin F; Bevis, Claire; Black, Phillip J; Boodhun, Asha; Brennan, Joe S; Bridgham, John A; Brown, Rob C; Brown, Andrew A; Buermann, Dale H; Bundu, Abass A; Burrows, James C; Carter, Nigel P; Castillo, Nestor; Chiara E Catenazzi, Maria; Chang, Simon; Neil Cooley, R; Crake, Natasha R; Dada, Olubunmi O; Diakoumakos, Konstantinos D; Dominguez-Fernandez, Belen; Earnshaw, David J; Egbujor, Ugonna C; Elmore, David W; Etchin, Sergey S; Ewan, Mark R; Fedurco, Milan; Fraser, Louise J; Fuentes Fajardo, Karin V; Scott Furey, W; George, David; Gietzen, Kimberley J; Goddard, Colin P; Golda, George S; Granieri, Philip A; Green, David E; Gustafson, David L; Hansen, Nancy F; Harnish, Kevin; Haudenschild, Christian D; Heyer, Narinder I; Hims, Matthew M; Ho, Johnny T; Horgan, Adrian M; Hoschler, Katya; Hurwitz, Steve; Ivanov, Denis V; Johnson, Maria Q; James, Terena; Huw Jones, T A; Kang, Gyoung-Dong; Kerelska, Tzvetana H; Kersey, Alan D; Khrebtukova, Irina; Kindwall, Alex P; Kingsbury, 
Zoya; Kokko-Gonzales, Paula I; Kumar, Anil; Laurent, Marc A; Lawley, Cynthia T; Lee, Sarah E; Lee, Xavier; Liao, Arnold K; Loch, Jennifer A; Lok, Mitch; Luo, Shujun; Mammen, Radhika M; Martin, John W; McCauley, Patrick G; McNitt, Paul; Mehta, Parul; Moon, Keith W; Mullens, Joe W; Newington, Taksina; Ning, Zemin; Ling Ng, Bee; Novo, Sonia M; O'Neill, Michael J; Osborne, Mark A; Osnowski, Andrew; Ostadan, Omead; Paraschos, Lambros L; Pickering, Lea; Pike, Andrew C; Pike, Alger C; Chris Pinkard, D; Pliskin, Daniel P; Podhasky, Joe; Quijano, Victor J; Raczy, Come; Rae, Vicki H; Rawlings, Stephen R; Chiva Rodriguez, Ana; Roe, Phyllida M; Rogers, John; Rogert Bacigalupo, Maria C; Romanov, Nikolai; Romieu, Anthony; Roth, Rithy K; Rourke, Natalie J; Ruediger, Silke T; Rusman, Eli; Sanches-Kuiper, Raquel M; Schenker, Martin R; Seoane, Josefina M; Shaw, Richard J; Shiver, Mitch K; Short, Steven W; Sizto, Ning L; Sluis, Johannes P; Smith, Melanie A; Ernest Sohna Sohna, Jean; Spence, Eric J; Stevens, Kim; Sutton, Neil; Szajkowski, Lukasz; Tregidgo, Carolyn L; Turcatti, Gerardo; Vandevondele, Stephanie; Verhovsky, Yuli; Virk, Selene M; Wakelin, Suzanne; Walcott, Gregory C; Wang, Jingwen; Worsley, Graham J; Yan, Juying; Yau, Ling; Zuerlein, Mike; Rogers, Jane; Mullikin, James C; Hurles, Matthew E; McCooke, Nick J; West, John S; Oaks, Frank L; Lundberg, Peter L; Klenerman, David; Durbin, Richard; Smith, Anthony J

    2008-11-01

    DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications. PMID:18987734
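
    The ">30x average depth of paired 35-base reads" figure in the record above follows the usual relation: average depth = (number of reads x read length) / genome size. The following is a minimal sketch of that arithmetic only; the function names and the 3.1 Gb genome size are illustrative assumptions and are not taken from the paper.

```python
def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected average depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_depth: int, read_length: int, genome_size: int) -> int:
    """Smallest read count whose total bases reach target_depth x genome_size."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    total_bases = target_depth * genome_size
    return -(-total_bases // read_length)  # integer ceiling division


if __name__ == "__main__":
    # Assumption for illustration: a genome of roughly 3.1 billion bases.
    ASSUMED_GENOME_SIZE = 3_100_000_000
    print(reads_needed(30, 35, ASSUMED_GENOME_SIZE))
```

    The estimate ignores unmapped, duplicate and low-quality reads, so a real experiment would need more reads than this lower bound suggests.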

  19. Complete Genome Sequence of Staphylococcus aureus Tager 104, a Sequence Type 49 Ancestor

    PubMed Central

    Davis, Richard; Hossain, Mohammad J.; Liles, Mark R.

    2013-01-01

    We report here the complete genome sequence of Staphylococcus aureus Tager 104, originally isolated from a cutaneous abscess in 1947 by Morris Tager. Sequence typing of the strain revealed its membership in sequence type 49 (ST49), a previously unknown multilocus sequence type (MLST) in clinical samples. PMID:24029757

  20. Assembly of large genomes using second-generation sequencing

    PubMed Central

    Schatz, Michael C.; Delcher, Arthur L.; Salzberg, Steven L.

    2010-01-01

    Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we now are seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-gen sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly. PMID:20508146

  1. Draft Genome Sequences of 24 Microbial Strains Assembled from Direct Sequencing from 4 Stool Samples

    PubMed Central

    Hernández, Álvaro; White, Bryan A.; O’Brien, Daniel; Ahlquist, David; Boardman, Lisa

    2015-01-01

    The ability to assemble genomes from metagenomic sequencing avoids the need for culture and any associated culture biases. We assembled 24 essentially complete draft genomes from metagenomic paired-end and size-selected mate pair sequencing from 4 stool samples, 2 from subjects diagnosed with colorectal cancer and 2 from healthy controls. PMID:26021920

  2. Draft genome sequences of 24 microbial strains assembled from direct sequencing from 4 stool samples.

    PubMed

    Jeraldo, Patricio; Hernández, Álvaro; White, Bryan A; O'Brien, Daniel; Ahlquist, David; Boardman, Lisa; Chia, Nicholas

    2015-01-01

    The ability to assemble genomes from metagenomic sequencing avoids the need for culture and any associated culture biases. We assembled 24 essentially complete draft genomes from metagenomic paired-end and size-selected mate pair sequencing from 4 stool samples, 2 from subjects diagnosed with colorectal cancer and 2 from healthy controls. PMID:26021920

  3. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    Microsoft Academic Search

    Hervé Tettelin; Vega Masignani; Michael J. Cieslewicz; Jonathan A. Eisen; Scott Peterson; Michael R. Wessels; Ian T. Paulsen; Karen E. Nelson; Immaculada Margarit; Timothy D. Read; Lawrence C. Madoff; Alex M. Wolf; Maureen J. Beanan; Lauren M. Brinkac; Sean C. Daugherty; Robert T. Deboy; A. Scott Durkin; James F. Kolonay; Ramana Madupu; Matthew R. Lewis; Diana Radune; Nadezhda B. Fedorova; David Scanlan; Hoda Khouri; Stephanie Mulligan; Heather A. Carty; Robin T. Cline; Susan E. van Aken; John Gill; Maria Scarselli; Marirosa Mora; Emilia T. Iacobini; Cecilia Brettoni; Giuliano Galli; Massimo Mariani; Filippo Vegni; Domenico Maione; Daniela Rinaudo; Rino Rappuoli; John L. Telford; Dennis L. Kasper; Guido Grandi; Claire M. Fraser

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined

  4. Genome sequencing and analysis of the model grass Brachypodium distachyon

    SciTech Connect

    Yang, Xiaohan [ORNL]; Kalluri, Udaya C [ORNL]; Tuskan, Gerald A [ORNL]

    2010-01-01

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

  5. Complete genome sequence of Cellulomonas flavigena type strain (134T)

    SciTech Connect

    Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Foster, Brian [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. 
Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

    2010-01-01

    Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Genome sequencing and analysis of the model grass Brachypodium distachyon.

    PubMed

    2010-02-11

    Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

  7. The Release 6 reference sequence of the Drosophila melanogaster genome

    PubMed Central

    Carlson, Joseph W.; Wan, Kenneth H.; Park, Soo; Mendez, Ivonne; Galle, Samuel E.; Booth, Benjamin W.; Pfeiffer, Barret D.; George, Reed A.; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V.; Andreyeva, Evgeniya N.; Boldyreva, Lidiya V.; Marra, Marco; Carvalho, A. Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F.; Rubin, Gerald M.; Karpen, Gary H.

    2015-01-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

  8. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    PubMed

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
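
    The record above reports an "N50 length" for the assembly. Under the common definition, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The sketch below illustrates that definition only; the function name and the toy lengths are assumptions for illustration and do not come from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to
        # at least half of the total assembled length.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # hypothetical lengths, total 20
    print(n50(toy_contigs))
```

    Because N50 depends only on the length distribution, it says nothing about correctness of the assembly; it is best read alongside the total assembled length reported in the same record.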

  9. Draft Genome Sequence of Pseudomonas syringae pv. persicae NCPPB 2254.

    PubMed

    Zhao, Wenjun; Jiang, Hongshan; Tian, Qian; Hu, Jie

    2015-01-01

    Pseudomonas syringae pv. persicae is a pathogen that causes bacterial decline of stone fruit. Here, we report the draft genome sequence for P. syringae pv. persicae, which was isolated from Prunus persica. PMID:26044420

  10. Complete Genome Sequence of Staphylococcus aureus Phage GRCS.

    PubMed

    Swift, Steven M; Nelson, Daniel C

    2014-01-01

    The Staphylococcus aureus phage GRCS was isolated from a sewage treatment facility in India and has shown potential for phage therapy in a mouse model of bacteremia. Here, we report the complete genome sequence of this bacteriophage. PMID:24723702

  11. Complete genome sequence of Rahnella aquatilis CIP 78.65.

    PubMed

    Martinez, Robert J; Bruce, David; Detter, Chris; Goodwin, Lynne A; Han, James; Han, Cliff S; Held, Brittany; Land, Miriam L; Mikhailova, Natalia; Nolan, Matt; Pennacchio, Len; Pitluck, Sam; Tapia, Roxanne; Woyke, Tanja; Sobecky, Patricia A

    2012-06-01

    Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis. PMID:22582378

  12. Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050

    PubMed Central

    Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

    2014-01-01

    Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

  13. Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050.

    PubMed

    Dabrazhynetskaya, Alena; Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

    2014-01-01

    Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

  14. Initial genome sequencing and analysis of multiple myeloma

    E-print Network

    Lander, Eric S.

    Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

  15. Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano

    PubMed Central

    Alenezi, Faizah N.; Weitz, Hedda J.; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

    2015-01-01

    Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

  16. Genome sequence of the fish pathogen Flavobacterium columnare ATCC 49512

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Flavobacterium columnare is a Gram-negative, rod shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. ...

  17. Fulfilling the Promise of a Sequenced Human Genome – Part I

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

  18. Fulfilling the Promise of a Sequenced Human Genome – Part II

    SciTech Connect

    Green, Eric [National Human Genome Research Institute]

    2009-05-27

    Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

  19. Draft Genome Sequence of Pseudomonas syringae pv. persicae NCPPB 2254

    PubMed Central

    Zhao, Wenjun; Tian, Qian; Hu, Jie

    2015-01-01

    Pseudomonas syringae pv. persicae is a pathogen that causes bacterial decline of stone fruit. Here, we report the draft genome sequence for P. syringae pv. persicae, which was isolated from Prunus persica. PMID:26044420

  20. Compressing Genomic Sequence Fragments Using SlimGene

    NASA Astrophysics Data System (ADS)

    Kozanitis, Christos; Saunders, Chris; Kruglyak, Semyon; Bafna, Vineet; Varghese, George

    With the advent of next generation sequencing technologies, the cost of sequencing whole genomes is poised to go below $1000 per human individual in a few years. As more and more genomes are sequenced, analysis methods are undergoing rapid development, making it tempting to store sequencing data for long periods of time so that the data can be re-analyzed with the latest techniques. The challenging open research problems, huge influx of data, and rapidly improving analysis techniques have created the need to store and transfer very large volumes of data.

  1. Complete genome sequence of a novel vitivirus isolated from grapevine.

    PubMed

    Al Rwahnih, Maher; Sudarshana, Mysore R; Uyemoto, Jerry K; Rowhani, Adib

    2012-09-01

    A novel virus-like sequence from grapevine was identified by Illumina sequencing. The complete genome is 7,551 nucleotides in length, with polyadenylation at the 3' end. Translation of the sequence revealed five open reading frames (ORFs). The genomic organization was most similar to those of vitiviruses. The polymerase (ORF1) and coat protein (ORF4) genes shared 31 to 49% nucleotide and 40 to 70% amino acid sequence identities, respectively, with other grapevine vitiviruses. The virus was tentatively named grapevine virus F (GVF). PMID:22879616

  2. THE RICE GENOME: The Cereal of the World's Poor Takes Center Stage

    NSDL National Science Digital Library

    Ronald P. Cantrell (International Rice Research Institute (IRRI))

    2002-04-05

    Access to the article is free; however, registration and sign-in are required. The milestone publication of not one, but two, draft genome sequences of rice (Oryza sativa) brought the cereal crop of the world's poor to center stage. In their Perspectives, Cantrell and Reeves discuss the potential impacts of these sequences for humankind from the standpoints of food security and combating malnutrition.

  3. A compressing method for genome sequence cluster using sequence alignment

    Microsoft Academic Search

    Kwang Su Jung; Nam Hee Yu; Seung Jung Shin; Keun Ho Ryu

    2008-01-01

    After identifying the function of a protein, biologists produce new useful proteins by substituting some residues of the identified protein. These new proteins have high sequence homology (similarity). We define a sequence cluster as a cluster that is constituted of similar sequences. As another example of a sequence cluster, we consider a SNP (single nucleotide polymorphism) cluster. A SNP is

  4. Complete Genome Sequences of Helicobacter pylori Rifampin-Resistant Strains.

    PubMed

    Momynaliev, Kuvat; Chelysheva, Vera; Selezneva, Oksana; Akopian, Tatyana; Alexeev, Dmitry; Govorun, Vadim

    2013-01-01

    Here we present the complete genome sequences of two Helicobacter pylori rifampin-resistant (Rif(r)) strains (Rif1 and Rif2). Rif(r) strains were obtained by in vitro selection of H. pylori 26695 on agar plates with 20 µg/ml rifampin. The genome data provide insights on the genomic diversity of H. pylori under selection by rifampin. PMID:23833139

  5. Intra-species sequence comparisons for annotating genomes

    SciTech Connect

    Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

    2004-07-15

    Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom, and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study have been submitted to GenBank under accession nos. AY667278-AY667407.

  6. Complete Genome Sequence of Mycoplasma synoviae Strain WVU 1853T.

    PubMed

    May, Meghan A; Kutish, Gerald F; Barbet, Anthony F; Michaels, Dina L; Brown, Daniel R

    2015-01-01

    A hybrid sequence assembly of the complete Mycoplasma synoviae type strain WVU 1853(T) genome was compared to that of strain MS53. The findings support prior conclusions about M. synoviae, based on the genome of that otherwise uncharacterized field strain, and provide the first evidence of epigenetic modifications in M. synoviae. PMID:26021934

  7. Draft Genome Sequence of Rhodococcus sp. Strain 311R

    PubMed Central

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H.

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  8. Genome sequence of the cultivated cotton Gossypium arboreum

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

  9. MAIZE CHLOROTIC DWARF VIRUS GENOME SEQUENCE AND POLYPROTEIN CLEAVAGE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genomic sequence (11.8 kb) of the severe Ohio Maize chlorotic dwarf virus isolate (MCDV-S, genus Waikavirus) was determined from overlapping cDNA clones. An approximately 400 kDa polyprotein encoded by the viral genome is post-translationally cleaved into several smaller functional proteins. Wher...

  10. Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01

    PubMed Central

    Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

    2014-01-01

    In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%. PMID:24970828

  11. Complete Genome Sequence of Antarctic Bacterium Psychrobacter sp. Strain G

    PubMed Central

    Che, Shuai; Song, Lai; Song, Weizhi; Yang, Meng

    2013-01-01

    Here, we report the complete genome sequence of Psychrobacter sp. strain G, isolated from King George Island, Antarctica, which can produce lipolytic enzymes at low temperatures. The genomics information of this strain will facilitate the study of the physiology, cold adaptation properties, and evolution of this genus. PMID:24051316

  12. Response to ‘pervasive sequence patents cover the entire human genome’

    PubMed Central

    2014-01-01

    A response to ‘Pervasive sequence patents cover the entire human genome’ by J Rosenfeld and C Mason. Genome Med 2013, 5:27. See related Correspondence by Rosenfeld and Mason, http://genomemedicine.com/content/5/3/27 and related letter by Rosenfeld and Mason, http://genomemedicine.com/content/6/2/15 PMID:25031614

  13. Draft Genome Sequence of Pseudomonas sp. nov. H2.

    PubMed

    Loftie-Eaton, Wesley; Suzuki, Haruo; Bashford, Kelsie; Heuer, Holger; Stragier, Pieter; De Vos, Paul; Settles, Matthew L; Top, Eva M

    2015-01-01

    We report the draft genome sequence of Pseudomonas sp. nov. H2, isolated from creek sediment in Moscow, ID, USA. The strain is most closely related to Pseudomonas putida. However, it has a slightly smaller genome that appears to have been impacted by horizontal gene transfer and poorly maintains IncP-1 plasmids. PMID:25838493

  14. Complete Genome Sequence of Mycoplasma synoviae Strain WVU 1853T

    PubMed Central

    Kutish, Gerald F.; Barbet, Anthony F.; Michaels, Dina L.

    2015-01-01

    A hybrid sequence assembly of the complete Mycoplasma synoviae type strain WVU 1853T genome was compared to that of strain MS53. The findings support prior conclusions about M. synoviae, based on the genome of that otherwise uncharacterized field strain, and provide the first evidence of epigenetic modifications in M. synoviae. PMID:26021934

  15. Genome Sequence of a Salinibacterium sp. Isolated from Antarctic Soil

    PubMed Central

    Shin, Seung Chul; Kim, Su Jin; Ahn, Do Hwan; Lee, Jong Kyu; Lee, Hyoungseok; Lee, Jungeun; Hong, Soon Gyu; Lee, Yung Mi

    2012-01-01

    The draft genome of Salinibacterium sp. PAMC 21357, isolated from permafrost soil of Antarctica, was determined. Here we present a 3.1-Mb draft genome sequence of Salinibacterium sp. that could provide further insight into the genetic determination of its cold-adaptive properties. PMID:22493208

  16. Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium

    PubMed Central

    Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

    2014-01-01

    High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

  17. Alfresco---A Workbench for Comparative Genomic Sequence Analysis

    Microsoft Academic Search

    Niclas Jareborg; Richard Durbin

    2000-01-01

    Comparative analysis of genomic sequences provides a powerful tool for identifying regions of potential biologic function; by comparing corresponding regions of genomes from suitable species, protein coding or regulatory regions can be identified by their homology. This requires the use of several specific types of computational analysis tools. Many programs exist for these types of analysis; not many exist for

  18. Complete Genome Sequence of Lactococcus lactis subsp. cremoris A76

    PubMed Central

    Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

    2012-01-01

    We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process. PMID:22328746

  19. Complete genome sequence of Lactococcus lactis subsp. cremoris A76.

    PubMed

    Bolotin, Alexander; Quinquis, Benoit; Ehrlich, Stanislas Dusko; Sorokin, Alexei

    2012-03-01

    We report the complete genome sequence of Lactococcus lactis subsp. cremoris A76, a dairy strain isolated from a cheese production outfit. Genome analysis detected two contiguous islands fitting to the L. lactis subsp. lactis rather than to the L. lactis subsp. cremoris lineage. This indicates the existence of genetic exchange between the diverse subspecies, presumably related to the technological process. PMID:22328746

  20. Complete genome sequence of Pronghorn Virus, a Pestivirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

  1. Complete genome sequence of pronghorn virus, a pestivirus.

    PubMed

    Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

    2014-01-01

    The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

  2. RESEARCH Open Access Genomic and small RNA sequencing of

    E-print Network

    Green, Pamela

    . Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane of sorghum as a reference genome sequence for Andropogoneae grasses Kankshita Swaminathan1,2 , Magdy origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome

  3. Draft Genome Sequence of Rhodococcus sp. Strain 311R.

    PubMed

    Ehsani, Elham; Jauregui, Ruy; Geffers, Robert; Jareck, Michael; Boon, Nico; Pieper, Dietmar H; Vilchez-Vargas, Ramiro

    2015-01-01

    Here, we report the draft genome sequence of Rhodococcus sp. strain 311R, which was isolated from a site contaminated with alkanes and aromatic compounds. Strain 311R shares 90% of the genome of Rhodococcus erythropolis SK121, which is the most closely related bacterium. PMID:25999565

  4. Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila

    Microsoft Academic Search

    Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

    2008-01-01

    The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

  5. Fractals related to long DNA sequences and complete genomes

    Microsoft Academic Search

    Bai-Lin Hao; H. C. Lee; Shu-Yu Zhang

    2000-01-01

    In visualizing very long DNA sequences, including the complete genomes of several bacteria, yeast and segments of human genes, we encounter fractal-like patterns underlying these biological objects of prominent importance. The method used here to visualize genomes of organisms may well be used as a convenient tool to trace, e.g., evolutionary relatedness of species. We describe the method and explain

  6. The Complete Genome Sequence of Mycoplasma bovis Strain Hubei-1

    Microsoft Academic Search

    Yuan Li; Huajun Zheng; Yang Liu; Yanwei Jiang; Jiuqing Xin; Wei Chen; Zhiqiang Song; Herman Tse

    2011-01-01

    Infection by Mycoplasma bovis (M. bovis) can induce diseases, such as pneumonia and otitis media in young calves and mastitis and arthritis in older animals. Here, we report the finished and annotated genome sequence of M. bovis strain Hubei-1, a strain isolated in 2008 that caused calf pneumonia on a Chinese farm. The genome of M. bovis strain Hubei-1 contains

  7. A snapshot of the emerging tomato genome sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

  8. Draft genome sequence of Thermincola potens strain JR

    SciTech Connect

    Byrne-Bailey, K.G.; Wrighton, K.C.; Melnyk, R.A.; Agbo, P.; Hazen, T.C.; Coates, J.D.

    2010-07-01

    'Thermincola potens' strain JR is one of the first Gram-positive dissimilatory metal-reducing bacteria (DMRB) for which there is a complete genome sequence. Consistent with the physiology of this organism, preliminary annotation revealed an abundance of multiheme c-type cytochromes that are putatively associated with the periplasm and cell surface in a Gram-positive bacterium. Here we report the complete genome sequence of strain JR.

  9. Sequence and Organization of the Neodiprion lecontei Nucleopolyhedrovirus Genome

    Microsoft Academic Search

    Hilary A. M. Lauzon; Christopher J. Lucarotti; Peter J. Krell; Qili Feng; Arthur Retnakaran; Basil M. Arif

    2004-01-01

    All fully sequenced baculovirus genomes, with the exception of the dipteran Culex nigripalpus nucleopolyhedrovirus (CuniNPV), have previously been from Lepidoptera. This study reports the sequencing and characterization of a hymenopteran baculovirus, Neodiprion lecontei nucleopolyhedrovirus (NeleNPV), from the red-headed pine sawfly. NeleNPV has the smallest genome so far published (81,755 bp) and has a GC content of only

  10. The Genome Sequence of the SARS-Associated Coronavirus

    Microsoft Academic Search

    Marco A. Marra; Steven J. M. Jones; Caroline R. Astell; Robert A. Holt; Angela Brooks-Wilson; Yaron S. N. Butterfield; Jaswinder Khattra; Jennifer K. Asano; Sarah A. Barber; Susanna Y. Chan; Alison Cloutier; Shaun M. Coughlin; Doug Freeman; Noreen Girn; Obi L. Griffith; Stephen R. Leach; Michael Mayo; Helen McDonald; Stephen B. Montgomery; Pawan K. Pandoh; Anca S. Petrescu; A. Gordon Robertson; Jacqueline E. Schein; Asim Siddiqui; Duane E. Smailus; Jeff M. Stott; George S. Yang; Francis Plummer; Anton Andonov; Harvey Artsob; Nathalie Bastien; Kathy Bernard; Timothy F. Booth; Donnie Bowness; Michael Drebot; Lisa Fernando; Ramon Flick; Michael Garbutt; Allen Grolla; Heinz Feldmann; Adrienne Meyers; Amin Kabani; Yan Li; Susan Normand; Ute Stroher; Graham A. Tipples; Shaun Tyler; Robert Vogrig; Diane Ward; Robert C. Brunham; Mel Krajden; Martin Petric; Danuta M. Skowronski; Chris Upton; Rachel L. Roper

    2003-01-01

    We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously

  11. Draft genome sequence of Gluconobacter thailandicus NBRC 3257

    PubMed Central

    Matsutani, Minenosuke; Yakushi, Toshiharu

    2014-01-01

    Gluconobacter thailandicus strain NBRC 3257, isolated from downy cherry (Prunus tomentosa), is a strict aerobic rod-shaped Gram-negative bacterium. Here, we report the features of this organism, together with the draft genome sequence and annotation. The draft genome sequence is composed of 107 contigs for 3,446,046 bp with 56.17% G+C content and contains 3,360 protein-coding genes and 54 RNA genes. PMID:25197448

  12. Genome sequence of the biocontrol strain Pseudomonas fluorescens F113.

    PubMed

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A; Giddens, Stephen R; Coppoolse, Eric R; Muriel, Candela; Stiekema, Willem J; Rainey, Paul B; Dowling, David; O'Gara, Fergal; Martín, Marta; Rivilla, Rafael

    2012-03-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  13. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    PubMed Central

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  14. Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii

    Microsoft Academic Search

    Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

    1996-01-01

    The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

  15. Comparative Genome Analysis at the Sequence Level in the Brassicaceae

    Microsoft Academic Search

    Chris Town; Renate Schmidt; Ian Bancroft

    In the world of plant genome sequencing, the cultivated Brassica species have been relatively under-resourced compared with other crop species largely due to their position in the economic hierarchy of perceived importance. Thus, with the completion of the Arabidopsis thaliana genome in the year 2000, the limited sequencing efforts undertaken in the Brassica crops and other species of the Brassicaceae

  16. Complete chloroplast genome sequences of Solanum bulbocastanum , Solanum lycopersicum and comparative analyses with other Solanaceae genomes

    Microsoft Academic Search

    Henry Daniell; Seung-Bum Lee; Justin Grevich; Christopher Saski; Tania Quesada-Vargas; Chittibabu Guda; Jeffrey Tomkins; Robert K. Jansen

    2006-01-01

    Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and

  17. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  18. Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani

    Microsoft Academic Search

    Holger Brüggemann; Gerhard Gottschalk

    2004-01-01

    The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

  19. Genome sequence of the date palm Phoenix dactylifera L

    PubMed Central

    Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

    2013-01-01

    Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

  20. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
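    The alignment-free idea in the abstract above (index the fragments that occur exactly once in the genome, then place a query by looking up those fragments) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `unique_kmer_index` and `map_sequence` are hypothetical, only the forward strand is considered, and the whole index is held in memory, which would not scale to a human genome as written.

```python
from collections import defaultdict
from typing import Dict, Optional

K = 15  # fragment length reported in the abstract


def unique_kmer_index(genome: str, k: int = K) -> Dict[str, int]:
    """Map every k-mer that occurs exactly once in `genome` to its 0-based start."""
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)
    # Keep only k-mers seen at a single position (the "unique" fragments).
    return {kmer: pos[0] for kmer, pos in positions.items() if len(pos) == 1}


def map_sequence(seq: str, index: Dict[str, int], k: int = K) -> Optional[int]:
    """Return the inferred genome start of `seq`, or None if no unique k-mer matches."""
    for offset in range(len(seq) - k + 1):
        hit = index.get(seq[offset:offset + k])
        if hit is not None:
            # The unique k-mer sits `offset` bases into the query.
            return hit - offset
    return None


if __name__ == "__main__":
    # Toy genome with k=4 so the example stays readable; "ACGT" occurs twice.
    toy_index = unique_kmer_index("ACGTACGTTTGCA", k=4)
    print(map_sequence("ACGTT", toy_index, k=4))
```

    Because a lookup is a dictionary access rather than an alignment, the cost per query grows with query length rather than genome length, which is the property the abstract attributes to the method.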

  1. Characterizing the walnut genome through analyses of BAC end sequences.

    PubMed

    Wu, Jiajie; Gu, Yong Q; Hu, Yuqin; You, Frank M; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Dvorak, Jan; Luo, Ming-Cheng

    2012-01-01

    Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSR) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSR in the walnut genome analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome represents a coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut. PMID:22101470

  2. Genomic Sequence around Butterfly Wing Development Genes: Annotation and Comparative Analysis

    Microsoft Academic Search

    Inês C. Conceição; Anthony D. Long; Jonathan D. Gruber; Patrícia Beldade

    2011-01-01

    Background: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available

  3. Choosing a benchtop sequencing machine to characterise Helicobacter pylori genomes.

    PubMed

    Perkins, Timothy T; Tay, Chin Yen; Thirriot, Fanny; Marshall, Barry

    2013-01-01

    The fully annotated genome sequence of the European strain, 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publically available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high-error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology, however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori. PMID:23840736

  4. Choosing a Benchtop Sequencing Machine to Characterise Helicobacter pylori Genomes

    PubMed Central

    Perkins, Timothy T.; Tay, Chin Yen; Thirriot, Fanny; Marshall, Barry

    2013-01-01

    The fully annotated genome sequence of the European strain, 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publically available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high-error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology, however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori. PMID:23840736

  5. RESTseq – Efficient Benchtop Population Genomics with RESTriction Fragment SEQuencing

    PubMed Central

    Stolle, Eckart; Moritz, Robin F. A.

    2013-01-01

    We present RESTseq, an improved approach for a cost efficient, highly flexible and repeatable enrichment of DNA fragments from digested genomic DNA using Next Generation Sequencing platforms including small scale Personal Genome sequencers. Easy adjustments make it suitable for a wide range of studies requiring SNP detection or SNP genotyping from fine-scale linkage mapping to population genomics and population genetics also in non-model organisms. We demonstrate the validity of our approach by comparing two honeybee and several stingless bee samples. PMID:23691128

  6. Complete genome sequence of Serratia plymuthica strain AS12

    SciTech Connect

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  7. A standard variation file format for human genome sequences.

    PubMed

    Reese, Martin G; Moore, Barry; Batchelor, Colin; Salas, Fidel; Cunningham, Fiona; Marth, Gabor T; Stein, Lincoln; Flicek, Paul; Yandell, Mark; Eilbeck, Karen

    2010-01-01

    Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files, which uses Sequence Ontology to describe genome variation data. The 10Gen dataset, ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and from an Amazon elastic block storage (EBS) snapshot for use in Amazon's EC2 cloud computing environment. PMID:20796305
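
    Because GVF is described above as a tab-delimited extension of GFF3, a record can be read with ordinary string splitting. The following is a minimal sketch, not the authors' tooling: it assumes the nine standard GFF3 columns and `key=value` attributes separated by semicolons, and the sample line (including the `Variant_seq` and `Reference_seq` values) is hypothetical illustration data, not taken from the 10Gen dataset.

```python
# Minimal sketch of parsing one GVF/GFF3-style record.
# Assumption: nine tab-separated columns, attributes as key=value;key=value.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gvf_line(line):
    """Split one tab-delimited feature line into a dict of named fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    # Attribute values may be comma-separated lists, so keep them as lists.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value.split(",")
    record["attributes"] = attributes
    return record


# Hypothetical example line, for illustration only.
example = "chr1\texample\tSNV\t1000\t1000\t.\t+\t.\tID=v1;Variant_seq=A,G;Reference_seq=G"
parsed = parse_gvf_line(example)
```

    Header pragmas (lines beginning with `##`) and comments would need to be skipped before calling this function on a real file.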

  8. Complete genome sequence of Ferroglobus placidus AEDII12DO

    SciTech Connect

    Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Risso, Carla [University of Massachusetts, Amherst; Holmes, Dawn [University of Massachusetts, Amherst; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Brettin, Thomas S [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Lovley, Derek [University of Massachusetts, Amherst; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project.

  9. Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).

    PubMed

    Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

    2014-12-01

    Abstract The complete mitochondrial genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% of A, 23.1% of C, 30.1% of T and 13.2% of G. A termination associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816

  10. Analysis of Common k-mers for Whole Genome Sequences Using SSB-Tree

    Microsoft Academic Search

    Jeong-Hyeon Choi; Hwan-Gue Cho

    2002-01-01

    As sequenced genomes become larger and sequencing process becomes faster, there is a need to develop a tool to analyze sequences in the whole genomic scale. However, on-memory algorithms such as suffix tree and suffix array are not applicable to the analysis of whole genome sequence set, since the size of individual whole genome ranges from several million base pairs
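
    To make the notion of "common k-mers" concrete, here is a small in-memory sketch. It is not the SSB-Tree method named in the title (the abstract is truncated before describing it); it uses plain Python sets, which is exactly the kind of on-memory approach that stops scaling at whole-genome size. The function names are illustrative.

```python
# Illustrative only: set-based common k-mer detection for short sequences.
# This is NOT the SSB-Tree index; it holds every k-mer in memory.

def kmers(seq, k):
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def common_kmers(seq_a, seq_b, k):
    """Return the k-mers that occur in both sequences."""
    return kmers(seq_a, k) & kmers(seq_b, k)


shared = common_kmers("ACGTAC", "GTACGG", 3)
```

    For the two toy sequences above, the shared 3-mers are ACG, GTA and TAC.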

  11. GS-Aligner: A Novel Tool for Aligning Genomic Sequences Using Bit-Level Operations

    Microsoft Academic Search

    Arthur Chun-Chieh Shih; Wen-Hsiung Li

    2003-01-01

    A novel algorithm, GS-Aligner, that uses bit-level operations was developed for aligning genomic sequences. GS-Aligner is efficient in terms of both time and space for aligning two very long genomic sequences and for identifying genomic rearrangements such as translocations and inversions. It is suitable for aligning fairly divergent sequences such as human and mouse genomic sequences. It consists of

  12. Sequence analysis and organization of the Neodiprion abietis nucleopolyhedrovirus genome.

    PubMed

    Duffy, Simon P; Young, Aaron M; Morin, Benoit; Lucarotti, Christopher J; Koop, Ben F; Levin, David B

    2006-07-01

    Of 30 baculovirus genomes that have been sequenced to date, the only nonlepidopteran baculoviruses include the dipteran Culex nigripalpus nucleopolyhedrovirus and two hymenopteran nucleopolyhedroviruses that infect the sawflies Neodiprion lecontei (NeleNPV) and Neodiprion sertifer (NeseNPV). This study provides a complete sequence and genome analysis of the nucleopolyhedrovirus that infects the balsam fir sawfly Neodiprion abietis (Hymenoptera, Symphyta, Diprionidae). The N. abietis nucleopolyhedrovirus (NeabNPV) is 84,264 bp in size, with a G+C content of 33.5%, and contains 93 predicted open reading frames (ORFs). Eleven predicted ORFs are unique to this baculovirus, 10 ORFs have a putative sequence homologue in the NeleNPV genome but not the NeseNPV genome, and 1 ORF (neab53) has a putative sequence homologue in the NeseNPV genome but not the NeleNPV genome. Specific repeat sequences are coincident with major genome rearrangements that distinguish NeabNPV and NeleNPV. Genes associated with these repeat regions encode a common amino acid motif, suggesting that they are a family of repeated contiguous gene clusters. Lepidopteran baculoviruses, similarly, have a family of repeated genes called the bro gene family. However, there is no significant sequence similarity between the NeabNPV and bro genes. Homologues of early-expressed genes such as ie-1 and lef-3 were absent in NeabNPV, as they are in the previously sequenced hymenopteran baculoviruses. Analyses of ORF upstream sequences identified potential temporally distinct genes on the basis of putative promoter elements. PMID:16809301

  13. A nucleotide composition constraint of genome sequences.

    PubMed

    Zhang, Chun-Ting; Zhang, Ren

    2004-04-01

    Let a, c, g and t denote the occurrence frequencies of A, C, G and T, respectively, in a genome. We calculated the statistical quantity S = a^2 + c^2 + g^2 + t^2 for each of 809 genomes (11 archaea, 42 bacteria, 3 eukaryota, 90 phages, 36 viroids and 627 viruses) and 236 plasmids. We found that S < 1/3 is strictly valid for almost all of the above genomes or plasmids. As a direct deduction of the above observation, it is shown that (i) the statistical quantity S is a kind of genome order index, which is negatively correlated with the Shannon H function; (ii) S < 1/3 suggests that a minimal value of the Shannon H function is required for each genome; (iii) S defined above would be a new biological statistical quantity, useful to describe the composition features of genomes; (iv) By jointly considering the Chargaff Parity Rule 2, it is shown that the genomic G + C content should be in between 0.211 and 0.789. PMID:15130543
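
    The quantity S is simply the sum of the squared base frequencies, so it can be computed alongside the Shannon H function in a few lines. The sketch below is an illustration, not the authors' code; it assumes that only the characters A, C, G and T are counted and that H is taken in bits (log base 2), a choice the abstract does not specify.

```python
import math
from collections import Counter


def composition_stats(seq):
    """Return (S, H) for a DNA sequence.

    S is the sum of squared frequencies of A, C, G and T.
    H is the Shannon entropy of those frequencies in bits (assumed base 2).
    Characters other than A, C, G, T are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    freqs = [counts[base] / total for base in "ACGT"]
    s = sum(f * f for f in freqs)
    h = -sum(f * math.log2(f) for f in freqs if f > 0)
    return s, h


s_uniform, h_uniform = composition_stats("ACGT")
s_single, h_single = composition_stats("AAAA")
```

    A perfectly uniform composition gives the smallest possible S (1/4) and the largest H (2 bits), while a single-base sequence gives S = 1 and H = 0, which illustrates the negative relationship between S and H mentioned above.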

  14. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310.

    PubMed

    Park, Yong-Soon; Jeong, Haeyoung; Sim, Young Mi; Yi, Hwe-Su; Ryu, Choong-Min

    2014-04-01

    Pseudomonas syringae pv. syringae (Psy) is a major bacterial pathogen of many economically important plant species. Despite the severity of its impact, the genome sequence of the type strain has not been reported. Here, we present the draft genome sequence of Psy ATCC 19310. Comparative genomic analysis revealed that Psy ATCC 19310 is closely related to Psy B728a. However, only a few type III effectors, which are key virulence factors, are shared by the two strains, indicating the possibility of host-pathogen specificity and genome dynamics, even under the pathovar level. PMID:24444998

  15. Quantifying Genome Editing Outcomes at Endogenous Loci using SMRT Sequencing

    PubMed Central

    Clark, Joseph; Punjya, Niraj; Sebastiano, Vittorio; Bao, Gang; Porteus, Matthew H

    2014-01-01

    SUMMARY Targeted genome editing with engineered nucleases has transformed the ability to introduce precise sequence modifications at almost any site within the genome. A major obstacle to probing the efficiency and consequences of genome editing is that no existing method enables the frequency of different editing events to be simultaneously measured across a cell population at any endogenous genomic locus. We have developed a novel method for quantifying individual genome editing outcomes at any site of interest using single molecule real time (SMRT) DNA sequencing. We show that this approach can be applied at various loci, using multiple engineered nuclease platforms including TALENs, RNA guided endonucleases (CRISPR/Cas9), and ZFNs, and in different cell lines to identify conditions and strategies in which the desired engineering outcome has occurred. This approach facilitates the evaluation of new gene editing technologies and permits sensitive quantification of editing outcomes in almost every experimental system used. PMID:24685129

  16. Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes

    Microsoft Academic Search

    Sotaro Chiba; Hideki Kondo; Akio Tani; Daisuke Saisho; Wataru Sakamoto; Satoko Kanematsu; Nobuhiro Suzuki

    2011-01-01

    Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The

  17. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  18. Complete genome sequence of Streptobacillus moniliformis type strain (9901T)

    SciTech Connect

    Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Sims, David [Los Alamos National Laboratory (LANL); Meincke, Linda [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sproer, Cathrin [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)

    2009-01-01

    Streptobacillus moniliformis Levaditi et al. 1925 is the sole and type species of the genus, and is of phylogenetic interest because of its isolated location in the sparsely populated and neither taxonomically nor genomically much accessed family 'Leptotrichiaceae' within the phylum 'Fusobacteria'. S. moniliformis, a Gram-negative, non-motile and pleomorphic bacterium, is the etiologic agent of rat bite fever and Haverhill fever. Strain 9901T, the type strain of the species, was isolated from a patient with rat bite fever. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the second completed genome sequence of the order 'Fusobacteriales' and no more than the third sequence from the phylum 'Fusobacteria'. The 1,662,578 bp long chromosome and the 10,702 bp plasmid with a total of 1511 protein-coding and 55 RNA genes are part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Detecting selection using a single genome sequence of

    E-print Network

    Plotkin, Joshua B.

    on different stages of an organism's life cycle: genes expressed in the ring stage of P. falciparum are under differential selective pressures on genes by inspecting a single genome sequence for a footprint of non-synonymous substitutions. Our method rests on a simple observation: if a protein coding region of a nucleotide sequence has

  20. GENOMIC SEQUENCE ANALYSIS OF LEPTOSPIRA BORGPETERSENII SEROVAR HARDJO

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genomic library from Leptospira borgpetersenii serovar hardjo strain JB197 was prepared by mechanically shearing the DNA and inserting it into a positive selection vector. DNA was prepared from approximately 22,000 random clones and used as templates for automated sequencing. Sequence data was c...

  1. Nucleotide Sequence of Potato Virus Y (N Strain) Genomic RNA

    Microsoft Academic Search

    CHRISTOPHE ROBAGLIA; M. Durand-Tardif; M. Tronchet; G. Boudazin; S. Astier-Manifacier; F. Casse-Delbart

    1989-01-01

    SUMMARY The complete nucleotide sequence of the genomic RNA of the potyvirus potato virus Y strain N (PVYn) was obtained from cloned cDNAs. This sequence is 9704 nucleotides long and can encode a polyprotein of 3063 amino acids. The positions of the cleavage sites at the N terminus of the capsid and cytoplasmic inclusion proteins have been determined. Other putative

  2. Interpreting the Human Genome Sequence, Using Stochastic Grammars

    Microsoft Academic Search

    Richard Durbin

    2001-01-01

    The 3 billion base pair sequence of the human genome is now available, and attention is focusing on annotating it to extract biological meaning. I will discuss what we have obtained, and the methods that are being used to analyse biological sequences. In particular I will discuss approaches using stochastic grammars analogous to those used in computational linguistics, both for

  3. Draft Genome Sequences of Two Toxigenic Corynebacterium ulcerans Strains

    PubMed Central

    Fournier, Eric; Massé, Cynthia; Charest, Hugues; Bernard, Kathryn; Côté, Jean-Charles; Tremblay, Cécile

    2015-01-01

    Here, we present the draft genome sequences of two toxigenic Corynebacterium ulcerans strains isolated from two different patients: one from a blood sample and the other from a scar exudate following surgery. Although these two strains harbor the diphtheria toxin gene tox, no full prophage sequences were found in the flanking regions. PMID:26112794

  4. Characterization of microsatellites revealed by genomic sequencing of Populus trichocarpa

    Microsoft Academic Search

    Gerald A. Tuskan; Lee E. Gunter; Zamin K. Yang; TongMing Yin; Mitchell M. Sewell; Stephen P. DiFazio

    2004-01-01

    Microsatellites or simple sequence repeats (SSRs) are highly polymorphic, codominant markers that have great value for the construction of genetic maps, comparative mapping, population genetic surveys, and paternity analyses. Here, we report the development and testing of a set of SSR markers derived from shotgun sequencing from Populus trichocarpa Torr. & A. Gray, a nonenriched genomic DNA library, and

  5. Targeted enrichment of genomic DNA regions for next generation sequencing

    Microsoft Academic Search

    F. Mertens; A. El-Sharawy; S. Sauer; J. Van Helvoort; P. J. Van der Zaag; A. Franke; M. Nilsson; H. Lehrach; A. Brookes

    2011-01-01

    In this review we discuss the latest targeted enrichment methods, and aspects of their utilization along with second generation sequencing for complex genome analysis. In doing so we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next generation sequencing has made great progress

  6. Complete Genome Sequence of the Alfalfa latent virus

    PubMed Central

    Shao, Jonathan; Postnikova, Olga A.

    2015-01-01

    The first complete genome sequence of the Alfalfa latent carlavirus (ALV) was obtained by primer walking and Illumina RNA sequencing. The virus differs substantially from the Czech ALV isolate and the Pea streak virus isolate from Wisconsin. The absence of a clear nucleic acid-binding protein indicates ALV divergence from other carlaviruses. PMID:25883281

  7. Complete Genomic Sequence of Issyk-Kul Virus.

    PubMed

    Atkinson, Barry; Marston, Denise A; Ellis, Richard J; Fooks, Anthony R; Hewson, Roger

    2015-01-01

    Issyk-Kul virus (ISKV) is an ungrouped virus tentatively assigned to the Bunyaviridae family and is associated with an acute febrile illness in several central Asian countries. Using next-generation sequencing technologies, we report here the full-genome sequence for this novel unclassified arboviral pathogen circulating in central Asia. PMID:26139711

  8. Environmental Genome Shotgun Sequencing of the Sargasso Sea

    Microsoft Academic Search

    J. Craig Venter; Karin Remington; John F. Heidelberg; Aaron L. Halpern; Doug Rusch; Dongying Wu; Ian Paulsen; Karen E. Nelson; William Nelson; Derrick E. Fouts; Samuel Levy; Anthony H. Knap; Michael W. Lomas; Ken Nealson; Owen White; Jeremy Peterson; Jeff Hoffman; Rachel Parsons; Holly Baden-Tillson; Cynthia Pfannkoch; Yu-Hui Rogers; Hamilton O. Smith

    2004-01-01

    We have applied "whole-genome shotgun sequencing" to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These

  9. Complete Genomic Sequence of Issyk-Kul Virus

    PubMed Central

    Marston, Denise A.; Ellis, Richard J.; Fooks, Anthony R.; Hewson, Roger

    2015-01-01

    Issyk-Kul virus (ISKV) is an ungrouped virus tentatively assigned to the Bunyaviridae family and is associated with an acute febrile illness in several central Asian countries. Using next-generation sequencing technologies, we report here the full-genome sequence for this novel unclassified arboviral pathogen circulating in central Asia. PMID:26139711

  10. Complete genome sequencing and variant analysis of a Pakistani individual.

    PubMed

    Azim, Muhammad Kamran; Yang, Chuanchun; Yan, Zhixiang; Choudhary, Muhammad Iqbal; Khan, Asifullah; Sun, Xiao; Li, Ran; Asif, Huma; Sharif, Sana; Zhang, Yong

    2013-09-01

    We sequenced the genome of a Pakistani male at 25.5x coverage using massively parallel sequencing technology. More than 90% of the sequence reads were mapped to the human reference genome. In subsequent analysis, we identified 3,224,311 single-nucleotide polymorphisms (SNPs), of which 388,532 (12% of the total SNPs) had not been previously recorded in single nucleotide polymorphism database (dbSNP) or the 1000 Genomes Project database. The 5991 non-synonymous coding variants were screened for deleterious or disease-associated SNPs. Analysis of genes with deleterious SNPs identified 'retinoic acid signaling' and 'regulation of transcription' as the enriched Gene Ontology terms. Scanning of non-synonymous SNPs against the OMIM revealed several disease and phenotype-associated variants in Pakistani genome. Comparative analysis with Indian genome sequence revealed >1.8 million shared SNPs; 32% of which were annotated in ~14,000 genes. Gene Ontology (GO) terms analysis of these genes identified 'response to jasmonic acid stimulus', 'aminoglycoside antibiotic metabolic process' and 'glycoside metabolic process' with considerable enrichment. A total of 59,558 of small indels (1-5 bp) and 16,063 large structural variations were found; 54% of which was novel. Substantial number of novel structural variations discovered in Pakistani genome enforced previous inferences that (a) structural variations are major type of variation in the genome and (b) compared with SNPs, they putatively exhibit equivalent or superior functional roles. This genome sequence information will be an important reference for population-wide genomics studies of ethnically diverse South Asian subcontinent. PMID:23842039

  11. The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus.

    PubMed

    Suerbaum, Sebastian; Josenhans, Christine; Sterzenbach, Torsten; Drescher, Bernd; Brandt, Petra; Bell, Monica; Droge, Marcus; Fartmann, Berthold; Fischer, Hans-Peter; Ge, Zhongming; Horster, Andrea; Holland, Rudi; Klein, Kerstin; Konig, Jochen; Macko, Ludwig; Mendz, George L; Nyakatura, Gerald; Schauer, David B; Shen, Zeli; Weber, Jacqueline; Frosch, Matthias; Fox, James G

    2003-06-24

    Helicobacter hepaticus causes chronic hepatitis and liver cancer in mice. It is the prototype enterohepatic Helicobacter species and a close relative of Helicobacter pylori, also a recognized carcinogen. Here we report the complete genome sequence of H. hepaticus ATCC51449. H. hepaticus has a circular chromosome of 1,799,146 base pairs, predicted to encode 1,875 proteins. A total of 938, 953, and 821 proteins have orthologs in H. pylori, Campylobacter jejuni, and both pathogens, respectively. H. hepaticus lacks orthologs of most known H. pylori virulence factors, including adhesins, the VacA cytotoxin, and almost all cag pathogenicity island proteins, but has orthologs of the C. jejuni adhesin PEB1 and the cytolethal distending toxin (CDT). The genome contains a 71-kb genomic island (HHGI1) and several genomic islets whose G+C content differs from the rest of the genome. HHGI1 encodes three basic components of a type IV secretion system and other virulence protein homologs, suggesting a role of HHGI1 in pathogenicity. The genomic variability of H. hepaticus was assessed by comparing the genomes of 12 H. hepaticus strains with the sequenced genome by microarray hybridization. Although five strains, including all those known to have caused liver disease, were indistinguishable from ATCC51449, other strains lacked between 85 and 229 genes, including large parts of HHGI1, demonstrating extensive variation of genome content within the species. PMID:12810954

  12. Genome Sequence of Luminous Piezophile Photobacterium phosphoreum ANT-2200

    PubMed Central

    Zhang, Sheng-Da; Barbe, Valérie; Garel, Marc; Zhang, Wei-Jia; Chen, Haitao; Santini, Claire-Lise; Murat, Dorothée; Jing, Hongmei; Zhao, Yuan; Lajus, Aurélie; Martini, Séverine; Pradel, Nathalie; Tamburini, Christian

    2014-01-01

    Bacteria of the genus Photobacterium thrive worldwide in oceans and show substantially varied lifestyles, including free-living, commensal, pathogenic, symbiotic, and piezophilic. Here, we present the genome sequence of a luminous, piezophilic Photobacterium phosphoreum strain, ANT-2200, isolated from a water column at 2,200 m depth in the Mediterranean Sea. It is the first genomic sequence of the P. phosphoreum group. An analysis of the sequence provides insight into the adaptation of bacteria to the deep-sea habitat. PMID:24744322

  13. Identification of genes in genomic and EST sequences

    SciTech Connect

    Fields, C.; Adams, M.D.; Kerlavage, A.R.; Dubnick, M.; McCombie, W.R.; Martin-Gallardo, A.; Venter, J.C. [National Inst. of Neurological Disorders and Stroke, Bethesda, MD (United States). Receptor Biochemistry and Molecular Biology Section; White, O. [New Mexico State Univ., Las Cruces, NM (United States). Computing Research Lab.

    1993-12-31

    Currently available software tools are capable of predicting the locations of most protein-coding genes in anonymous genomic DNA sequences. The use of predicted exons to select primers for PCR amplification from cDNA libraries allows the complete structures of novel genes to be determined efficiently. As the number of expressed sequence tag (EST) sequences increases, the fraction of genes that can be localized in genomic sequences by searching EST databases will rapidly approach unity. The challenge for automated DNA sequence analysis is now to develop methods for accurately predicting gene structure and alternative splicing patterns. Substantially improving current accuracies in gene structure prediction will require retrospective comparative analysis of sequences from different organisms and gene families.

  14. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  15. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    FitzGerald, Michael [Broad Institute

    2013-02-12

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  16. Genomic multiple sequence alignments: refinement using a genetic algorithm

    PubMed Central

    Wang, Chunlin; Lefkowitz, Elliot J

    2005-01-01

    Background Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. 
Overall sequence identity increased only slightly; but significantly, this occurred at the same time that the overall alignment length decreased – through the removal of gaps – by approximately 200 gapped regions representing roughly 1,300 gaps. Conclusion We have implemented a genetic algorithm in parallel mode to optimize multiple genomic sequence alignments initially generated by various alignment tools. Benchmarking experiments showed that the refinement algorithm improved genomic sequence alignments within a reasonable period of time. PMID:16086841
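
    The gap statistics quoted in the results above (gapped regions versus individual gaps) can be illustrated with a small sketch. This is not code from GenAlignRefine: the function name `gap_stats`, the definition of a "gapped region" as a maximal run of gap characters within one row, and the toy alignment are all assumptions made for illustration.

```python
import re
from typing import Iterable, Tuple


def gap_stats(rows: Iterable[str], gap: str = "-") -> Tuple[int, int]:
    """Return (gapped_regions, gap_characters) summed over alignment rows.

    Assumption: a "gapped region" is a maximal run of consecutive gap
    characters within a single row of the alignment.
    """
    run = re.compile(re.escape(gap) + "+")
    regions = 0
    gaps = 0
    for row in rows:
        for match in run.finditer(row):
            regions += 1                # one more contiguous gap run
            gaps += len(match.group())  # individual gap characters in it
    return regions, gaps


# Toy alignment (hypothetical data, not taken from the paper).
toy_alignment = ["ACG--TTA-C", "ACGGATT--C", "A-G--TTAAC"]
regions, gaps = gap_stats(toy_alignment)
print(regions, gaps)
```

    Running the same function on an alignment before and after refinement would give the kind of comparison the abstract reports, under the stated definition of a gapped region.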

  17. The Diploid Genome Sequence of an Individual Human

    Microsoft Academic Search

    Samuel Levy; Granger Sutton; Pauline C. Ng; Lars Feuk; Aaron L. Halpern; Brian P. Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F. Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R. MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B. Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul A. Kravitz; Dana A. Busam; Karen Y. Beeson; Tina C. McIntosh; Karin A. Remington; Josep F. Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin E. Frazier; Stephen W. Scherer; Robert L. Strausberg; J. Craig Venter

    2007-01-01

    Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison

  18. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Microsoft Academic Search

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-01-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  19. Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies

    Microsoft Academic Search

    Sacha A. F. T. Van Hijum; Aldert L. Zomer; Oscar P. Kuipers; Jan Kok

    2005-01-01

    With genome sequencing efforts increasing exponentially, valuable information accumulates on genomic content of the various organisms sequenced. Projector 2 uses (un)finished genomic sequences of an organism as a template to infer linkage information for a genome sequence assembly of a related organism being sequenced. The remaining gaps between contigs for which no linkage information is present can

  20. Complete Genome Sequences for 59 Burkholderia Isolates, Both Pathogenic and Near Neighbor

    PubMed Central

    Bishop-Lilly, Kimberly A.; Ladner, Jason T.; Daligault, Hajnalka E.; Davenport, Karen W.; Jaissle, James; Frey, Kenneth G.; Koroleva, Galina I.; Bruce, David C.; Coyne, Susan R.; Broomall, Stacey M.; Li, Po-E; Teshima, Hazuki; Gibbons, Henry S.; Palacios, Gustavo F.; Rosenzweig, C. Nicole; Redden, Cassie L.; Xu, Yan; Minogue, Timothy D.; Chain, Patrick S.

    2015-01-01

    The genus Burkholderia encompasses both pathogenic (including Burkholderia mallei and Burkholderia pseudomallei, U.S. Centers for Disease Control and Prevention Category B listed), and nonpathogenic Gram-negative bacilli. Here we present full genome sequences for a panel of 59 Burkholderia strains, selected to aid in detection assay development. PMID:25931592

  1. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
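
    The tag-profiling idea described above can be sketched in silico. The following is a simplified illustration, not the authors' protocol or software: the fragmenting-enzyme site `CATG`, the choice to start each tag at the first base of that site, the forward-strand-only scan, and the function name `extract_tags` are all assumptions. Only the 21-bp tag length comes from the description.

```python
from collections import Counter
from typing import Dict

TAG_LENGTH = 21  # tag length stated in the description above


def extract_tags(dna: str, site: str = "CATG",
                 tag_length: int = TAG_LENGTH) -> Dict[str, int]:
    """Count fixed-length tags that start at each occurrence of `site`.

    Simplifying assumptions: the fragmenting site is CATG, each tag
    begins at the first base of the site, and only the forward strand
    is scanned.
    """
    dna = dna.upper()
    tags: Counter = Counter()
    start = dna.find(site)
    while start != -1:
        tag = dna[start:start + tag_length]
        if len(tag) == tag_length:  # skip tags truncated by the sequence end
            tags[tag] += 1
        start = dna.find(site, start + 1)
    return dict(tags)


# Toy sequence (hypothetical): two full tags and one truncated site.
toy = ("CATG" + "A" * 17) * 2 + "CATGAC"
print(extract_tags(toy))
```

    The resulting tag-to-count mapping plays the role of the "tag sequences and abundances" profile mentioned in the record.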

  2. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    E-print Network

    2010-01-01

    ... after the first plant genome sequence was completed [1], ... of the genome sequence of the flowering plant Arabidopsis ... genome reference sequence would fill a great evolutionary gap, but it ... * Correspondence: dbneale@ucdavis.edu Department of Plant

  3. Application of full mitochondrial genome sequencing using 454 GS FLX pyrosequencing

    Microsoft Academic Search

    Martin Mikkelsen; Eszter Rockenbauer; Andrea Wächter; Liane Fendt; Bettina Zimmermann; Walther Parson; Sandra Abel Nielsen; Tom Gilbert; Eske Willerslev; Niels Morling

    2009-01-01

    The GS FLX pyrosequencing platform using parallel tagged sequencing was tested on 10 Somali individuals for sequencing of the complete mitochondrial genome. The amplicons were sequenced twice with increasing coverage to establish the minimum coverage needed to produce reliable sequence reads. The genome sequences were compared to previously obtained control region sequences with Sanger sequencing and 49 SNPs in

  4. Genome Sequence of the Pea Aphid Acyrthosiphon pisum

    PubMed Central

    2010-01-01

    Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

  5. Plasmodium knowlesi Genome Sequences from Clinical Isolates Reveal Extensive Genomic Dimorphism

    PubMed Central

    Millar, Scott B.; Sanderson, Theo; Otto, Thomas D.; Lu, Woon Chan; Krishna, Sanjeev; Rayner, Julian C.; Cox-Singh, Janet

    2015-01-01

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology. 
PMID:25830531

  6. Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism.

    PubMed

    Pinheiro, Miguel M; Ahmed, Md Atique; Millar, Scott B; Sanderson, Theo; Otto, Thomas D; Lu, Woon Chan; Krishna, Sanjeev; Rayner, Julian C; Cox-Singh, Janet

    2015-01-01

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology. 
PMID:25830531

  7. Building a model: developing genomic resources for common milkweed ( Asclepias syriaca ) with low coverage genome sequencing

    Microsoft Academic Search

    Shannon CK Straub; Mark Fishbein; Tatyana Livshultz; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C Cronn; Aaron Liston

    2011-01-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development

  8. Mapping the Human Reference Genome's Missing Sequence by Three-Way Admixture in Latino Genomes

    E-print Network

    McCarroll, Steve

    Giulio Genovese, Robert E. Handsaker, Heng Li, Eimear E. Kenny, and Steven A. McCarroll. A principal obstacle to completing maps and analyses of the human genome involves

  9. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Microsoft Academic Search

    Nikos Kyrpides; Jason Rodriguez; Dwi Susanti; Claudia Reich; Luke E. Ulrich; James G. Elkins; Kostas Mavromatis; Athanasios Lykidis; Matt Nolan; Linda S. Thompson; Alla L. Lapidus; Alex Copeland; Igor B Zhulin; Chris Detter; Biswarup Mukhopadhyay; James Bristow; William Whitman

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids,

  10. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics

    Microsoft Academic Search

    Lincoln D. Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R. Brent; Nansheng Chen; Asif Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H. A. Fitch; Lucinda A. Fulton; Robert E. Fulton; Sam Griffiths-Jones; Todd W. Harris; LaDeana W. Hillier; Ravi Kamath; Patricia E. Kuwabara; Elaine R. Mardis; Marco A. Marra; Tracie L. Miner; Patrick Minx; James C. Mullikin; Robert W. Plumb; Jane Rogers; Jacqueline E. Schein; Marc Sohrmann; John Spieth; Jason E. Stajich; Chaochun Wei; David Willey; Richard K. Wilson; Richard Durbin; Robert H. Waterston

    2003-01-01

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome

  11. The ClinSeq Project: Piloting large-scale genome sequencing for research in genomic medicine

    Microsoft Academic Search

    Leslie G. Biesecker; James C. Mullikin; Flavia M. Facio; Clesson Turner; Praveen F. Cherukuri; Robert W. Blakesley; Gerard G. Bouffard; Peter S. Chines; Pedro Cruz; Nancy F. Hansen; Jamie K. Teer; Baishali Maskeri; Alice C. Young; Teri A. Manolio; Alexander F. Wilson; Toren Finkel; Paul Hwang; Andrew Arai; Alan T. Remaley; Vandana Sachdev; Robert Shamburek; Richard O. Cannon; Eric D. Green

    2009-01-01

    ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of

  12. Sequence-Tagged Connectors: A Sequence Approach to Mapping and Scanning the Human Genome

    Microsoft Academic Search

    Gregory G. Mahairas; James C. Wallace; Kim Smith; Steven Swartzell; Ted Holzman; Andrew Keller; Ron Shaker; Jeff Furlong; Janet Young; Shaying Zhao; Mark D. Adams; Leroy Hood

    1999-01-01

    The sequence-tagged connector (STC) strategy proposes to generate sequence tags densely scattered (every 3.3 kilobases) across the human genome by arraying 450,000 bacterial artificial chromosomes (BACs) with randomly cleaved inserts, sequencing both ends of each, and preparing a restriction enzyme fingerprint of each. The STC resource, containing end sequences, fingerprints, and arrayed BACs, creates a map where the interrelationships of
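
    The tag density quoted above (one tag roughly every 3.3 kilobases) follows from simple arithmetic, sketched here. The genome size of about 3 Gb is an assumption not stated in the record, and the function name `mean_tag_spacing` is hypothetical; the 450,000 BACs and the two sequenced ends per BAC come from the description.

```python
def mean_tag_spacing(num_clones: int, ends_per_clone: int,
                     genome_size_bp: float) -> float:
    """Average distance in bp between tags spread evenly over a genome."""
    return genome_size_bp / (num_clones * ends_per_clone)


# Assumption: a haploid human genome of roughly 3 Gb.
ASSUMED_GENOME_SIZE_BP = 3.0e9

# 450,000 BACs, each sequenced at both ends, gives 900,000 tags.
spacing_bp = mean_tag_spacing(450_000, 2, ASSUMED_GENOME_SIZE_BP)
print(f"{spacing_bp / 1000:.1f} kb between tags")
```

    Under that assumed genome size, 900,000 end sequences give an average spacing of about 3.3 kb, consistent with the figure in the abstract.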

  13. Complete mitochondrial genome sequence of Cheirotonus jansoni (Coleoptera: Scarabaeidae).

    PubMed

    Shao, L L; Huang, D Y; Sun, X Y; Hao, J S; Cheng, C H; Zhang, W; Yang, Q

    2014-01-01

    We sequenced the complete mitochondrial genome (mitogenome) of Cheirotonus jansoni (Coleoptera: Scarabaeidae), an endangered insect species from Southeast Asia. This long legged scarab is widely collected and reared for sale, although it is rare and protected in the wild. The circular genome is 17,249 bp long and contains a typical gene complement: 13 protein-coding genes, 2 rRNA genes, 22 putative tRNA genes, and a non-coding AT-rich region. Its gene order and arrangement are identical to the common type found in most insect mitogenomes. As with all other sequenced coleopteran species, a 5-bp long TAGTA motif was detected in the intergenic space sequence located between trnS(UCN) and nad1. The atypical cox1 start codon is AAC, and the putative initiation codon for the atp8 gene appears to be GTC, instead of the frequently found ATN. By sequence comparison, the 2590-bp long non-coding AT-rich region is the second longest among the coleopterans, with two tandem repeat regions: one is 10 copies of an 88-bp sequence and the other is 2 copies of a 153-bp sequence. Additionally, the A+T content (64%) of the 13 protein-coding genes is the lowest among all sequenced coleopteran species. This newly sequenced genome aids in our understanding of the comparative biology of the mitogenomes of coleopteran species and supplies important data for the conservation of this species. PMID:24634126

  14. Whole genome sequencing in clinical and public health microbiology

    PubMed Central

    Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Summary Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  15. Complete genome sequence of Desulfotomaculum acetoxidans type strain (5575T)

    SciTech Connect

    Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Schroder, Maren [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Gleim, Dorothea [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sims, David [Los Alamos National Laboratory (LANL); Meincke, Linda [Los Alamos National Laboratory (LANL); Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Han, Cliff [Los Alamos National Laboratory (LANL)

    2009-01-01

    Desulfotomaculum acetoxidans Widdel and Pfennig 1977 was one of the first sulfate-reducing bacteria known to grow with acetate as sole energy and carbon source. It is able to oxidize substrates completely to carbon dioxide with sulfate as the electron acceptor, which is reduced to hydrogen sulfide. All available data about this species are based on strain 5575T, isolated from piggery waste in Germany. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a Desulfotomaculum species with validly published name. The 4,545,624 bp long single replicon genome with its 4370 protein-coding and 100 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  16. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)

    PubMed Central

    Mavromatis, Konstantinos; Sikorski, Johannes; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Thomas; Detter, John C.; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rüdiger; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family ‘Alicyclobacillaceae’. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family ‘Alicyclobacillaceae’. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304673

  17. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IA).

    PubMed

    Mavromatis, Konstantinos; Sikorski, Johannes; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Thomas; Detter, John C; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rüdiger; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family 'Alicyclobacillaceae'. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have since long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family 'Alicyclobacillaceae'. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304673

  18. Whole genome sequencing in clinical and public health microbiology.

    PubMed

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  19. Complete genome sequence of Halogeometricum borinquense type strain (PR3).

    PubMed

    Malfatti, Stephanie; Tindall, Brian J; Schneider, Susanne; Fähnrich, Regine; Lapidus, Alla; Labuttii, Kurt; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Chen, Feng; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Anderson, Iain; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; D'haeseleer, Patrik; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Chain, Patrick

    2009-01-01

    Halogeometricum borinquense Montalvo-Rodríguez et al. 1998 is the type species of the genus, and is of phylogenetic interest because of its distinct location between the halobacterial genera Haloquadratum and Halosarcina. H. borinquense requires extremely high salt (NaCl) concentrations for growth. It can not only grow aerobically but also anaerobically using nitrate as electron acceptor. The strain described in this report is a free-living, motile, pleomorphic, euryarchaeon, which was originally isolated from the solar salterns of Cabo Rojo, Puerto Rico. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of the halobacterial genus Halogeometricum, and this 3,944,467 bp long six replicon genome with its 3937 protein-coding and 57 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304651

  20. Complete genome sequence of Arthrobacter sp. strain FB24

    SciTech Connect

    Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

    2013-09-30

    Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

  1. Complete genome sequence of Klebsiella pneumoniae phage JD001.

    PubMed

    Cui, Zelin; Shen, Wenbin; Wang, Zheng; Zhang, Haotian; Me, Rao; Wang, Yanchun; Zeng, Lingbin; Zhu, Yongzhang; Qin, Jinhong; He, Ping; Guo, Xiaokui

    2012-12-01

    Klebsiella pneumoniae is a member of the family Enterobacteriaceae, opportunistic pathogens that are among the eight most prevalent infectious agents in hospitals. The emergence of multidrug-resistant strains of K. pneumoniae has become a public health problem globally. To develop an effective antimicrobial agent, we isolated a bacteriophage, named JD001, from seawater and sequenced its genome. Comparative genome analysis of phage JD001 with other K. pneumoniae bacteriophages revealed that phage JD001 has little similarity to previously published K. pneumoniae phages KP15, KP32, KP34, and phiKO2. Here we announce the complete genome sequence of JD001 and report major findings from the genomic analysis. PMID:23166250

  2. Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)

    SciTech Connect

    Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Meincke, Linda [Los Alamos National Laboratory (LANL); Sims, David [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Wahrenburg, Claudia [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

    2010-01-01

    Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family Alicyclobacillaceae. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Alicyclobacillaceae. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

  3. Sequencing and analysis of an Irish human genome

    PubMed Central

    2010-01-01

    Background Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge. PMID:20822512

  4. Insights into the evolution of cotton diploids and polyploids from whole-genome re-sequencing

    E-print Network

    Wendel, Jonathan F.

    Insights into the evolution of cotton diploids and polyploids from whole-genome re-sequencing ... College Station, TX, 77843 ... Seed Biotechnology Center, University of California-Davis, Davis, CA ... the composition, evolution, and function of the Gossypium hirsutum (cotton) genome is complicated by the joint

  5. The International Pea Genome Sequencing Project: Sequencing and Assembly Progresses Updates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The International Consortium for the Pea Genome Sequencing (ICPG) includes scientists from six countries around the world. Its aim is to provide a high quality reference of the pea genome to the scientific community as well as to the pea breeder community. The consortium proposed a strategy that int...

  6. Genome Sequences of Mycobacteriophages Luchador and Nerujay.

    PubMed

    Pope, Welkin H; Ahmed, Taha; Drobitch, Marissa K; Early, David R; Eljamri, Soukaina; Kasturiarachi, Naomi S; Klonicki, Emily F; Manjooran, Daniel T; Ní Chochlain, Aífe N; Puglionesi, Andrew O; Rajakumar, Vinod; Shindle, Katherine A; Tran, Mai T; Brown, Bryony R; Churilla, Bryce M; Cohen, Karen L; Wilkes, Kellyn E; Grubb, Sarah R; Warner, Marcie H; Bowman, Charles A; Russell, Daniel A; Hatfull, Graham F

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14. PMID:26089414

  7. Genome Sequences of Mycobacteriophages Luchador and Nerujay

    PubMed Central

    Ahmed, Taha; Drobitch, Marissa K.; Early, David R.; Eljamri, Soukaina; Kasturiarachi, Naomi S.; Klonicki, Emily F.; Manjooran, Daniel T.; Ní Chochlain, Aífe N.; Puglionesi, Andrew O.; Rajakumar, Vinod; Shindle, Katherine A.; Tran, Mai T.; Brown, Bryony R.; Churilla, Bryce M.; Cohen, Karen L.; Wilkes, Kellyn E.; Grubb, Sarah R.; Warner, Marcie H.; Bowman, Charles A.; Russell, Daniel A.; Hatfull, Graham F.

    2015-01-01

    Luchador and Nerujay are two newly isolated mycobacteriophages recovered from soil samples using Mycobacterium smegmatis. Their genomes are 53,387 bp and 53,455 bp long and have 96 and 97 predicted open reading frames, respectively. Nerujay is related to subcluster A1 phages, and Luchador represents a new subcluster, A14.

  8. Genome Sequence of Sinorhizobium meliloti Rm41

    PubMed Central

    Weidner, Stefan; Baumgarth, Birgit; Göttfert, Michael; Jaenicke, Sebastian; Pühler, Alfred; Schneiker-Bekel, Susanne; Serrania, Javier; Szczepanowski, Rafael

    2013-01-01

    Sinorhizobium meliloti Rm41 nodulates alfalfa plants, forming indeterminate type nodules. It is characterized by a strain-specific K-antigen able to replace exopolysaccharides in promotion of nodule invasion. We present the Rm41 genome, composed of one chromosome, the chromid pSymB, the megaplasmid pSymA, and the nonsymbiotic plasmid pRme41a. PMID:23405285

  9. Contribution to Sequencing of the Deinococcus radiodurans Genome

    SciTech Connect

    Minton, K.W.

    1999-03-11

    The stated goal of this project was to supply The Institute for Genomic Research (TIGR) with pure DNA from the bacterium Deinococcus radiodurans R1 for purposes of complete genomic sequencing by TIGR. We subsequently expanded the project to include a second goal: the development of a NotI chromosomal map of D. radiodurans R1 using Pulsed Field Gel Electrophoresis (PFGE).

  10. Prediction of probable genes by Fourier analysis of genomic sequences

    Microsoft Academic Search

    Shrish Tiwari; S. Ramachandran; Alok Bhattacharya; Sudha Bhattacharya; Ramakrishna Ramaswamy

    1997-01-01

    Motivation: The major signal in coding regions of genomic sequences is a three-base periodicity. Our aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA. Result: The three-base periodicity in the nucleotide arrangement is evidenced as a sharp peak at frequency f = 1/3 in the Fourier
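    The three-base periodicity described in this abstract can be illustrated with a minimal sketch. It assumes the common indicator-sequence formulation (one 0/1 sequence per base, with the squared Fourier amplitudes summed over A, C, G and T); this is not necessarily the authors' exact method, and the function name `spectral_power` is hypothetical.

```python
import cmath


def spectral_power(seq, freq):
    """Summed Fourier power of the four base-indicator sequences at `freq`.

    `freq` is in cycles per base, so 1/3 corresponds to a period of three bases.
    Assumption: power is defined as the sum over A, C, G, T of |U_b(freq)|^2,
    where U_b is the Fourier sum of the 0/1 indicator sequence of base b.
    """
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # Fourier sum restricted to the positions where this base occurs.
        amplitude = sum(
            cmath.exp(-2j * cmath.pi * freq * n)
            for n, ch in enumerate(seq)
            if ch == base
        )
        total += abs(amplitude) ** 2
    return total


# Toy input: a perfectly codon-periodic sequence (hypothetical, not real data).
toy = "ATG" * 20
peak = spectral_power(toy, 1 / 3)      # strong signal at period 3
off_peak = spectral_power(toy, 0.25)   # essentially no signal at period 4
print(peak, off_peak)
```

    On real sequences the peak is far less clean than in this toy case; a scan would typically slide a window along the genome and compare the power at f = 1/3 with the background at other frequencies.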

  11. Genome Sequence of Corynebacterium ulcerans Strain FRC11

    PubMed Central

    Benevides, Leandro de Jesus; Viana, Marcus Vinicius Canário; Mariano, Diego César Batista; Rocha, Flávia de Souza; Bagano, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Dorella, Fernanda Alves; Leal, Carlos Augusto Gomes; Carvalho, Alex Fiorini; Soares, Siomar de Castro; Carneiro, Adriana; Ramos, Rommel; Badell-Ocando, Edgar; Guiso, Nicole; Silva, Artur; Figueiredo, Henrique; Guimarães, Luis Carlos

    2015-01-01

    Here, we present the genome sequence of Corynebacterium ulcerans strain FRC11. The genome includes one circular chromosome of 2,442,826 bp (53.35% G+C content), and 2,210 genes were predicted, 2,146 of which are putative protein-coding genes, with 12 rRNAs and 51 tRNAs; 1 pseudogene was also identified. PMID:25767241
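    As a small worked check on the figures in this abstract, the sketch below tallies the reported gene categories and shows how a G+C percentage is conventionally computed. The helper `gc_content` and the toy sequence are illustrative assumptions; the actual FRC11 sequence is not reproduced here, so the 53.35% figure is not recomputed.

```python
def gc_content(seq):
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    gc = sum(1 for ch in seq if ch in "GC")
    return 100.0 * gc / len(seq)


# Gene categories as reported in the abstract.
reported = {"protein_coding": 2146, "rRNA": 12, "tRNA": 51, "pseudogene": 1}
total_genes = sum(reported.values())

# The listed categories add up to the reported total of 2,210 predicted genes.
print(total_genes)

# Toy sequence (hypothetical) to show the G+C calculation itself.
print(gc_content("ATGC"))
```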

  12. Structure, sequence and expression of the hepatitis delta (δ) viral genome

    NASA Astrophysics Data System (ADS)

    Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael

    1986-10-01

    Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.

  13. Mitochondrial genome sequence of the bluegill sunfish (Lepomis macrochirus).

    PubMed

    Li, Sheng-Jie; Cai, Lei; Bai, Jun-Jie

    2011-10-01

    The bluegill sunfish (Lepomis macrochirus) belongs to the genus Lepomis of the family Centrarchidae and is an economically important freshwater species in China. This study presents the complete mitochondrial genome of L. macrochirus, which is the first complete sequence from a sunfish species. L. macrochirus mitochondrial DNA is 16,489 bp long, with the genome organization and gene order being identical to those of the typical vertebrate. PMID:22165836

  14. Complete genome sequence of the fish pathogen Flavobacterium psychrophilum

    Microsoft Academic Search

    Mekki Boussaha; Valentin Loux; Jean-François Bernardet; Christian Michel; Brigitte Kerouault; Stanislas Mondot; Pierre Nicolas; Robert Bossy; Christophe Caron; Philippe Bessières; Jean-François Gibrat; Stéphane Claverol; Fabien Dumetz; Michel Le Hénaff; Abdenour Benmansour; Eric Duchaud

    2007-01-01

    We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988–base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably

  15. Draft genome sequence of Lactobacillus mali KCTC 3596.

    PubMed

    Kim, Dong-Wook; Choi, Sang-Haeng; Kang, Aram; Nam, Seong-Hyeuk; Kim, Dae-Soo; Kim, Ryong Nam; Kim, Aeri; Park, Hong-Seog

    2011-09-01

    We announce the draft genome sequence of the type strain Lactobacillus mali KCTC 3596 (2,652,969 bp, with a G+C content of 36.0%), which is one of the most prevalent lactic acid bacteria present during the manufacturing process of apple juice. The genome consists of 122 large contigs (>100 bp). All of the contigs were assembled by Newbler Assembler 2.3 (454 Life Science). PMID:21742889

  16. Genome sequence of the plant pathogen Ralstonia solanacearum

    Microsoft Academic Search

    M. Salanoubat; S. Genin; F. Artiguenave; J. Gouzy; S. Mangenot; M. Arlat; A. Billault; P. Brottier; J. C. Camus; L. Cattolico; M. Chandler; N. Choisne; C. Claudel-Renard; S. Cunnac; N. Demange; C. Gaspin; M. Lavie; A. Moisan; C. Robert; W. Saurin; T. Schiex; P. Siguier; P. Thébault; M. Whalen; P. Wincker; M. Levy; J. Weissenbach; C. A. Boucher

    2002-01-01

    Ralstonia solanacearum is a devastating, soil-borne plant pathogen with a global distribution and an unusually wide host range. It is a model system for the dissection of molecular determinants governing pathogenicity. We present here the complete genome sequence and its analysis of strain GMI1000. The 5.8-megabase (Mb) genome is organized into two replicons: a 3.7-Mb chromosome and a 2.1-Mb megaplasmid.

  17. The Bacillus subtilis genome sequence: the molecular blueprint of a soil bacterium

    Microsoft Academic Search

    Anil Wipat; Colin R Harwood

    1999-01-01

    The rate at which entire microbial genomes are being sequenced has accelerated rapidly over the past two years, promising to revolutionise our understanding of microbial molecular biology and genetics. The Bacillus subtilis genome sequence is the first complete genome of a free-living soil and rhizosphere bacterium. Data derived from the genome sequence and the systematic functional analysis programme, together with

  18. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    E-print Network

    Olmstead, Richard

    The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae) ... complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade ... and shotgun sequencing to 8× depth coverage to obtain the complete chloroplast genome sequence. The genome

  19. Analysis of Complete Genome Sequences of Human Rhinovirus

    PubMed Central

    Palmenberg, Ann C.; Rathe, Jennifer A.; Liggett, Stephen B.

    2010-01-01

    Human Rhinovirus (HRV) infection is the cause of about one-half of asthma and COPD exacerbations. With >100 serotypes in the HRV reference set an effort was undertaken to sequence their complete genomes so as to understand diversity, structural variation, and evolution of the virus. Analysis revealed conserved motifs, hypervariable regions, a potential fourth HRV species, within-serotype variation in field isolates, a non-scanning internal ribosome entry site, and evidence for HRV recombination. Techniques have now been developed using next generation sequencing to generate complete genomes from patient isolates with high throughput, deep coverage, and low costs. Thus relationships can now be sought between obstructive lung phenotypes and variation in HRV genomes in infected patients, and, potential novel therapeutic strategies developed based on HRV sequence. PMID:20471068

  20. The Mycoplasma conjunctivae genome sequencing, annotation and analysis

    PubMed Central

    Calderon-Copete, Sandra P; Wigger, George; Wunderlin, Christof; Schmidheini, Tobias; Frey, Joachim; Quail, Michael A; Falquet, Laurent

    2009-01-01

    Background The mollicute Mycoplasma conjunctivae is the etiological agent leading to infectious keratoconjunctivitis (IKC) in domestic sheep and wild caprinae. Although this pathogen is relatively benign for domestic animals treated by antibiotics, it can lead wild animals to blindness and death. This is a major cause of death in the protected species in the Alps (e.g., Capra ibex, Rupicapra rupicapra). Methods The genome was sequenced using a combined technique of GS-FLX (454) and Sanger sequencing, and annotated by an automatic pipeline that we designed using several tools interconnected via PERL scripts. The resulting annotations are stored in a MySQL database. Results The annotated sequence is deposited in the EMBL database (FM864216) and uploaded into the mollicutes database MolliGen allowing for comparative genomics. Conclusion We show that our automatic pipeline allows for annotating a complete mycoplasma genome and present several examples of analysis in search for biological targets (e.g., pathogenic proteins). PMID:19534756

  1. Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes

    PubMed Central

    Hazkani-Covo, Einat; Zeller, Raymond M.; Martin, William

    2010-01-01

    The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary time. In the laboratory and in nature, numts enter the nuclear DNA via non-homologous end joining (NHEJ) at double-strand breaks (DSBs). The frequency of numt insertions among 85 sequenced eukaryotic genomes reveals that numt content is strongly correlated with genome size, suggesting that the numt insertion rate might be limited by DSB frequency. Polymorphic numts in humans link maternally inherited mitochondrial genotypes to nuclear DNA haplotypes during the past, offering new opportunities to associate nuclear markers with mitochondrial markers back in time. PMID:20168995

  2. Complete genomic sequence of Pasteurella multocida, Pm70

    PubMed Central

    May, Barbara J.; Zhang, Qing; Li, Ling Ling; Paustian, Michael L.; Whittam, Thomas S.; Kapur, Vivek

    2001-01-01

    We present here the complete genome sequence of a common avian clone of Pasteurella multocida, Pm70. The genome of Pm70 is a single circular chromosome 2,257,487 base pairs in length and contains 2,014 predicted coding regions, 6 ribosomal RNA operons, and 57 tRNAs. Genome-scale evolutionary analyses based on pairwise comparisons of 1,197 orthologous sequences between P. multocida, Haemophilus influenzae, and Escherichia coli suggest that P. multocida and H. influenzae diverged ~270 million years ago and the γ subdivision of the proteobacteria radiated about 680 million years ago. Two previously undescribed open reading frames, accounting for ~1% of the genome, encode large proteins with homology to the virulence-associated filamentous hemagglutinin of Bordetella pertussis. Consistent with the critical role of iron in the survival of many microbial pathogens, in silico and whole-genome microarray analyses identified more than 50 Pm70 genes with a potential role in iron acquisition and metabolism. Overall, the complete genomic sequence and preliminary functional analyses provide a foundation for future research into the mechanisms of pathogenesis and host specificity of this important multispecies pathogen. PMID:11248100

  3. Genome sequence of the human malaria parasite Plasmodium falciparum

    PubMed Central

    Gardner, Malcolm J.; Hall, Neil; Fung, Eula; White, Owen; Berriman, Matthew; Hyman, Richard W.; Carlton, Jane M.; Pain, Arnab; Nelson, Karen E.; Bowman, Sharen; Paulsen, Ian T.; James, Keith; Eisen, Jonathan A.; Rutherford, Kim; Salzberg, Steven L.; Craig, Alister; Kyes, Sue; Chan, Man-Suen; Nene, Vishvanath; Shallom, Shamira J.; Suh, Bernard; Peterson, Jeremy; Angiuoli, Sam; Pertea, Mihaela; Allen, Jonathan; Selengut, Jeremy; Haft, Daniel; Mather, Michael W.; Vaidya, Akhil B.; Martin, David M. A.; Fairlamb, Alan H.; Fraunholz, Martin J.; Roos, David S.; Ralph, Stuart A.; McFadden, Geoffrey I.; Cummings, Leda M.; Subramanian, G. Mani; Mungall, Chris; Venter, J. Craig; Carucci, Daniel J.; Hoffman, Stephen L.; Newbold, Chris; Davis, Ronald W.; Fraser, Claire M.; Barrell, Bart

    2013-01-01

    The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host–parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria. PMID:12368864

  4. The minimum information about a genome sequence (MIGS) specification

    PubMed Central

    Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; dePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; Gil, Ingio San; Wilson, Gareth; Wipat, Anil

    2008-01-01

    With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases. PMID:18464787

  5. Complete genome sequence of Haliscomenobacter hydrossis type strain (OT)

    SciTech Connect

    Daligault, Hajnalka E. [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Zeytun, Ahmet [Los Alamos National Laboratory (LANL); Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Verbarg, Susanne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Haliscomenobacter hydrossis van Veen et al. 1973 is the type species of the genus Haliscomenobacter, which belongs to order 'Sphingobacteriales'. The species is of interest because of its isolated phylogenetic location in the tree of life, especially the so far genomically uncharted part of it, and because the organism grows in a thin, hardly visible hyaline sheath. Members of the species were isolated from fresh water of lakes and from ditch water. The genome of H. hydrossis is the first completed genome sequence reported from a member of the family 'Saprospiraceae'. The 8,771,651 bp long genome with its three plasmids of 92 kbp, 144 kbp and 164 kbp length contains 6,848 protein-coding and 60 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  6. Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome

    Microsoft Academic Search

    Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

    2002-01-01

    Background It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D.

  7. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    PubMed Central

    2011-01-01

    Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930

  8. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    PubMed Central

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  9. Genome sequence and analysis of the tuber crop potato.

    PubMed

    Xu, Xun; Pan, Shengkai; Cheng, Shifeng; Zhang, Bo; Mu, Desheng; Ni, Peixiang; Zhang, Gengyun; Yang, Shuang; Li, Ruiqiang; Wang, Jun; Orjeda, Gisella; Guzman, Frank; Torres, Michael; Lozano, Roberto; Ponce, Olga; Martinez, Diana; De la Cruz, Germán; Chakrabarti, S K; Patil, Virupaksh U; Skryabin, Konstantin G; Kuznetsov, Boris B; Ravin, Nikolai V; Kolganova, Tatjana V; Beletsky, Alexey V; Mardanov, Andrei V; Di Genova, Alex; Bolser, Daniel M; Martin, David M A; Li, Guangcun; Yang, Yu; Kuang, Hanhui; Hu, Qun; Xiong, Xingyao; Bishop, Gerard J; Sagredo, Boris; Mejía, Nilo; Zagorski, Wlodzimierz; Gromadka, Robert; Gawor, Jan; Szczesny, Pawel; Huang, Sanwen; Zhang, Zhonghua; Liang, Chunbo; He, Jun; Li, Ying; He, Ying; Xu, Jianfei; Zhang, Youjun; Xie, Binyan; Du, Yongchen; Qu, Dongyu; Bonierbale, Merideth; Ghislain, Marc; Herrera, Maria del Rosario; Giuliano, Giovanni; Pietrella, Marco; Perrotta, Gaetano; Facella, Paolo; O'Brien, Kimberly; Feingold, Sergio E; Barreiro, Leandro E; Massa, Gabriela A; Diambra, Luis; Whitty, Brett R; Vaillancourt, Brieanne; Lin, Haining; Massa, Alicia N; Geoffroy, Michael; Lundback, Steven; DellaPenna, Dean; Buell, C Robin; Sharma, Sanjeev Kumar; Marshall, David F; Waugh, Robbie; Bryan, Glenn J; Destefanis, Marialaura; Nagy, Istvan; Milbourne, Dan; Thomson, Susan J; Fiers, Mark; Jacobs, Jeanne M E; Nielsen, Kåre L; Sønderkær, Mads; Iovene, Marina; Torres, Giovana A; Jiang, Jiming; Veilleux, Richard E; Bachem, Christian W B; de Boer, Jan; Borm, Theo; Kloosterman, Bjorn; van Eck, Herman; Datema, Erwin; Hekkert, Bas te Lintel; Goverse, Aska; van Ham, Roeland C H J; Visser, Richard G F

    2011-07-14

    Potato (Solanum tuberosum L.) is the world's most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop. PMID:21743474

  10. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    PubMed

    Rivarola, Maximo; Foster, Jeffrey T; Chan, Agnes P; Williams, Amber L; Rice, Danny W; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M J; Khouri, Hoda M; Beckstrom-Sternberg, Stephen M; Allan, Gerard J; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  11. The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.

    PubMed

    Guan, Xiaojing; Xu, Jun; Smith, Edward J

    2014-03-24

    Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as for determining its phylogenetic relatedness to other parrots/Psittaciformes. The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein and tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature unique to only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

  12. Genome sequence and description of Aeromicrobium massiliense sp. nov.

    PubMed Central

    Ramasamy, Dhamodharan; Kokcha, Sahare; Lagier, Jean-Christophe; Nguyen, Thi-Thien; Raoult, Didier

    2012-01-01

    Aeromicrobium massiliense strain JC14T sp. nov. is the type strain of Aeromicrobium massiliense sp. nov., a new species within the genus Aeromicrobium. This strain, whose genome is described here, was isolated from the fecal microbiota of an asymptomatic patient. Aeromicrobium massiliense is an aerobic rod-shaped gram-positive bacterium. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,322,119 bp long genome contains 3,296 protein-coding and 51 RNA genes. PMID:23408663

  13. Draft genome sequence of Arthrospira platensis C1 (PCC9438)

    PubMed Central

    Cheevadhanarak, Supapon; Paithoonrangsarid, Kalyanee; Prommeenate, Peerada; Kaewngam, Warunee; Musigkain, Apiluck; Tragoonrung, Somvong; Tabata, Satoshi; Kaneko, Takakazu; Chaijaruwanich, Jeerayut; Sangsrakru, Duangjai; Tangphatsornruang, Sithichoke; Chanprasert, Juntima; Tongsima, Sissades; Kusonmano, Kanthida; Jeamton, Wattana; Dulsawat, Sudarat; Klanchui, Amornpan; Vorapreeda, Tayvich; Chumchua, Vasunun; Khannapho, Chiraphan; Thammarongtham, Chinae; Plengvidhya, Vethachai; Subudhi, Sanjukta; Hongsthong, Apiradee; Ruengjitchatchawalya, Marasri; Meechai, Asawin; Senachak, Jittisak; Tanticharoen, Morakot

    2012-01-01

    Arthrospira platensis is a cyanobacterium that is extensively cultivated outdoors on a large commercial scale for consumption as a food for humans and animals. It can be grown in monoculture under highly alkaline conditions, making it attractive for industrial production. Here we describe the complete genome sequence of A. platensis C1 strain and its annotation. The A. platensis C1 genome contains 6,089,210 bp including 6,108 protein-coding genes and 45 RNA genes, and no plasmids. The genome information has been used for further comparative analysis, particularly of metabolic pathways, photosynthetic efficiency and barriers to gene transfer. PMID:22675597

  14. Global divergence of microbial genome sequences mediated by propagating fronts

    PubMed Central

    Vetsigian, Kalin; Goldenfeld, Nigel

    2005-01-01

    We model the competition between homologous recombination and point mutation in microbial genomes, and present evidence for two distinct phases, one uniform, the other genetically diverse. Depending on the specifics of homologous recombination, we find that global sequence divergence can be mediated by fronts propagating along the genome, whose characteristic signature on genome structure is elucidated, and apparently observed in closely related Bacillus strains. Front propagation provides an emergent, generic mechanism for microbial “speciation,” and suggests a classification of microorganisms on the basis of their propensity to support propagating fronts. PMID:15878987

  15. LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report

    SciTech Connect

    Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

    2010-01-26

    Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

  16. Characteristics of cloned repeated DNA sequences in the barley genome

    SciTech Connect

    Anan'ev, E.V.; Bochkanov, S.S.; Ryzhik, M.V.; Sonina, N.V.; Chernyshev, A.I.; Shchipkova, N.I.; Yakovleva, E.Yu.

    1986-12-01

    A partial clone library of barley DNA fragments based on plasmid pBR325 was created. The cloned EcoRI-fragments of chromosomal DNA are from 2 to 14 kbp in length. More than 95% of the barley DNA inserts comprise repeated sequences of different complexity and copy number. Certain of these DNA sequences are from families comprising at least 1% of the barley genome. A significant proportion of the clones hybridize with numerous sets of restriction fragments of genome DNA and they are dispersed throughout the barley chromosomes.

  17. Genome sequence of the stramenopile Blastocystis, a human anaerobic parasite

    Microsoft Academic Search

    Michaël Roussel; Benjamin Noel; Ivan Wawrzyniak; Corinne Da Silva; Marie Diogon; Eric Viscogliosi; Céline Brochier-Armanet; Arnaud Couloux; Julie Poulain; Béatrice Segurens; Véronique Anthouard; Catherine Texier; Nicolas Blot; Philippe Poirier; Geok Choo Ng; Kevin SW Tan; François Artiguenave; Olivier Jaillon; Jean-Marc Aury; Frédéric Delbac; Patrick Wincker; Christian P Vivarès; Hicham El Alaoui

    2011-01-01

    Background: Blastocystis is a highly prevalent anaerobic eukaryotic parasite of humans and animals that is associated with various gastrointestinal and extraintestinal disorders. Epidemiological studies have identified different subtypes but no one subtype has been definitively correlated with disease. Results: Here we report the 18.8 Mb genome sequence of a Blastocystis subtype 7 isolate, which is the smallest stramenopile genome sequenced to

  18. Complete Mitochondrial Genome Sequence of Lichtheimia ramosa (syn. Lichtheimia hongkongensis).

    PubMed

    Leung, Shui-Yee; Huang, Yi; Lau, Susanna K P; Woo, Patrick C Y

    2014-01-01

    We report the complete mitochondrial genome sequence of Lichtheimia ramosa (syn. Lichtheimia hongkongensis), the first complete mitochondrial DNA sequence of the genus Lichtheimia. This 31.8-kb mitochondrial genome encodes 11 subunits of respiratory chain complexes, 3 ATP synthase subunits, 25 tRNAs, and small and large rRNAs, with the gene order atp9-cox2-atp6-cox3-cox1-nad2-nad3-cob-nad1-nad6-nad5-nad4l-nad4-atp8. PMID:24994796

  19. Establishing a framework for comparative analysis of genome sequences

    SciTech Connect

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
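
    The idea of a "constrained column" in the abstract above can be illustrated with a minimal sketch. This is not the paper's Prolog toolkit and it ignores the phylogenetic tree and parsimony analysis entirely; it only flags alignment columns whose residues vary yet all fall into one property class. The function name, the example alignment, and the residue classes below are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# ASSUMPTION: coarse, illustrative residue classes; not taken from the paper.
PROPERTY_CLASSES: Dict[str, Set[str]] = {
    "hydrophobic": set("AVLIMFWC"),
    "positive": set("KRH"),
    "negative": set("DE"),
}

# Hypothetical toy protein alignment (rows are aligned sequences of equal length).
EXAMPLE_ALIGNMENT: List[str] = [
    "MKVLD",
    "MRILE",
    "MKLLD",
]


def constrained_columns(
    alignment: List[str], classes: Dict[str, Set[str]]
) -> List[Tuple[int, str]]:
    """Return (column index, class name) for columns that mutate within one class."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    hits: List[Tuple[int, str]] = []
    for col in range(width):
        residues = {row[col] for row in alignment}
        # Skip invariant columns (no mutation) and columns containing gaps.
        if len(residues) < 2 or "-" in residues:
            continue
        for name, members in classes.items():
            if residues <= members:  # every observed residue shares the property
                hits.append((col, name))
                break
    return hits


if __name__ == "__main__":
    for col, name in constrained_columns(EXAMPLE_ALIGNMENT, PROPERTY_CLASSES):
        print(f"column {col}: residues vary but stay {name}")
```

    In the toy alignment, columns 0 and 3 are invariant and are skipped, while columns 1, 2 and 4 change residue but keep a common property.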

  20. Draft genome sequence of the Tibetan antelope

    PubMed Central

    Ge, Ri-Li; Cai, Qingle; Shen, Yong-Yi; San, A; Ma, Lan; Zhang, Yong; Yi, Xin; Chen, Yan; Yang, Lingfeng; Huang, Ying; He, Rongjun; Hui, Yuanyuan; Hao, Meirong; Li, Yue; Wang, Bo; Ou, Xiaohua; Xu, Jiaohui; Zhang, Yongfen; Wu, Kui; Geng, Chunyu; Zhou, Weiping; Zhou, Taicheng; Irwin, David M.; Yang, Yingzhong; Ying, Liu; Bao, Haihua; Kim, Jaebum; Larkin, Denis M.; Ma, Jian; Lewin, Harris A.; Xing, Jinchuan; Platt, Roy N.; Ray, David A.; Auvil, Loretta; Capitanu, Boris; Zhang, Xiufeng; Zhang, Guojie; Murphy, Robert W.; Wang, Jun; Zhang, Ya-Ping; Wang, Jian

    2013-01-01

    The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika, and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation. PMID:23673643

  1. New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features

    PubMed Central

    Tapparel, Caroline; Junier, Thomas; Gerlach, Daniel; Cordey, Samuel; Van Belle, Sandra; Perrin, Luc; Zdobnov, Evgeny M; Kaiser, Laurent

    2007-01-01

    Background: Human rhinoviruses (HRV), the most frequent cause of respiratory infections, include 99 different serotypes segregating into two species, A and B. Rhinoviruses share extensive genomic sequence similarity with enteroviruses and both are part of the picornavirus family. Nevertheless they differ significantly at the phenotypic level. The lack of HRV full-length genome sequences and the absence of analysis comparing picornaviruses at the whole genome level limit our knowledge of the genomic features supporting these differences. Results: Here we report complete genome sequences of 12 HRV-A and HRV-B serotypes, more than doubling the current number of available HRV sequences. The whole-genome maximum-likelihood phylogenetic analysis suggests that HRV-B and human enteroviruses (HEV) diverged from the last common ancestor after their separation from HRV-A. On the other hand, compared to HEV, HRV-B are more related to HRV-A in the capsid and 3B-C regions. We also identified the presence of a 2C cis-acting replication element (cre) in HRV-B that is not present in HRV-A, and that had been previously characterized only in HEV. In contrast to HEV viruses, HRV-A and HRV-B share also markedly lower GC content along the whole genome length. Conclusion: Our findings provide basis to speculate about both the biological similarities and the differences (e.g. tissue tropism, temperature adaptation or acid lability) of these three groups of viruses. PMID:17623054

  2. Rosaceaous Genome Sequencing: Perspectives and Progress

    Microsoft Academic Search

    Bryon Sosinski; Vladimir Shulaev; Amit Dhingra; Ananth Kalyanaraman; Roger Bumgarner; Daniel Rokhsar; Ignazio Verde; Riccardo Velasco; Albert G. Abbott

    The long-term goal of plant genomics is to identify, isolate and determine the function of plant genes that are associated with both vegetative and reproductive phenotypes. Most phenotypes require the coordinated activity and regulatory control of suites of genes over time and in precise positions within the plant. Until recently, the idea of establishing a comprehensive approach to isolate and

  3. Sequence modelling and an extensible data model for genomic database

    SciTech Connect

    Li, Peter Wei-Der (California Univ., San Francisco, CA (United States) Lawrence Berkeley Lab., CA (United States))

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

  5. Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing

    SciTech Connect

    Nierman, William C.

    2000-02-14

    At TIGR, the human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers a high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.
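
    The yield and coverage figures quoted in the report above can be cross-checked with a short calculation. The sketch below is only a consistency check under one stated assumption: a human genome size of 3,000 Mb, which the report does not give but which is the value implied by 141 Mb being about 4.7% of the genome. The function and constant names are hypothetical.

```python
# Figures quoted in the report above.
TOTAL_BES_BASES_MB = 141.0   # total end-sequence yield, in megabases
READ_COUNT = 300_000         # lower bound on the number of end sequences (">300,000")
MEAN_READ_LENGTH_BP = 460    # approximate average read length, in base pairs

# ASSUMPTION: genome size of 3,000 Mb; this value is not stated in the report.
ASSUMED_GENOME_SIZE_MB = 3000.0


def yield_mb(read_count: int, mean_read_length_bp: float) -> float:
    """Total sequenced bases, in megabases, for a given read count and mean length."""
    return read_count * mean_read_length_bp / 1_000_000


def genome_fraction(total_mb: float, genome_mb: float) -> float:
    """Raw sequence yield expressed as a fraction of the genome size."""
    return total_mb / genome_mb


# Exactly 300,000 reads of 460 bp give a lower bound on the yield.
lower_bound_mb = yield_mb(READ_COUNT, MEAN_READ_LENGTH_BP)
fraction = genome_fraction(TOTAL_BES_BASES_MB, ASSUMED_GENOME_SIZE_MB)

if __name__ == "__main__":
    print(f"lower-bound yield: {lower_bound_mb:.1f} Mb")
    print(f"fraction of genome: {fraction:.1%}")
```

    The lower bound of 138 Mb sits just under the reported 141 Mb, as expected when the read count is somewhat above 300,000, and 141 Mb over the assumed 3,000 Mb genome reproduces the reported ~4.7%.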

  6. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Microsoft Academic Search

    Peng Xu; Jiongtang Li; Yan Li; Runzi Cui; Jintu Wang; Jian Wang; Yan Zhang; Zixia Zhao; Xiaowen Sun

    2011-01-01

    Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration,

  7. Genome sequencing highlights the dynamic early history of dogs.

    PubMed

    Freedman, Adam H; Gronau, Ilan; Schweizer, Rena M; Ortega-Del Vecchyo, Diego; Han, Eunjung; Silva, Pedro M; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Beale, Holly; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R; Parker, Heidi G; Lee, Clarence; Tadigotla, Vasisht; Wilton, Alan; Siepel, Adam; Bustamante, Carlos D; Harkins, Timothy T; Nelson, Stanley F; Ostrander, Elaine A; Marques-Bonet, Tomas; Wayne, Robert K; Novembre, John

    2014-01-01

    To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11-16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary. PMID:24453982

  8. SEQUENCE AND COMPARATIVE ANALYSIS OF THE CHICKEN GENOME PROVIDE UNIQUE PERSPECTIVES ON VERTEBRATE EVOLUTION.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence ...

  9. Draft Genome Sequence of “Terrisporobacter othiniensis” Isolated from a Blood Culture from a Human Patient

    PubMed Central

    Lund, Lars Christian; Sydenham, Thomas Vognbjerg; Høgh, Silje Vermedal; Skov, Marianne; Kemp, Michael

    2015-01-01

    “Terrisporobacter othiniensis” (proposed species) was isolated from a blood culture. Genomic DNA was sequenced using a MiSeq benchtop sequencer (Illumina) and assembled using the SPAdes genome assembler. This resulted in a draft genome sequence comprising 3,980,019 bp in 167 contigs containing 3,449 coding sequences, 7 rRNAs, and 58 tRNAs. PMID:25744994

  10. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae.

    PubMed

    Guisinger, Mary M; Chumley, Timothy W; Kuehl, Jennifer V; Boore, Jeffrey L; Jansen, Robert K

    2010-02-01

    Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes. PMID:20091301

  11. Complete genome sequence of Allochromatium vinosum DSM 180T

    SciTech Connect

    Weissgerber, Thomas [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Zigann, Renate [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Bruce, David [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Land, Miriam L [ORNL; Munk, Christine [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Dahl, Christiane [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany

    2011-01-01

    Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium contains currently five species. All members were isolated from freshwater, brackish water or marine habitats and are predominately obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program.

  12. Selective enrichment of damaged DNA molecules for ancient genome sequencing.

    PubMed

    Gansauge, Marie-Theres; Meyer, Matthias

    2014-09-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA--the presence of deoxyuracils--for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ~10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  13. Easy quantitative assessment of genome editing by sequence trace decomposition

    PubMed Central

    Brinkman, Eva K.; Chen, Tao; Amendola, Mario; van Steensel, Bas

    2014-01-01

    The efficacy and the mutation spectrum of genome editing methods can vary substantially depending on the targeted sequence. A simple, quick assay to accurately characterize and quantify the induced mutations is therefore needed. Here we present TIDE, a method for this purpose that requires only a pair of PCR reactions and two standard capillary sequencing runs. The sequence traces are then analyzed by a specially developed decomposition algorithm that identifies the major induced mutations in the projected editing site and accurately determines their frequency in a cell population. This method is cost-effective and quick, and it provides much more detailed information than current enzyme-based assays. An interactive web tool for automated decomposition of the sequence traces is available. TIDE greatly facilitates the testing and rational design of genome editing strategies. PMID:25300484

  14. Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

    PubMed

    Grativol, Clícia; Regulski, Michael; Bertalan, Marcelo; McCombie, W Richard; da Silva, Felipe Rodrigues; Zerlotini Neto, Adhemar; Vicentini, Renato; Farinelli, Laurent; Hemerly, Adriana Silva; Martienssen, Robert A; Ferreira, Paulo Cavalcanti Gomes

    2014-07-01

    Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to the gene-enrichment strategy, we have compared assemblies using methyl-filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA-seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single-nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop. PMID:24773339

  15. AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics

    Cancer.gov

    The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

  16. Detection of Rare Genomic Variants from Pooled Sequencing Using SPLINTER

    PubMed Central

    Vallania, Francesco; Ramos, Enrique; Cresci, Sharon; Mitra, Robi D.; Druley, Todd E.

    2012-01-01

    As DNA sequencing technology has markedly advanced in recent years [2], it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought [3]. In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease [4,5]. Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants [6-8]. However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remains prohibitive for many investigators. To address this need, we have developed a pooled sequencing approach [1,9] and a novel software package [1] for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings to traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals.
    Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (http://www.ibridgenetwork.org/wustl/splinter). We show a comparison between pooled sequencing of 947 individuals, all of whom also underwent genome-wide array, at over 20 kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample was excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA. PMID:22760212

  17. Initial SARS coronavirus genome sequence analysis using a bioinformatics platform

    Microsoft Academic Search

    Hong Luo; Jingchu Luo

    2004-01-01

    A dedicated anti-SARS bioinformatics web site was set up in April 2003 at the Centre of Bioinformatics (CBI), Peking University (http://antisars.cbi.pku.edu.cn/). A special bioinformatics platform was constructed to analyse the sequence and structure data of SARS coronavirus and other viruses. A total file of 32 SARS coronavirus genome sequences was retrieved from GenBank and mismatches in 30 sites were revealed from

  18. Drosophila Genomic Sequence Annotation Using the BLOCKS+ Database

    Microsoft Academic Search

    Jorja G. Henikoff; Steven Henikoff

    2008-01-01

    A simple and general homology-based method for gene finding was applied to the 2.9-Mb Drosophila melanogaster Adh region, the target sequence of the Genome Annotation Assessment Project (GASP). Each strand of the entire sequence was used as query of the BLOCKS+ database of conserved regions of proteins. This led to functional assignments for more than one-third of the genes and

  19. The Complete Genome Sequence of Escherichia coli K-12

    Microsoft Academic Search

    Frederick R. Blattner; Guy Plunkett III; Craig A. Bloch; Nicole T. Perna; Valerie Burland; Monica Riley; Julio Collado-Vides; Jeremy D. Glasner; Christopher K. Rode; George F. Mayhew; Jason Gregor; Nelson Wayne Davis; Heather A. Kirkpatrick; Michael A. Goeden; Debra J. Rose; Bob Mau; Ying Shao

    2007-01-01

    The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome

  20. The complete genome sequence of the gastric pathogen Helicobacter pylori

    Microsoft Academic Search

    Jean-F. Tomb; Owen White; Anthony R. Kerlavage; Rebecca A. Clayton; Granger G. Sutton; Robert D. Fleischmann; Karen A. Ketchum; Hans Peter Klenk; Steven Gill; Brian A. Dougherty; Karen Nelson; John Quackenbush; Lixin Zhou; Ewen F. Kirkness; Scott Peterson; Brendan Loftus; Delwood Richardson; Robert Dodson; Hanif G. Khalak; Anna Glodek; Keith McKenney; Lisa M. Fitzegerald; Norman Lee; Mark D. Adams; Erin K. Hickey; Douglas E. Berg; Jeanine D. Gocayne; Teresa R. Utterback; Jeremy D. Peterson; Jenny M. Kelley; Matthew D. Cotton; Janice M. Weidman; Claire Fujii; Cheryl Bowman; Larry Watthey; Erik Wallin; William S. Hayes; Mark Borodovsky; Peter D. Karp; Hamilton O. Smith; Claire M. Fraser; J. Craig Venter

    1997-01-01

    Helicobacter pylori, strain 26695, has a circular genome of 1,667,867 base pairs and 1,590 predicted coding sequences. Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification. Many putative adhesins, lipoproteins and other outer membrane proteins were identified, underscoring the potential complexity of host-pathogen interaction. Based on the large number

  1. Sequence and organization of the human mitochondrial genome

    Microsoft Academic Search

    S. Anderson; A. T. Bankier; B. G. Barrell; M. H. L. de Bruijn; A. R. Coulson; J. Drouin; I. C. Eperon; D. P. Nierlich; B. A. Roe; F. Sanger; P. H. Schreier; A. J. H. Smith; R. Staden; I. G. Young

    1981-01-01

    The complete sequence of the 16,569-base pair human mitochondrial genome is presented. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and eight other predicted protein coding genes have been located. The sequence shows extreme economy in that the genes have none or only a few

  2. Alastrim Smallpox Variola Minor Virus Genome DNA Sequences

    Microsoft Academic Search

    Sergei N. Shchelkunov; Alexei V. Totmenin; Vladimir N. Loparev; Pavel F. Safronov; Valery V. Gutorov; Vladimir E. Chizhikov; Janice C. Knight; Joseph M. Parsons; Robert F. Massung; Joseph J. Esposito

    2000-01-01

    Alastrim variola minor virus, which causes mild smallpox, was first recognized in Florida and South America in the late 19th century. Genome linear double-stranded DNA sequences (186,986 bp) of the alastrim virus Garcia-1966, a laboratory reference strain from an outbreak associated with 0.8% case fatalities in Brazil in 1966, were determined except for a 530-bp fragment of hairpin-loop sequences at

  3. Draft genome sequence of Bacillus endophyticus 2102.

    PubMed

    Lee, Yong-Jik; Lee, Sang-Jae; Kim, Sun Hong; Lee, Sang Jun; Kim, Byoung-Chan; Lee, Han-Seung; Jeong, Haeyoung; Lee, Dong-Woo

    2012-10-01

    Bacillus endophyticus 2102 is an endospore-forming, plant growth-promoting rhizobacterium isolated from a hypersaline pond in South Korea. Here we present the draft sequence of B. endophyticus 2102, which is of interest because of its potential use in the industrial production of algaecides and bioplastics and for the treatment of industrial textile effluents. PMID:23012284

  4. Plant DNA barcoding using chloroplast genome sequences

    Microsoft Academic Search

    Catherine J Nock; Daniel LE Waters; Mervyn Shepherd; Peter C Bundock; Robert J Henry

    2011-01-01

    Chloroplast DNA sequence data have played a critical role in the development of plant DNA barcodes. While the mitochondrial locus CO1 is well accepted as an efficient DNA barcode for animals, no single locus has been identified that can discriminate between all plant species. There has been considerable debate about the selection of the most suitable chloroplast loci, and the

  6. Overview of PSB track on gene structure identification in large-scale genomic sequence

    SciTech Connect

    Uberbacher, E.C.; Xu, Y.

    1998-12-31

    The recent funding of more than a dozen major genome centers to begin community-wide high-throughput sequencing of the human genome has created a significant new challenge for the computational analysis of DNA sequence and the prediction of gene structure and function. It has been estimated that on average from 1996 to 2003, approximately 2 million bases of newly finished DNA sequence will be produced every day and be made available on the Internet and in central databases. The finished (fully assembled) sequence generated each day will represent approximately 75 new genes (and their respective proteins), and many times this number will be represented in partially completed sequences. The information contained in these is of immeasurable value to medical research, biotechnology, the pharmaceutical industry and researchers in a host of fields ranging from microorganism metabolism, to structural biology, to bioremediation. Sequencing of microorganisms and other model organisms is also ramping up at a very rapid rate. The genomes for yeast and several microorganisms such as H. influenzae have recently been fully sequenced, although the significance of many genes remains to be determined.

  7. Genome Sequence of the Lager Brewing Yeast, an Interspecies Hybrid

    PubMed Central

    Nakao, Yoshihiro; Kanamori, Takeshi; Itoh, Takehiko; Kodama, Yukiko; Rainieri, Sandra; Nakamura, Norihisa; Shimonaga, Tomoko; Hattori, Masahira; Ashikari, Toshihiko

    2009-01-01

    This work presents the genome sequencing of the lager brewing yeast (Saccharomyces pastorianus) Weihenstephan 34/70, a strain widely used in lager beer brewing. The 25 Mb genome comprises two nuclear sub-genomes originating from Saccharomyces cerevisiae and Saccharomyces bayanus and one circular mitochondrial genome originating from S. bayanus. Thirty-six different types of chromosomes were found including eight chromosomes with translocations between the two sub-genomes, whose breakpoints are within the orthologous open reading frames. Several gene loci responsible for typical lager brewing yeast characteristics such as maltotriose uptake and sulfite production have been increased in number by chromosomal rearrangements. Despite an overall high degree of conservation of the synteny with S. cerevisiae and S. bayanus, the syntenies were not well conserved in the sub-telomeric regions that contain lager brewing yeast characteristic and specific genes. Deletion of larger chromosomal regions, a massive unilateral decrease of the ribosomal DNA cluster and bilateral truncations of over 60 genes reflect a post-hybridization evolution process. Truncations and deletions of less efficient maltose and maltotriose uptake genes may indicate the result of adaptation to brewing. The genome sequence of this interspecies hybrid yeast provides a new tool for better understanding of lager brewing yeast behavior in industrial beer production. PMID:19261625

  8. Characterizing the citrus cultivar Carrizo genome through 454 shotgun sequencing.

    PubMed

    Belknap, William R; Wang, Yi; Huo, Naxin; Wu, Jiajie; Rockhold, David R; Gu, Yong Q; Stover, Ed

    2011-12-01

    The citrus cultivar Carrizo is the single most important rootstock to the US citrus industry and has resistance or tolerance to a number of major citrus diseases, including citrus tristeza virus, foot rot, and Huanglongbing (HLB, citrus greening). A Carrizo genomic sequence database providing approximately 3.5× genome coverage (haploid genome size approximately 367 Mb) was populated through 454 GS FLX shotgun sequencing. Analysis of the repetitive DNA fraction indicated a total interspersed repeat fraction of 36.5%. Assembly and characterization of abundant citrus Ty3/gypsy elements revealed a novel type of element containing open reading frames encoding a viral RNA-silencing suppressor protein (RNA binding protein, rbp) and a plant cytokinin riboside 5′-monophosphate phosphoribohydrolase-related protein (LONELY GUY, log). Similar gypsy elements were identified in the Populus trichocarpa genome. Gene-coding region analysis indicated that 24.4% of the nonrepetitive reads contained genic regions. The depth of genome coverage was sufficient to allow accurate assembly of constituent genes, including a putative phloem-expressed gene. The development of the Carrizo database (http://citrus.pw.usda.gov/) will contribute to characterization of agronomically significant loci and provide a publicly available genomic resource to the citrus research community. PMID:22133378
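
    As a rough check on the figures in this abstract, mean depth of coverage is the total number of sequenced bases divided by the haploid genome size, so about 3.5× coverage of a roughly 367 Mb genome implies on the order of 1.28 Gb of read data. The Python sketch below works through that arithmetic. The function names and the 400-bp mean read length are illustrative assumptions, not values reported in the abstract.

```python
def bases_for_coverage(coverage: float, genome_size_bp: int) -> float:
    """Total sequenced bases implied by a mean depth of coverage."""
    return coverage * genome_size_bp


def reads_for_coverage(coverage: float, genome_size_bp: int,
                       mean_read_len_bp: float) -> float:
    """Approximate read count for a given coverage and mean read length."""
    if mean_read_len_bp <= 0:
        raise ValueError("mean read length must be positive")
    return bases_for_coverage(coverage, genome_size_bp) / mean_read_len_bp


GENOME_SIZE_BP = 367_000_000   # haploid genome size quoted in the abstract
COVERAGE = 3.5                 # approximate coverage quoted in the abstract
ASSUMED_READ_LEN_BP = 400      # hypothetical mean read length, for illustration only

total_bases = bases_for_coverage(COVERAGE, GENOME_SIZE_BP)
approx_reads = reads_for_coverage(COVERAGE, GENOME_SIZE_BP, ASSUMED_READ_LEN_BP)
print(f"{total_bases / 1e6:.1f} Mb of sequence, about {approx_reads:,.0f} reads")
```

    Changing the assumed read length only rescales the read count; the total number of bases depends on coverage and genome size alone.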

  9. Complete Genome Sequence of Bacillus megaterium Myophage Mater

    PubMed Central

    Lancaster, Jacob C.; Hodde, Mary K.; Hernandez, Adriana C.

    2015-01-01

    Bacillus megaterium is a ubiquitous, soil inhabiting Gram-positive bacterium that is a common model organism and is used in industrial applications for protein production. The following reports the complete sequencing and annotation of the genome of B. megaterium myophage Mater and describes the major features identified. PMID:25593262

  10. Genome Sequence of Mycoplasma capricolum subsp. capripneumoniae Strain M1601

    PubMed Central

    Chu, Yuefeng; Gao, Pengchen; Zhao, Ping; He, Ying; Liao, Nancy; Jackman, Shaun; Zhao, Yongjun; Birol, Inanc; Duan, Xiaobo; Lu, Zhongxin

    2011-01-01

    Mycoplasma capricolum subsp. capripneumoniae is the causative agent of contagious caprine pleuropneumonia, a devastating disease of goats listed by the World Organization for Animal Health. Here we report the first complete genome sequence of this organism (strain M1601, a clinically isolated strain from China). PMID:21994928

  11. Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002

    PubMed Central

    Yuan, Kat (Xiaoli); Adam, Zaky; Tambong, James; Lévesque, C. André; Chen, Wen; Lewis, Christopher T.; De Boer, Solke H.

    2014-01-01

    Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features. PMID:24831134

  12. Genome Sequence of Enterotoxigenic Escherichia coli Strain B2C.

    PubMed

    Madhavan, T P Vipin; Steen, Jason A; Hugenholtz, Philip; Sakellaris, Harry

    2014-01-01

    Enterotoxigenic Escherichia coli (ETEC) is a major cause of diarrheal disease around the globe, causing an estimated 380,000 deaths annually. The disease is caused by a wide variety of strains. Here, we report the genome sequence of ETEC strain B2C, which was isolated from an American soldier in Vietnam. PMID:24723709

  13. Draft Genome Sequence of Bacillus subtilis strain KATMIRA1933.

    PubMed

    Karlyshev, Andrey V; Melnikov, Vyacheslav G; Chikindas, Michael L

    2014-01-01

    In this report, we present a draft sequence of Bacillus subtilis KATMIRA1933. Previous studies demonstrated probiotic properties of this strain partially attributed to production of an antibacterial compound, subtilosin. Comparative analysis of this strain's genome with that of a commercial probiotic strain, B. subtilis Natto, is presented. PMID:24948771

  14. SCALABLE MAPPING AND COMPRESSION OF HIGH THROUGHPUT GENOME SEQUENCING DATA

    E-print Network

    SCALABLE MAPPING AND COMPRESSION OF HIGH THROUGHPUT GENOME SEQUENCING DATA by Faraz Hach B Faculty of Applied Sciences © Faraz Hach 2013 SIMON FRASER UNIVERSITY Summer 2013 All rights reserved, particularly if cited appropriately. APPROVAL Name: Faraz Hach Degree: Doctor of Philosophy Title of Thesis

  15. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome

    Microsoft Academic Search

    Timothy J. Ley; Li Ding; Bob Fulton; Michael D. McLellan; Ken Chen; David Dooling; Brian H. Dunford-Shore; Sean McGrath; Matthew Hickenbotham; Lisa Cook; Rachel Abbott; David E. Larson; Dan C. Koboldt; Craig Pohl; Scott Smith; Amy Hawkins; Scott Abbott; Devin Locke; LaDeana W. Hillier; Tracie Miner; Lucinda Fulton; Vincent Magrini; Todd Wylie; Jarret Glasscock; Joshua Conyers; Nathan Sander; Xiaoqi Shi; John R. Osborne; Patrick Minx; David Gordon; Asif Chinwalla; Yu Zhao; Rhonda E. Ries; Jacqueline E. Payton; Peter Westervelt; Michael H. Tomasson; Mark Watson; Jack Baty; Jennifer Ivanovich; Sharon Heath; William D. Shannon; Rakesh Nagarajan; Matthew J. Walter; Daniel C. Link; Timothy A. Graubert; John F. DiPersio; Richard K. Wilson; Elaine R. Mardis

    2008-01-01

    Acute myeloid leukaemia is a highly malignant haematopoietic tumour that affects about 13,000 adults in the United States each year. The treatment of this disease has changed little in the past two decades, because most of the genetic events that initiate the disease remain undiscovered. Whole-genome sequencing is now possible at a reasonable cost and timeframe to use this approach

  16. Draft Genome Sequence of Mycobacterium farcinogenes NCTC 10955

    PubMed Central

    Croce, Olivier; Robert, Catherine; Raoult, Didier

    2014-01-01

    We report the draft genome sequence of Mycobacterium farcinogenes NCTC 10955 (=DSM 43637T), a nontuberculosis species responsible for bovine farcy. The strain described here is composed of 6,139,893 bp, with a G+C content of 65.73%, and contains 5,816 protein-coding genes and 76 RNA genes. PMID:24874688
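
    Genome announcements such as this one routinely quote a G+C content (65.73% here). As a minimal sketch of how such a figure can be computed from a sequence string, the Python function below counts G and C among unambiguous bases only. The function name and the choice to ignore ambiguity codes such as N are assumptions made for illustration, not the authors' pipeline.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous A/C/G/T bases (case-insensitive)."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy sequence: 4 of the 6 unambiguous bases are G or C.
print(f"{gc_content('GGCCNNAT'):.2f}%")
```

    For a multi-contig draft assembly, the same calculation would be applied to the concatenation of all contigs, or equivalently to summed per-contig base counts.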

  17. Genome Sequencing and Informatics: New Tools for Biochemical Discoveries

    Microsoft Academic Search

    Milton H. Saier

    1998-01-01

    During the past 3 years, we have experienced a major revolution in the biological sciences resulting from a tremendous flux of information generated by genome-sequencing efforts. Our understanding of microorganisms, the metabolic processes they catalyze, the genetic apparatuses encoding cellular proteinaceous constituents, and the pathological conditions caused by these organisms has greatly benefited from the availability of

  18. Sequencing of the Populus Rhizosphere The Plant Genome Group

    E-print Network

    Sequencing of the Populus Rhizosphere The Plant Genome Group Environmental Sciences Division Oak associated with the soil-root interface or rhizosphere influence the response of plants to fluctuations discovered in the leaves, roots, and stems of Populus. When viewed in total, a single large perennial plant

  19. Draft Genome Sequence of Buttiauxella agrestis, Isolated from Surface Water

    PubMed Central

    Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

    2014-01-01

    MI agar is routinely used for quantifying Escherichia coli in drinking water. A suspect E. coli colony isolated from a water sample was identified as Buttiauxella agrestis. The whole genome sequence of B. agrestis was determined to understand the genetic basis for its phenotypic resemblance to E. coli on MI agar. PMID:25323724

  20. Genome Sequence of an Alphabaculovirus Isolated from Choristoneura murinana

    PubMed Central

    Erlandson, Martin A.; Theilmann, David A.

    2014-01-01

    The genome sequence of a baculovirus from Choristoneura murinana is 124,689 bp, with a G+C content of 50%, and contains 148 putative open reading frames. The virus is a member of the group I alphabaculoviruses and is most closely related to several other viruses that infect Choristoneura species. PMID:24482509

  1. Tandem Clusters of Membrane Proteins in Complete Genome Sequences

    E-print Network

    Kihara, Daisuke

    of genes coding for membrane proteins was investigated in 16 complete genomes: 4 archaea, 11 bacteria similarity. Furthermore, the prediction of higher order structures may be utilized in order to compensate for the limitation of the sequence similarity search for functional identification. Aurora and Rose (1998) used

  2. Complete Genome Sequence of Anaplasma marginale subsp. centrale

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Anaplasma marginale subsp. centrale is a naturally attenuated subtype that has been used as a vaccine for a century. We sequenced the genome of this organism and compared it to those of virulent sensu stricto A. marginale strains. The comparison markedly narrows the number of outer membrane protein ...

  3. Complete Genome Sequence of Agrobacterium tumefaciens Ach5.

    PubMed

    Huang, Ya-Yi; Cho, Shu-Ting; Lo, Wen-Sui; Wang, Yi-Chieh; Lai, Erh-Min; Kuo, Chih-Horng

    2015-01-01

    Agrobacterium tumefaciens is a phytopathogenic bacterium that causes crown gall disease. The strain Ach5 was isolated from yarrow (Achillea ptarmica L.) and is the wild-type progenitor of other derived strains widely used for plant transformation. Here, we report the complete genome sequence of this bacterium. PMID:26044425

  4. Complete Genome Sequence of Biocontrol Strain Pseudomonas fluorescens LBUM 223

    PubMed Central

    Roquigny, Roxane; Arseneault, Tanya; Gadkar, Vijay J.; Novinscak, Amy

    2015-01-01

    Pseudomonas fluorescens LBUM 223 is a plant growth-promoting rhizobacterium (PGPR) with biocontrol activity against various plant pathogens. It produces the antimicrobial metabolite phenazine-1-carboxylic acid, which is involved in the biocontrol of Streptomyces scabies, the causal agent of common scab of potato. Here, we report the complete genome sequence of P. fluorescens LBUM 223. PMID:25953163

  5. Draft genome sequence of probiotic strain Lactobacillus rhamnosus R0011.

    PubMed

    Tompkins, Thomas A; Barreau, Guillaume; de Carvalho, Vanessa G

    2012-02-01

    Lactobacillus rhamnosus R0011 is a commercially available probiotic that is widely used in human dietary supplements and pharmaceutical products. We prepared a draft genome sequence consisting of 10 contigs totaling 2,900,620 bases and a G+C content of 46.7% for this strain. PMID:22275100

  6. Draft Genome Sequence of Alicyclobacillus acidoterrestris Strain ATCC 49025

    PubMed Central

    Pasvolsky, Ronit; Sela, Noa; Green, Stefan J.; Zakin, Varda

    2013-01-01

    Alicyclobacillus acidoterrestris is a spore-forming Gram-positive, thermo-acidophilic, nonpathogenic bacterium which contaminates commercial pasteurized fruit juices. The draft genome sequence for A. acidoterrestris strain ATCC 49025 is reported here, providing genetic data relevant to the successful adaptation and survival of this strain in its ecological niche. PMID:24009113

  7. Draft Genome Sequence of Alicyclobacillus acidoterrestris Strain ATCC 49025.

    PubMed

    Shemesh, Moshe; Pasvolsky, Ronit; Sela, Noa; Green, Stefan J; Zakin, Varda

    2013-01-01

    Alicyclobacillus acidoterrestris is a spore-forming Gram-positive, thermo-acidophilic, nonpathogenic bacterium which contaminates commercial pasteurized fruit juices. The draft genome sequence for A. acidoterrestris strain ATCC 49025 is reported here, providing genetic data relevant to the successful adaptation and survival of this strain in its ecological niche. PMID:24009113

  8. Complete Genome Sequence of Uropathogenic Escherichia coli Strain CI5.

    PubMed

    Mehershahi, Kurosh S; Abraham, Soman N; Chen, Swaine L

    2015-01-01

    Escherichia coli represents the primary etiological agent responsible for urinary tract infections, one of the most common infections in humans. We report here the complete genome sequence of uropathogenic Escherichia coli strain CI5, a clinical pyelonephritis isolate used for studying pathogenesis. PMID:26021932

  9. Complete Genome Sequence of Agrobacterium tumefaciens Ach5

    PubMed Central

    Huang, Ya-Yi; Cho, Shu-Ting; Lo, Wen-Sui; Wang, Yi-Chieh; Lai, Erh-Min

    2015-01-01

    Agrobacterium tumefaciens is a phytopathogenic bacterium that causes crown gall disease. The strain Ach5 was isolated from yarrow (Achillea ptarmica L.) and is the wild-type progenitor of other derived strains widely used for plant transformation. Here, we report the complete genome sequence of this bacterium. PMID:26044425

  10. Complete Genome Sequence of Bordetella pertussis D420

    PubMed Central

    Boinett, Christine J.; Harris, Simon R.; Langridge, Gemma C.; Trainor, Elizabeth A.; Merkel, Tod J.

    2015-01-01

    Bordetella pertussis is the causative agent of whooping cough, a highly contagious, acute respiratory illness that has seen resurgence despite the use of vaccines. We present the complete genome sequence of a clinical strain of B. pertussis, D420, which is representative of a currently circulating clade of this pathogen. PMID:26067980

  11. Complete Genome Sequence of Bordetella pertussis D420.

    PubMed

    Boinett, Christine J; Harris, Simon R; Langridge, Gemma C; Trainor, Elizabeth A; Merkel, Tod J; Parkhill, Julian

    2015-01-01

    Bordetella pertussis is the causative agent of whooping cough, a highly contagious, acute respiratory illness that has seen resurgence despite the use of vaccines. We present the complete genome sequence of a clinical strain of B. pertussis, D420, which is representative of a currently circulating clade of this pathogen. PMID:26067980

  12. Genome Sequence of Lactobacillus rhamnosus Strain CNCM I-3698

    PubMed Central

    Tareb, R.; Bernardeau, M.

    2015-01-01

    Lactobacillus rhamnosus CNCM I-3698 is a commercially available probiotic that is used in animal feed as an additive. Here, we announce the draft genome sequence for this strain, consisting of 71 contigs corresponding to 2,966,480 bp and a G+C content of 46.69%. PMID:26067954

  13. Comparison of genomic sequences using the Hamming distance

    Microsoft Academic Search

    Hildete Prisco Pinheiro; Aluísio de Souza Pinheiro; Pranab Kumar Sen

    2005-01-01

    The paper considers the problem of homogeneity among groups by comparison of genomic sequences. Some alternative procedures that attach less emphasis on the likelihood approach, and more on alternative measures that deal with similar homogeneity problems are considered here. On this approach, a one-sided hypothesis test is considered and the classical ANOVA decomposition can be directly adapted to sample measures
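
    To make the distance measure concrete: the Hamming distance between two aligned sequences of equal length is the number of positions at which they differ. The Python sketch below computes it, along with average within-group and between-group distances, which are the kind of quantities an ANOVA-style decomposition compares. The function names are hypothetical, and this illustrates the idea only; it is not the authors' test statistic.

```python
from itertools import combinations, product
from typing import Sequence


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_within(group: Sequence[str]) -> float:
    """Average Hamming distance over all unordered pairs inside one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        raise ValueError("need at least two sequences in the group")
    return sum(hamming_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group1: Sequence[str], group2: Sequence[str]) -> float:
    """Average Hamming distance over all pairs drawn one from each group."""
    pairs = list(product(group1, group2))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(hamming_distance(a, b) for a, b in pairs) / len(pairs)


# Toy aligned sequences, invented for illustration.
group_a = ["ACGTAC", "ACGTAA", "ACGTAC"]
group_b = ["TCGTGC", "TCGTGA", "TCGAGC"]
print(mean_within(group_a), mean_within(group_b), mean_between(group_a, group_b))
```

    A between-group average that is clearly larger than the within-group averages points toward heterogeneity among groups; assessing whether the difference is significant requires the formal test developed in the paper.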

  14. Time-dependent ARMA modeling of genomic sequences

    Microsoft Academic Search

    Jerzy S. Zielinski; Nidhal Bouaynaya; Dan Schonfeld; William O'neill

    2008-01-01

    BACKGROUND: Over the past decade, many investigators have used sophisticated time series tools for the analysis of genomic sequences. Specifically, the correlation of the nucleotide chain has been studied by examining the properties of the power spectrum. The main limitation of the power spectrum is that it is restricted to stationary time series. However, it has been observed over the

  15. Draft genome sequence of the mulberry tree Morus notabilis.

    PubMed

    He, Ningjia; Zhang, Chi; Qi, Xiwu; Zhao, Shancen; Tao, Yong; Yang, Guojun; Lee, Tae-Ho; Wang, Xiyin; Cai, Qingle; Li, Dong; Lu, Mengzhu; Liao, Sentai; Luo, Guoqing; He, Rongjun; Tan, Xu; Xu, Yunmin; Li, Tian; Zhao, Aichun; Jia, Ling; Fu, Qiang; Zeng, Qiwei; Gao, Chuan; Ma, Bi; Liang, Jiubo; Wang, Xiling; Shang, Jingzhe; Song, Penghua; Wu, Haiyang; Fan, Li; Wang, Qing; Shuai, Qin; Zhu, Juanjuan; Wei, Congjin; Zhu-Salzman, Keyan; Jin, Dianchuan; Wang, Jinpeng; Liu, Tao; Yu, Maode; Tang, Cuiming; Wang, Zhenjiang; Dai, Fanwei; Chen, Jiafei; Liu, Yan; Zhao, Shutang; Lin, Tianbao; Zhang, Shougong; Wang, Junyi; Wang, Jian; Yang, Huanming; Yang, Guangwei; Wang, Jun; Paterson, Andrew H; Xia, Qingyou; Ji, Dongfeng; Xiang, Zhonghuai

    2013-01-01

    Human utilization of the mulberry-silkworm interaction started at least 5,000 years ago and greatly influenced world history through the Silk Road. Complementing the silkworm genome sequence, here we describe the genome of a mulberry species Morus notabilis. In the 330-Mb genome assembly, we identify 128 Mb of repetitive sequences and 29,338 genes, 60.8% of which are supported by transcriptome sequencing. Mulberry gene sequences appear to evolve ~3 times faster than other Rosales, perhaps facilitating the species' spread worldwide. The mulberry tree is among a few eudicots but several Rosales that have not preserved genome duplications in more than 100 million years; however, a neopolyploid series found in the mulberry tree and several others suggest that new duplications may confer benefits. Five predicted mulberry miRNAs are found in the haemolymph and silk glands of the silkworm, suggesting interactions at molecular levels in the plant-herbivore relationship. The identification and analyses of mulberry genes involved in diversifying selection, resistance and protease inhibitor expressed in the laticifers will accelerate the improvement of mulberry plants. PMID:24048436

  16. Plant-Pathogen Interactions: From Genome Sequences to Genetic Networks

    Microsoft Academic Search

    Felipe Arredondo; Nathan Bruce; Marcus Chibucos; Daolong Dou; Lee Falin; Adriana Fereirra; Nik Galloway; Regina Hanlon; Rays Jiang; Shiv Kale; Konstantinos Krampis; Robert Presler; Brian Smith; Vignesh Sundararajan; Ken Tian; Trudy Torto-Alalibo; Sucheta Tripathy; Lachelle Waller; Xia Wang; Lecong Zhou

    Interconnected genetic regulatory networks govern the interactions of hosts and pathogens as a result of an ongoing co-evolutionary battle between the organisms. We are building data collections and tool sets for dissecting host-pathogen genetic networks, with a principal focus on oomycete pathogens of plants. To catalog the interacting genes we have sequenced the genomes of the oomycetes Phytophthora sojae, Phytophthora

  17. Genome Sequence of Porphyromonas gingivalis Strain HG66 (DSM 28984)

    PubMed Central

    Yoder-Himes, Deborah Ruth; Mizgalska, Danuta; Nguyen, Ky-Anh; Potempa, Jan; Olsen, Ingar

    2014-01-01

    Porphyromonas gingivalis is considered a major etiologic agent in adult periodontitis. Gingipains are among its most important virulence factors, but their release is unique in strain HG66. We present the genome sequence of HG66 with a single contig of 2,441,680 bp and a G+C content of 48.1%. PMID:25291768

  18. Genome sequence of the human malaria parasite Plasmodium falciparum

    E-print Network

    Arnold, Jonathan

    Genome sequence of the human malaria parasite Plasmodium falciparum. Malcolm J. Gardner, Neil Hall ... The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more ... of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted

  19. Genome Sequence of a Nicotine-Degrading Strain of Arthrobacter

    PubMed Central

    Yao, Yuxiang; Ren, Huixue; Yu, Hao; Wang, Lijuan

    2012-01-01

    We announce a 4.63-Mb genome assembly of an isolated bacterium that is the first sequenced nicotine-degrading Arthrobacter strain. Nicotine catabolism genes of the nicotine-degrading plasmid pAO1 were predicted, but plasmid function genes were not found. These results will help to better illustrate the molecular mechanism of nicotine degradation by Arthrobacter. PMID:23012289

  20. Complete Genome Sequence of Citrobacter freundii Myophage Moogle.

    PubMed

    Nguyen, Quynh T; Luna, Adrian J; Hernandez, Adriana C; Kuty Everett, Gabriel F

    2015-01-01

    Citrobacter freundii is an opportunistic pathogen that has been linked to nosocomial infections, such as brain abscesses and pneumonia. Further study on phages infecting C. freundii may provide therapeutics for these infections. Here, we announce the complete genome sequence of the FelixO1-like myophage Moogle and describe its features. PMID:25635026

  1. Complete genome sequence of a novel pararetrovirus isolated from soybean

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We report the complete genome sequence of Soybean Putnam pararetrovirus (SPPRV), a new pararetrovirus isolated from a soybean field in Putnam County, Ohio, USA. Comparison of SPPRV with other plant-infecting pararetroviruses places it in the genus Caulimovirus of the family Caulimoviridae....

  2. Complete Genome Sequence of Clinical Isolate Pantoea ananatis LMG 5342

    PubMed Central

    Chan, Wai Yin; Rezzonico, Fabio; Bühlmann, Andreas; Venter, Stephanus N.; Blom, Jochen; Goesmann, Alexander; Frey, Jürg E.; Smits, Theo H. M.; Duffy, Brion; Coutinho, Teresa A.

    2012-01-01

    The enterobacterium Pantoea ananatis is an ecologically versatile species. It has been found in the environment, as plant epiphyte and endophyte, as an emerging phytopathogen, and as a presumptive, opportunistic human pathogen. Here, we report the complete genome sequence of P. ananatis LMG 5342, isolated from a human wound. PMID:22374951

  3. Len Gen: The international lentil genome sequencing project

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have been sequencing CDC Redberry using NGS of paired-end and mate-pair libraries over a wide range of sizes and technologies. The most recent draft (v0.7) of approximately 150x coverage produced scaffolds covering over half the genome (2.7 Gb of the expected 4.3 Gb). Long reads from PacBio sequ...

  4. Complete Genome Sequence of the Haloalkaliphilic, Hydrogen Producing Halanaerobium hydrogenoformans

    SciTech Connect

    Brown, Steven D [ORNL; Begemann, Matthew B [University of Wisconsin, Madison; Mormile, Dr. Melanie R. [Missouri University of Science and Technology; Wall, Judy D. [University of Missouri; Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Samual [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Elias, Dwayne A [ORNL

    2011-01-01

    Halanaerobium hydrogenoformans is an alkaliphilic bacterium capable of biohydrogen production at pH 11 and 7% (w/v) salt. We present the 2.6 Mb genome sequence to provide insights into its physiology and potential for bioenergy applications.

  5. Draft Genome Sequence of the Cellulolytic Fungus Chaetomium globosum

    PubMed Central

    Ma, Li-Jun; Grabherr, Manfred; Birren, Bruce W.

    2015-01-01

    Chaetomium globosum is a filamentous fungus typically isolated from cellulosic substrates. This species also causes superficial infections of humans and, more rarely, can cause cerebral infections. Here, we report the genome sequence of C. globosum isolate CBS 148.51, which will facilitate the study and comparative analysis of this fungus. PMID:25720678

  6. Complete Genome Sequence of Uropathogenic Escherichia coli Strain CI5

    PubMed Central

    Mehershahi, Kurosh S.; Abraham, Soman N.

    2015-01-01

    Escherichia coli represents the primary etiological agent responsible for urinary tract infections, one of the most common infections in humans. We report here the complete genome sequence of uropathogenic Escherichia coli strain CI5, a clinical pyelonephritis isolate used for studying pathogenesis. PMID:26021932

  7. Complete Genome Sequence of Actinobaculum schaalii Strain CCUG 27420

    PubMed Central

    Kristiansen, Rikke; Dueholm, Morten S.; Bank, Steffen; Nielsen, Per Halkjær; Karst, Søren M.; Cattoir, Vincent; Lienhard, Reto; Grisold, Andrea J.; Olsen, Anne Buchhave; Reinhard, Mark; Søby, Karen Marie; Christensen, Jens Jørgen; Prag, Jørgen

    2014-01-01

    Complete genome sequencing of the emerging uropathogen Actinobaculum schaalii indicates that an important mechanism of its virulence is attachment pili, which allow the organism to adhere to the surface of animal cells, greatly enhancing the ability of this organism to colonize the urinary tract. PMID:25189588

  8. Complete Genome Sequence of Actinobaculum schaalii Strain CCUG 27420.

    PubMed

    Kristiansen, Rikke; Dueholm, Morten S; Bank, Steffen; Nielsen, Per Halkjær; Karst, Søren M; Cattoir, Vincent; Lienhard, Reto; Grisold, Andrea J; Olsen, Anne Buchhave; Reinhard, Mark; Søby, Karen Marie; Christensen, Jens Jørgen; Prag, Jørgen; Thomsen, Trine R

    2014-01-01

    Complete genome sequencing of the emerging uropathogen Actinobaculum schaalii indicates that an important mechanism of its virulence is attachment pili, which allow the organism to adhere to the surface of animal cells, greatly enhancing the ability of this organism to colonize the urinary tract. PMID:25189588

  9. Draft Genome Sequence of Bacillus megaterium Type Strain ATCC 14581.

    PubMed

    Arya, Gitanjali; Petronella, Nicholas; Crosthwait, Jennifer; Carrillo, Catherine D; Shwed, Philip S

    2014-01-01

    Bacillus megaterium is a Gram-positive, rod-shaped, spore-forming bacterium of biotechnological importance. Here, we report a 5.7-Mbp draft genome sequence of B. megaterium ATCC 14581, which is the type strain of the species. PMID:25395629

  10. Complete Genome Sequence of Bacillus megaterium Myophage Mater.

    PubMed

    Lancaster, Jacob C; Hodde, Mary K; Hernandez, Adriana C; Kuty Everett, Gabriel F

    2015-01-01

    Bacillus megaterium is a ubiquitous, soil inhabiting Gram-positive bacterium that is a common model organism and is used in industrial applications for protein production. The following reports the complete sequencing and annotation of the genome of B. megaterium myophage Mater and describes the major features identified. PMID:25593262

  11. Draft Genome Sequence of Bacillus megaterium Type Strain ATCC 14581

    PubMed Central

    Arya, Gitanjali; Petronella, Nicholas; Crosthwait, Jennifer; Carrillo, Catherine D.

    2014-01-01

    Bacillus megaterium is a Gram-positive, rod-shaped, spore-forming bacterium of biotechnological importance. Here, we report a 5.7-Mbp draft genome sequence of B. megaterium ATCC 14581, which is the type strain of the species. PMID:25395629

  12. Genome Sequences of Six Paenibacillus larvae Siphoviridae Phages

    PubMed Central

    Carson, Susan; Bruff, Emily; DeFoor, William; Dums, Jacob; Groth, Adam; Hatfield, Taylor; Iyer, Aruna; Joshi, Kalyani; McAdams, Sarah; Miles, Devon; Miller, Delanie; Oufkir, Abdoullah; Raynor, Brinkley; Riley, Sara; Roland, Shelby; Rozier, Horace; Talley, Sarah

    2015-01-01

    Six sequenced and annotated genomes of Paenibacillus larvae phages isolated from the combs of American foulbrood-diseased beehives are 37 to 45 kbp and have approximately 42% G+C content and 60 to 74 protein-coding genes. Phage Lily is most divergent from Diva, Rani, Redbud, Shelly, and Sitara. PMID:26089405

  13. Genome Sequences of Six Wheat-Infecting Fusarium Species Isolates

    PubMed Central

    Moolhuijzen, Paula M.; Manners, John M.; Wilcox, Stephen A.; Bellgard, Matthew I.

    2013-01-01

    Fusarium pathogens represent a major constraint to wheat and barley production worldwide. To facilitate future comparative studies of Fusarium species that are pathogenic to wheat, the genome sequences of four Fusarium pseudograminearum isolates, a single Fusarium acuminatum isolate, and an organism from the Fusarium incarnatum-F. equiseti species complex are reported. PMID:24009115

  14. Comparative Genome Sequence Analysis of Multidrug-Resistant Acinetobacter baumannii

    Microsoft Academic Search

    Mark D. Adams; Karrie Goglin; Neil Molyneaux; Kristine M. Hujer; Heather Lavender; Jennifer J. Jamison; Ian J. MacDonald; Kristienna M. Martin; Thomas Russo; Anthony A. Campagnari; Andrea M. Hujer; Robert A. Bonomo; Steven R. Gill

    2008-01-01

    The recent emergence of multidrug resistance (MDR) in Acinetobacter baumannii has raised concern in health care settings worldwide. In order to understand the repertoire of resistance determinants and their organization and origins, we compared the genome sequences of three MDR and three drug-susceptible A. baumannii isolates. The entire MDR phenotype can be explained by the acquisition of discrete resistance determinants

  15. Genomic Sequence Is Highly Predictive of Local Nucleosome Depletion

    E-print Network

    Yuan, Guo-Cheng "GC"

    ... Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N ... Recent genome-wide experiments have identified high resolution nucleosome positions in yeast [2 ... required for nucleosome packaging [16-19]. Also, certain short DNA sequences have been found

  16. FEMALE-SPECIFIC DNA SEQUENCES IN THE CHICKEN GENOME

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Female-specific regions in the chicken's genome were detected in silico. Eight fragments out of 21 that were in-silico W-specific, were shown to produce PCR products only in females. Some of these fragments gave a female-specific product in turkeys and peacocks. We sequenced all eight fragments in o...

  17. The dynamics of genome replication using deep sequencing

    PubMed Central

    Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.

    2014-01-01

    Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142

  18. A novel DNA sequence motif in human and mouse genomes

    PubMed Central

    Zhang, Shilu; Du, Fang; Ji, Hongkai

    2015-01-01

    We report a novel DNA sequence motif in human and mouse genomes. This motif has several interesting features indicating that it is highly likely to be an unknown functional sequence element. The motif is highly enriched in promoter regions. Locations of the motif sites in the genome have strong tendency to be clustered together. Motif sites are associated with increased phylogenetic conservation as well as elevated DNase I hypersensitivity (DHS) in ENCODE cell lines. Clustered motif sites are found in promoter regions of a substantial fraction of the protein-coding genes in the genome. All together, these indicate that the motif may have important functions associated with a large number of genes. PMID:25990515

  19. Complete genome sequence of Oceanithermus profundus type strain (506T)

    SciTech Connect

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Zhang, Xiaojing [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Ruhl, Alina [U.S. Department of Energy, Joint Genome Institute; Mwirichia, Romano [University of Munster, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Land, Miriam L [ORNL

    2011-01-01

    Oceanithermus profundus Miroshnichenko et al. 2003 is the type species of the genus Oceanithermus, which belongs to the family Thermaceae. The genus currently comprises two species whose members are thermophilic and are able to reduce sulfur compounds and nitrite. The organism is adapted to the salinity of sea water, is able to utilize a broad range of carbohydrates, some proteinaceous substrates, organic acids and alcohols. This is the first completed genome sequence of a member of the genus Oceanithermus and the fourth sequence from the family Thermaceae. The 2,439,291 bp long genome with its 2,391 protein-coding and 54 RNA genes consists of one chromosome and a 135,351 bp long plasmid, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  20. Short-sequence DNA repeats in prokaryotic genomes

    Microsoft Academic Search

    Belkum van A. F; STEWART SCHERER; Alphen van A. J. W; HENRI VERBRUGH

    1998-01-01

    Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as

  1. Heterozygous genome assembly via binary classification of homologous sequence

    PubMed Central

    2015-01-01

    Background: Genome assemblers to date have predominantly targeted haploid reference reconstruction from homozygous data. When applied to diploid genome assembly, these assemblers perform poorly, owing to the violation of assumptions during both the contigging and scaffolding phases. Effective tools to overcome these problems are in growing demand. Increasing parameter stringency during contigging is an effective solution to obtaining haplotype-specific contigs; however, effective algorithms for scaffolding such contigs are lacking. Methods: We present a stand-alone scaffolding algorithm, ScaffoldScaffolder, designed specifically for scaffolding diploid genomes. The algorithm identifies homologous sequences as found in "bubble" structures in scaffold graphs. Machine learning classification is used to then classify sequences in partial bubbles as homologous or non-homologous sequences prior to reconstructing haplotype-specific scaffolds. We define four new metrics for assessing diploid scaffolding accuracy: contig sequencing depth, contig homogeneity, phase group homogeneity, and heterogeneity between phase groups. Results: We demonstrate the viability of using bubbles to identify heterozygous homologous contigs, which we term homolotigs. We show that machine learning classification trained on these homolotig pairs can be used effectively for identifying homologous sequences elsewhere in the data with high precision (assuming error-free reads). Conclusion: More work is required to comparatively analyze this approach on real data with various parameters and classifiers against other diploid genome assembly methods. However, the initial results of ScaffoldScaffolder supply validity to the idea of employing machine learning in the difficult task of diploid genome assembly. Software is available at http://bioresearch.byu.edu/scaffoldscaffolder. PMID:25952609

  2. Genome sequencing: missing a stage, John Sulston. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: John Sulston. John Sulston, a key figure in the public genome project, speaks about the difficulties posed by missing a step in the sequencing process.

  3. Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark, DTU

    E-print Network

    Ussery, David W.

    Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark, DTU ... - or - Where Does Vibrio cholera come from?

  4. Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark, DTU

    E-print Network

    Ussery, David W.

    Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark, DTU. Minimal genomes in bacterial Genera. Dave Ussery. European Conference on Synthetic Biology: Design 2007

  5. Co-barcoded sequence reads from long DNA fragments: a cost-effective solution for “perfect genome” sequencing

    PubMed Central

    Peters, Brock A.; Liu, Jia; Drmanac, Radoje

    2015-01-01

    Next generation sequencing (NGS) technologies, primarily based on massively parallel sequencing, have touched and radically changed almost all aspects of research worldwide. These technologies have allowed for the rapid analysis, to date, of the genomes of more than 2,000 different species. In humans, NGS has arguably had the largest impact. Over 100,000 genomes of individual humans (based on various estimates) have been sequenced allowing for deep insights into what makes individuals and families unique and what causes disease in each of us. Despite all of this progress, the current state of the art in sequence technology is far from generating a “perfect genome” sequence and much remains to be understood in the biology of human and other organisms’ genomes. In the article that follows, we outline why the “perfect genome” in humans is important, what is lacking from current human whole genome sequences, and a potential strategy for achieving the “perfect genome” in a cost effective manner. PMID:25642240

  6. Complete mitochondrial genome sequence of the Tyrolean Iceman.

    PubMed

    Ermini, Luca; Olivieri, Cristina; Rizzi, Ermanno; Corti, Giorgio; Bonnal, Raoul; Soares, Pedro; Luciani, Stefania; Marota, Isolina; De Bellis, Gianluca; Richards, Martin B; Rollo, Franco

    2008-11-11

    The Tyrolean Iceman was a witness to the Neolithic-Copper Age transition in Central Europe 5350-5100 years ago, and his mummified corpse was recovered from an Alpine glacier on the Austro-Italian border in 1991 [1]. Using a mixed sequencing procedure based on PCR amplification and 454 sequencing of pooled amplification products, we have retrieved the first complete mitochondrial-genome sequence of a prehistoric European. We have then compared it with 115 related extant lineages from mitochondrial haplogroup K. We found that the Iceman belonged to a branch of mitochondrial haplogroup K1 that has not yet been identified in modern European populations. This is the oldest complete Homo sapiens mtDNA genome generated to date. The results point to the potential significance of complete-ancient-mtDNA studies in addressing questions concerning the genetic history of human populations that the phylogeography of modern lineages is unable to tackle. PMID:18976917

  7. Complete genome sequence of arracacha virus B: a novel cheravirus.

    PubMed

    Adams, I P; Glover, R; Souza-Richards, R; Bennett, S; Hany, U; Boonham, N

    2013-04-01

    The complete genome sequences of RNA1 and RNA2 of the oca strain of the potato virus arracacha virus B were determined using next-generation sequencing. The RNA1 molecule is predicted to encode a 259-kDa polyprotein with homology to proteins of the cheraviruses apple latent spherical virus (ALSV) and cherry rasp leaf virus (CRLV). The RNA2 molecule is predicted to encode a 102-kDa polyprotein which also has homology to the corresponding protein of ALSV and, to a lesser degree, CRLV (30 % for RNA1, 24 % for RNA2). Detailed analysis of the genome sequence confirms that AVB is a distinct member of the genus Cheravirus. PMID:23192172

  8. Realistic artificial DNA sequences as negative controls for computational genomics

    PubMed Central

    Caballero, Juan; Smit, Arian F. A.; Hood, Leroy; Glusman, Gustavo

    2014-01-01

    A common practice in computational genomic analysis is to use a set of ‘background’ sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such ‘background’ sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by ‘shuffling’ real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/. PMID:24803667
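    The 'shuffling' baseline mentioned above can be sketched as a simple permutation of bases, which preserves base composition but destroys higher-order structure such as repeats. This is only the naive approach the abstract warns can underestimate false-positive rates, not the authors' model-based generator; the function name and seed parameter are hypothetical.

```python
import random
from typing import Optional


def shuffle_sequence(seq: str, seed: Optional[int] = None) -> str:
    """Return a random permutation of the bases in seq.

    Single-base composition is preserved exactly; dinucleotide
    frequencies, repeats and local complexity are not.
    """
    rng = random.Random(seed)  # a seed makes the control reproducible
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    # Toy sequence, for illustration only.
    original = "ACGTACGTTTGGCA"
    control = shuffle_sequence(original, seed=1)
    print(original)
    print(control)
```

    Running a prediction tool on such a control and counting its hits gives a rough false-positive estimate, subject to the caveat stated in the abstract.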

  9. Realistic artificial DNA sequences as negative controls for computational genomics.

    PubMed

    Caballero, Juan; Smit, Arian F A; Hood, Leroy; Glusman, Gustavo

    2014-07-01

    A common practice in computational genomic analysis is to use a set of 'background' sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such 'background' sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by 'shuffling' real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/. PMID:24803667

  10. Motivators for participation in a whole-genome sequencing study: implications for translational genomics research.

    PubMed

    Facio, Flavia M; Brooks, Stephanie; Loewenstein, Johanna; Green, Susannah; Biesecker, Leslie G; Biesecker, Barbara B

    2011-12-01

    The promise of personalized medicine depends on the ability to integrate genetic sequencing information into disease risk assessment for individuals. As genomic sequencing technology enters the realm of clinical care, its scale necessitates answers to key social and behavioral research questions about the complexities of understanding, communicating, and ultimately using sequence information to improve health. Our study captured the motivations and expectations of research participants who consented to participate in a research protocol, ClinSeq, which offers to return a subset of the data generated through high-throughput sequencing. We present findings from an exploratory study of 322 participants, most of whom identified themselves as white, non-Hispanic, and coming from higher socio-economic groups. Participants aged 45-65 years answered open-ended questions about the reasons they consented to ClinSeq and about what they anticipated would come of genomic sequencing. Two main reasons for participating were as follows: a conviction to altruism in promoting research, and a desire to learn more about genetic factors that contribute to one's own health risk. Overall, participants expected genomic research to help improve understanding of disease causes and treatments. Our findings offer a first glimpse into the motivations and expectations of individuals seeking their own genomic information, and provide initial insights into the value these early adopters of technology place on information generated by high-throughput sequencing studies. PMID:21731059

  11. Draft Genome Sequence of Mycobacterium obuense Strain UC1, Isolated from Patient Sputum

    PubMed Central

    Greninger, Alexander L.; Cunningham, Gail; Hsu, Elaine D.; Yu, Joanna M.; Chiu, Charles Y.

    2015-01-01

    We report the draft genome sequence of Mycobacterium obuense strain UC1 from a patient sputum sample. This is the first draft genome sequence of Mycobacterium obuense, a rapidly growing scotochromogenic mycobacterium. PMID:26067960

  12. Draft Genome Sequence of the Versatile Alkane-Degrading Bacterium Aquabacterium sp. Strain NJ1

    PubMed Central

    Shiwa, Yuh; Yoshikawa, Hirofumi; Zylstra, Gerben J.

    2014-01-01

    The draft genome sequence of a soil bacterium, Aquabacterium sp. strain NJ1, capable of utilizing both liquid and solid alkanes, was deciphered. This is the first report of an Aquabacterium genome sequence. PMID:25477416

  13. Complete Genome Sequence of Listeria seeligeri, a Nonpathogenic Member of the Genus Listeria

    PubMed Central

    Steinweg, Christiane; Kuenne, Carsten T.; Billion, André; Mraheil, Mobarak A.; Domann, Eugen; Ghai, Rohit; Barbuddhe, Sukhadeo B.; Kärst, Uwe; Goesmann, Alexander; Pühler, Alfred; Weisshaar, Bernd; Wehland, Jürgen; Lampidis, Robert; Kreft, Jürgen; Goebel, Werner; Chakraborty, Trinad; Hain, Torsten

    2010-01-01

    We report the complete and annotated genome sequence of the nonpathogenic Listeria seeligeri SLCC3954 serovar 1/2b type strain harboring the smallest completely sequenced genome of the genus Listeria. PMID:20061480

  14. Lessons Learned From 24 Completely Sequenced AML Genomes - Timothy Ley, TCGA Scientific Symposium 2011

    Cancer.gov

    Video of the talk "Lessons Learned From 24 Completely Sequenced AML Genomes" by Timothy Ley, TCGA Scientific Symposium 2011.

  15. Synaptotagmin gene content of the sequenced genomes

    E-print Network

    Craxton, Molly

    2004-07-06

    cladogram tree of relationships between the Syts. The multiple alignments are arranged in the same way, with N-terminus and linker regions in figs 3,4,5 and C2A to C-terminus regions in figs 6,7,8. Intron positions, alternative splicing and RNA... sequences are altered in specific Syts but overall, these regions range from N-terminus to C-terminus indicating a sophisticated control of many functions. linker of Syt1 in Anopheles, Drosophila, Mus and Homo (fig. 3). The functional consequences...

  16. Two new complete genome sequences offer insight into host and tissue specificity of plant pathogenic Xanthomonas spp.

    E-print Network

    2011-01-01

    important in plant disease. Complete genome sequences of... Genome Sequences Offer Insight into Host and Tissue Specificity of Plant... plant-inducible promoter box and a -10 box-like sequence, from the genome

  17. A map of human genome variation from population scale sequencing

    PubMed Central

    2011-01-01

    The 1000 Genomes Project aims to provide a deep characterisation of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. We present results of the pilot phase of the project, designed to develop and compare different strategies for genome wide sequencing with high throughput sequencing platforms. We undertook three projects: low coverage whole genome sequencing of 179 individuals from four populations, high coverage sequencing of two mother-father-child trios, and exon targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million SNPs, 1 million short insertions and deletions and 20,000 structural variants, the majority of which were previously undescribed. We show that over 95% of the currently accessible variants found in any individual are present in this dataset; on average, each person carries approximately 250 to 300 loss of function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We find many putative functional variants with large allele frequency differences between populations. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research. PMID:20981092

  18. Tracking a Hospital Outbreak of Carbapenem-Resistant Klebsiella pneumoniae with Whole-Genome Sequencing

    PubMed Central

    Snitkin, Evan S.; Zelazny, Adrian M.; Thomas, Pamela J.; Stock, Frida; Henderson, David K.; Palmore, Tara N.; Segre, Julia A.

    2012-01-01

    The Gram-negative bacterium Klebsiella pneumoniae is a major cause of nosocomial infections, primarily among immunocompromised patients. The emergence of strains resistant to carbapenems has left few treatment options, making infection containment critical. In 2011, the U.S. National Institutes of Health Clinical Center experienced an outbreak of carbapenem-resistant K. pneumoniae that affected 18 patients, 11 of whom died. Whole-genome sequencing was performed on K. pneumoniae isolates to gain insight into why the outbreak progressed despite early implementation of infection control procedures. Integrated genomic and epidemiological analysis traced the outbreak to three independent transmissions from a single patient who was discharged 3 weeks before the next case became clinically apparent. Additional genomic comparisons provided evidence for unexpected transmission routes, with subsequent mining of epidemiological data pointing to possible explanations for these transmissions. Our analysis demonstrates that integration of genomic and epidemiological data can yield actionable insights and facilitate the control of nosocomial transmission. PMID:22914622

  19. Tracking a hospital outbreak of carbapenem-resistant Klebsiella pneumoniae with whole-genome sequencing.

    PubMed

    Snitkin, Evan S; Zelazny, Adrian M; Thomas, Pamela J; Stock, Frida; Henderson, David K; Palmore, Tara N; Segre, Julia A

    2012-08-22

    The Gram-negative bacterium Klebsiella pneumoniae is a major cause of nosocomial infections, primarily among immunocompromised patients. The emergence of strains resistant to carbapenems has left few treatment options, making infection containment critical. In 2011, the U.S. National Institutes of Health Clinical Center experienced an outbreak of carbapenem-resistant K. pneumoniae that affected 18 patients, 11 of whom died. Whole-genome sequencing was performed on K. pneumoniae isolates to gain insight into why the outbreak progressed despite early implementation of infection control procedures. Integrated genomic and epidemiological analysis traced the outbreak to three independent transmissions from a single patient who was discharged 3 weeks before the next case became clinically apparent. Additional genomic comparisons provided evidence for unexpected transmission routes, with subsequent mining of epidemiological data pointing to possible explanations for these transmissions. Our analysis demonstrates that integration of genomic and epidemiological data can yield actionable insights and facilitate the control of nosocomial transmission. PMID:22914622

  20. A Model of the Statistical Power of Comparative Genome Sequence Analysis

    E-print Network

    Eddy, Sean

    by their evolutionary conservation [1,2,3]. It will be instrumental for achieving the goal of the Human Genome Project to comprehensively identify functional elements in the human genome [4]. How many comparative genome sequences do we not contribute significant information to human genome analysis? Since sequencing is expensive and capacity

  1. SeqEntropy: Genome-Wide Assessment of Repeats for Short Read Sequencing

    E-print Network

    Chen, Chaur-Chin

    analysis of human genome [1] and for rapid full genome sequencing and typing of various organisms. The 1000 Genomes Project, launched in 2008, bSeqEntropy: Genome-Wide Assessment of Repeats for Short Read Sequencing Hsueh-Ting Chu1,2 , William

  2. Sequencing and analysis of bacterial genomes Eugene V. Koonin, Arcady R. Mushegian and Kenneth E. Rudd

    E-print Network

    Fernando, Chrisantha

    on the E. coli genome [4-7], two integrated Bacillus subtilis genome databases [8,9] and the new databases... 404 Review Sequencing and analysis of bacterial genomes Eugene V. Koonin, Arcady R. Mushegian and Kenneth E. Rudd The complete sequences of two small bacterial genomes have recently become available

  3. Comparative genomics based on massive parallel transcriptome sequencing reveals patterns of substitution

    E-print Network

    Jarvis, Erich D.

    for biologists working with natural populations of non-model organisms. Large-scale genomic analysis was mainly the avian genome, we performed brain transcriptome sequencing using Roche 454 technology of 10 different non-model of next-generation sequencing for obtaining genomic resources for comparative genomic analysis of non-model

  4. The peach dehydrin family is small relative to all other sequenced plant genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent advances in genomic sequencing technology have allowed the addition of a number of crops to the growing list of completely sequenced genomes. We have analyzed the peach genome for the dehydrin gene family and compared its members to the genomes of Arabidopsis, poplar, apple and rice. This c...

  5. Research participants' attitudes towards the confidentiality of genomic sequence information.

    PubMed

    Jamal, Leila; Sapp, Julie C; Lewis, Katie; Yanes, Tatiane; Facio, Flavia M; Biesecker, Leslie G; Biesecker, Barbara B

    2014-08-01

    Respecting the confidentiality of personal data contributed to genomic studies is an important issue for researchers using genomic sequencing in humans. Although most studies adhere to rules of confidentiality, there are different conceptions of confidentiality and why it is important. The resulting ambiguity obscures what is at stake when making tradeoffs between data protection and other goals in research, such as transparency, reciprocity, and public benefit. Few studies have examined why participants in genomic research care about how their information is used. To explore this topic, we conducted semi-structured phone interviews with 30 participants in two National Institutes of Health research protocols using genomic sequencing. Our results show that research participants value confidentiality as a form of control over information about themselves. To the individuals we interviewed, control was valued as a safeguard against discrimination in a climate of uncertainty about future uses of individual genome data. Attitudes towards data sharing were related to the goals of research and details of participants' personal lives. Expectations of confidentiality, trust in researchers, and a desire to advance science were common reasons for willingness to share identifiable data with investigators. Nearly all participants were comfortable sharing personal data that had been de-identified. These findings suggest that views about confidentiality and data sharing are highly nuanced and are related to the perceived benefits of joining a research study. PMID:24281371

  6. The impact of next-generation sequencing on genomics

    PubMed Central

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also presents a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle the unsolved challenges unconquered by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come. PMID:21477781

  7. Genomic Sequencing and Analysis of Sucra jujuba Nucleopolyhedrovirus

    PubMed Central

    Liu, Xiaoping; Yin, Feifei; Zhu, Zheng; Hou, Dianhai; Wang, Jun; Zhang, Lei; Wang, Manli; Wang, Hualin; Hu, Zhihong; Deng, Fei

    2014-01-01

    The complete nucleotide sequence of Sucra jujuba nucleopolyhedrovirus (SujuNPV) was determined by 454 pyrosequencing. The SujuNPV genome was 135,952 bp in length with an A+T content of 61.34%. It contained 131 putative open reading frames (ORFs) covering 87.9% of the genome. Among these ORFs, 37 were conserved in all baculovirus genomes that have been completely sequenced, 24 were conserved in lepidopteran baculoviruses, 65 were found in other baculoviruses, and 5 were unique to the SujuNPV genome. Seven homologous regions (hrs) were identified in the SujuNPV genome. SujuNPV contained several genes that were duplicated or copied multiple times: two copies of helicase, DNA binding protein gene (dbp), p26 and cg30, three copies of the inhibitor of the apoptosis gene (iap), and four copies of the baculovirus repeated ORF (bro). Phylogenetic analysis suggested that SujuNPV belongs to a subclade of group II alphabaculovirus, which differs from other baculoviruses in that all nine members of this subclade contain a second copy of dbp. PMID:25329074

  8. The complete mitochondrial genome sequence of pike perch (Sander canadensis).

    PubMed

    Cao, Ding-Chen; Li, Jiong-Tang; Kuang, You-Yi; Guo, Jia-Xiang; Xu, Wei; Xue, Wei; Sun, Xiao-Wen

    2015-02-01

    Pike perch (Sander canadensis) is a member of the largest order of Osteichthyes, Perciformes, and is an ecologically and economically important freshwater species that is distributed in the Ili River and Ergis River of Xinjiang Province, China. In this study, we sequenced the whole mitochondrial genome of pike perch and analyzed its similarity with related species. The mitochondrial genome of S. canadensis is 16,542 bp in length with 55.05% AT content and contains 13 protein-coding genes, 22 tRNA genes, 2 ribosomal genes and an 892 bp non-coding region. In the control region, 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F), one potential TAS and one poly-T region were identified. In a comparison of all protein-coding genes and the whole genome sequence with 4 species of Perciformes (three species of Percidae, Perca flavescens, Percina macrolepida and Etheostoma radiosum, and one outgroup, Oreochromis sp. red tilapia), the ND3 gene has the highest mutation rate, and S. canadensis has higher similarity with Perca flavescens than with the others. The mitochondrial genomic sequence will help us to study the conservation genetics and evolution of Percidae. PMID:23815329

  9. Insights into hominid evolution from the gorilla genome sequence

    PubMed Central

    Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

    2012-01-01

    Summary Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555

  10. A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...

  11. About The Center for Cancer Genomics (CCG)

    Cancer.gov

    CCG promotes opportunities to work with other agencies and community physicians to usher in a modern era of diagnosis, treatment, and prevention based on the study of genomes, gene expression, proteomics, and the use of other technologies.

  12. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis.

    PubMed

    Tyler, Brett M; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H Y; Aerts, Andrea; Arredondo, Felipe D; Baxter, Laura; Bensasson, Douda; Beynon, Jim L; Chapman, Jarrod; Damasceno, Cynthia M B; Dorrance, Anne E; Dou, Daolong; Dickerman, Allan W; Dubchak, Inna L; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G; Govers, Francine; Grunwald, Niklaus J; Huang, Wayne; Ivors, Kelly L; Jones, Richard W; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H; Lee, Mi-Kyung; McDonald, W Hayes; Medina, Mónica; Meijer, Harold J G; Nordberg, Eric K; Maclean, Donald J; Ospina-Giraldo, Manuel D; Morris, Paul F; Phuntumart, Vipaporn; Putnam, Nicholas H; Rash, Sam; Rose, Jocelyn K C; Sakihama, Yasuko; Salamov, Asaf A; Savidor, Alon; Scheuring, Chantel F; Smith, Brian M; Sobral, Bruno W S; Terry, Astrid; Torto-Alalibo, Trudy A; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V; Rokhsar, Daniel S; Boore, Jeffrey L

    2006-09-01

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes. PMID:16946064

  13. Phytophthora genome sequences uncover evolutionary origins and mechanisms of pathogenesis

    Microsoft Academic Search

    Brett M. Tyler; Sucheta Tripathy; Xuemin Zhang; Paramvir Dehal; Rays H. Y. Jiang; Andrea Aerts; Felipe D. Arredondo; Laura Baxter; Douda Bensasson; Jim L. Beynon; Jarrod Chapman; Cynthia M. B. Damasceno; Anne E. Dorrance; Daolong Dou; Allan W. Dickerman; Inna L. Dubchak; Matteo Garbelotto; Mark Gijzen; Stuart G. Gordon; Francine Govers; Niklaus J. Grunwald; Wayne Huang; Kelly L. Ivors; Richard W. Jones; Sophien Kamoun; Konstantinos Krampis; Kurt H. Lamour; Mi-Kyung Lee; W. Hayes McDonald; Mónica Medina; Harold J. G. Meijer; Eric K. Nordberg; Donald J. Maclean; Manuel D. Ospina-Giraldo; Paul F. Morris; Vipaporn Phuntumart; Nicholas H. Putnam; Sam Rash; Jocelyn K. C. Rose; Yasuko Sakihama; Asaf A. Salamov; Alon Savidor; Chantel F. Scheuring; Brian M. Smith; Bruno W. S. Sobral; Astrid Terry; Trudy A. Torto-Alalibo; Joe Win; Zhanyou Xu; Hongbin Zhang; Igor V. Grigoriev; Daniel S. Rokhsar; Jeffrey L. Boore

    2006-01-01

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oomycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes

  14. Complete genome sequence of Streptomyces lividans TK24.

    PubMed

    Rückert, Christian; Albersmeier, Andreas; Busche, Tobias; Jaenicke, Sebastian; Winkler, Anika; Friðjónsson, Ólafur H; Hreggviðsson, Guðmundur Óli; Lambert, Christophe; Badcock, Daniel; Bernaerts, Kristel; Anne, Jozef; Economou, Anastassios; Kalinowski, Jörn

    2015-04-10

    Streptomyces lividans TK24 is the standard host for the heterologous expression of a number of different proteins and antibiotic-synthesizing enzymes. As such, it is often used as an experimental microbial cell factory for the production of secreted heterologous proteins including human cytokines and industrial enzymes, and of several antibiotics. It accepts methylated DNA and is an ideal Streptomyces cloning system. Here, we report the complete genome sequence of S. lividans TK24 that includes a plasmid-less genome of 8.345Mbp (72.24% G+C content). PMID:25680930

  15. Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis

    SciTech Connect

    Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, JIm L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, NIklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

    2006-04-17

    Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

  16. Draft genome sequence of Rhodococcus rhodochrous strain ATCC 17895.

    PubMed

    Chen, Bi-Shuang; Otten, Linda G; Resch, Verena; Muyzer, Gerard; Hanefeld, Ulf

    2013-10-16

    Rhodococcus rhodochrous ATCC 17895 possesses an array of mono- and dioxygenases, as well as hydratases, which makes it an interesting organism for biocatalysis. R. rhodochrous is a Gram-positive aerobic bacterium with a rod-like morphology. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 6,869,887 bp long genome contains 6,609 protein-coding genes and 53 RNA genes. Based on small subunit rRNA analysis, the strain is more likely to be a strain of Rhodococcus erythropolis rather than Rhodococcus rhodochrous. PMID:24501654

  17. The CHAOS/DIALIGN WWW server for multiple alignment of genomic sequences

    Microsoft Academic Search

    Michael Brudno; Rasmus Steinkamp; Burkhard Morgenstern

    2004-01-01

    Cross-species sequence comparison is a powerful approach to analyze functional sites in genomic sequences and many discoveries have been made based on genomic alignments. Herein, we present a WWW-based software system for multiple alignment of large genomic sequences. Our server utilizes the previously developed combination of CHAOS and DIALIGN to achieve both speed and alignment accuracy. CHAOS is a

  18. Draft Genome Sequence of Bacillus atrophaeus UCMB-5137, a Plant Growth-Promoting Rhizobacterium

    E-print Network

    Draft Genome Sequence of Bacillus atrophaeus UCMB-5137, a Plant Growth-Promoting Rhizobacterium Wai activity in root colonization and plant and crop protection. Its draft genome sequence comprises 21 contigs of 4.11 Mb, harboring 4,167 coding sequences (CDS). The genome carries several genes encoding

  19. Draft Genome Sequence of Lactobacillus hominis Strain CRBIP 24.179T, Isolated from Human Intestine

    E-print Network

    Paris-Sud XI, Université de

    Draft Genome Sequence of Lactobacillus hominis Strain CRBIP 24.179T, Isolated from Human Intestine genome sequence of the strain Lactobacillus hominis CRBIP 24.179T, isolated from a human clinical sample, Clermont D, Loux V, Bizet C, Bouchier C. 2013. Draft genome sequence of Lactobacillus hominis strain CRBIP

  20. Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays

    Microsoft Academic Search

    Radoje Drmanac; Andrew B. Sparks; Matthew J. Callow; Aaron L. Halpern; Norman L. Burns; Bahram G. Kermani; Paolo Carnevali; Igor Nazarenko; Geoffrey B. Nilsen; George Yeung; Fredrik Dahl; Andres Fernandez; Bryan Staker; Krishna P. Pant; Jonathan Baccash; Adam P. Borcherding; Anushka Brownley; Ryan Cedeno; Linsu Chen; Dan Chernikoff; Alex Cheung; Razvan Chirita; Benjamin Curson; Jessica C. Ebert; Coleen R. Hacker; Robert Hartlage; Brian Hauser; Steve Huang; Yuan Jiang; Vitali Karpinchyk; Mark Koenig; Calvin Kong; Tom Landers; Catherine Le; Jia Liu; Celeste E. McBride; Matt Morenzoni; Robert E. Morey; Karl Mutch; Helena Perazich; Kimberly Perry; Brock A. Peters; Joe Peterson; Charit L. Pethiyagoda; Kaliprasad Pothuraju; Claudia Richter; Abraham M. Rosenbaum; Shaunak Roy; Jay Shafto; Uladzislau Sharanhovich; Karen W. Shannon; Conrad G. Sheppy; Michel Sun; Joseph V. Thakuria; Anne Tran; Dylan Vu; Alexander Wait Zaranek; Xiaodi Wu; Snezana Drmanac; Arnold R. Oliphant; William C. Banyai; Bruce Martin; Dennis G. Ballinger; George M. Church; Clifford A. Reid

    2010-01-01

    Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this