These are representative sample records from Science.gov related to your search topic.
For comprehensive and current results, perform a real-time search at Science.gov.
1

The Genome Sequencing Center at NCGR  

SciTech Connect

Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

Schilkey, Faye [National Center for Genome Resources]

2010-06-02

2

The Genome Database Organism-centered listing of available genomic sequence records and projects  

E-print Network

The Genome Database Organism-centered listing of available genomic sequence records and projects http://www.ncbi.nlm.nih.gov/genome National Center for Biotechnology Information · National Library | NCBI Genome | Last Update August 19, 2013 Contact: info@ncbi.nlm.nih.gov Scope Since 2011, the Genome

Levin, Judith G.

3

Operational streamlining in a high-throughput genome sequencing center  

E-print Network

Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

Person, Kerry P. (Kerry Patrick)

2006-01-01

4

Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing  

E-print Network

Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing samples. We here at the Nevada Genomics Center feel that dnaTools is user friendly and fairly intuitive ... -784-1657) or email us (Genomics@unr.nevada.edu) and we will assist you. How to use dnaTools Table of Contents

Hemmers, Oliver

5

Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)  

SciTech Connect

John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

Crow, John [National Center for Genome Resources]

2012-06-01

6

Genome Characterization Centers  

Cancer.gov

Genomics is a fast-moving field in which novel technologies and platforms that help characterize the genome are continually made available to the research community. The Cancer Genome Atlas (TCGA) Genome Characterization Centers (GCCs) are responsible for characterizing all of the genomic changes found in the tumors studied as part of the TCGA program.

7

Whole Genome Sequencing  

MedlinePLUS

... research, but there are several companies that can sequence your DNA. These are known as direct-to-consumer tests. The testing that is offered through a physician currently costs several thousand dollars. Many biotechnology companies, however, are racing to sequence the genome for under $1000 and at a ...

8

The Genome Center at Washington University  

SciTech Connect

Bob Fulton of Washington University discusses the sequencing platforms in use at this large-scale genome center on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

Fulton, Bob [Washington University]

2010-06-02

9

Prenatal Whole Genome Sequencing  

PubMed Central

With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

2014-01-01

10

Operations capability improvement of a molecular biology laboratory in a high throughput genome sequencing center  

E-print Network

The Broad Institute is a research collaboration of MIT, Harvard University and affiliated hospitals, and the Whitehead Institute for Biomedical Research. Its scientific mission is to "(1) create tools for genomic medicine ...

Vokoun, Matthew R. (Matthew Richard)

2005-01-01

11

Towards Sequencing Cotton (Gossypium) Genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton genomes represent a...

12

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

Schadt, Christopher Warren [ORNL]; Baker, Scott [Pacific Northwest National Laboratory (PNNL)]; Thykaer, Jette [Pacific Northwest National Laboratory (PNNL)]; Adney, William S [National Renewable Energy Laboratory (NREL)]; Brettin, Tom [Los Alamos National Laboratory (LANL)]; Brockman, Fred [Pacific Northwest National Laboratory (PNNL)]; Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL)]; Martinez, A Diego [Los Alamos National Laboratory (LANL)]; Miller, R Michael [Argonne National Laboratory (ANL)]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute]; Tuskan, Gerald A [ORNL]; Bennett, Joan [Rutgers University]; Berka, Randy [Novozymes, Inc]; Briggs, Steven [University of California, San Diego]; Heitman, Joseph [Duke University]; Rizvi, L [Royal Ontario Museum]; Taylor, John [University of California, Berkeley]; Turgeon, Gillian [Cornell University]; Werner-Washburne, Maggie [University of New Mexico, Albuquerque]; Himmel, Michael [ORNL]

2008-01-01

13

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions.

Baker, Scott E.; Thykaer, Jette; Adney, William S.; Brettin, T.; Brockman, Fred J.; D'haeseleer, Patrik; Martinez, Antonio D.; Miller, R. M.; Rokhsar, Daniel S.; Schadt, Christopher W.; Torok, Tamas; Tuskan, Gerald; Bennett, Joan W.; Berka, Randy; Briggs, Steve; Heitman, Joseph; Taylor, John; Turgeon, Barbara G.; Werner-Washburne, Maggie; Himmel, Michael E.

2008-09-30

14

Fungal Genome Sequencing and Bioenergy  

SciTech Connect

To date, the number of ongoing filamentous fungal genome sequencing projects is almost tenfold fewer than those of bacterial and archaeal genome projects. The fungi chosen for sequencing represent narrow kingdom diversity; most are pathogens or models. We advocate an ambitious, forward-looking phylogenetic-based genome sequencing program, designed to capture metabolic diversity within the fungal kingdom, thereby enhancing research into alternative bioenergy sources, bioremediation, and fungal-environment interactions. Published by Elsevier Ltd on behalf of The British Mycological Society.

Baker, Scott [Pacific Northwest National Laboratory (PNNL)]; Thykaer, Jette [Pacific Northwest National Laboratory (PNNL)]; Adney, William S [National Renewable Energy Laboratory (NREL)]; Brettin, Tom [Los Alamos National Laboratory (LANL)]; Brockman, Fred [Pacific Northwest National Laboratory (PNNL)]; Dhaeseleer, Patrick [Lawrence Livermore National Laboratory (LLNL)]; Martinez, A Diego [Los Alamos National Laboratory (LANL)]; Miller, R Michael [Argonne National Laboratory (ANL)]; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute]; Schadt, Christopher Warren [ORNL]; Torok, Tamas [U.S. Department of Energy, Joint Genome Institute]; Tuskan, Gerald A [ORNL]; Bennett, Joan [Rutgers University]; Berka, Randy [Novozymes, Inc]; Briggs, Steven [University of California, San Diego]; Heitman, Joseph [Duke University]; Taylor, John [University of California, Berkeley]; Turgeon, Gillian [Cornell University]; Werner-Washburne, Maggie [University of New Mexico, Albuquerque]; Himmel, Michael E [National Renewable Energy Laboratory (NREL)]

2008-01-01

15

Supplementary Information The genome sequence of the orchid Phalaenopsis equestris  

E-print Network

Supplementary Information The genome sequence of the orchid Phalaenopsis equestris Jing Cai1 Shenzhen Key Laboratory for Orchid Conservation and Utilization, National Orchid Conservation Center of China and Orchid Conservation and Research Center of Shenzhen, Shenzhen, China. 2 Center

Kaski, Samuel

16

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

Hans-Werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

2000-01-01

17

The complete sequence of a heterochromatic island from a higher eukaryote. The Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, and PE Biosystems Arabidopsis Sequencing Consortium.  

PubMed

Heterochromatin, constitutively condensed chromosomal material, is widespread among eukaryotes but incompletely characterized at the nucleotide level. We have sequenced and analyzed 2.1 megabases (Mb) of Arabidopsis thaliana chromosome 4 that includes 0.5-0.7 Mb of isolated heterochromatin that resembles the chromosomal knobs described by Barbara McClintock in maize. This isolated region has a low density of expressed genes, low levels of recombination and a low incidence of genetrap insertion. Satellite repeats were absent, but tandem arrays of long repeats and many transposons were found. Methylation of these sequences was dependent on chromatin remodeling. Clustered repeats were associated with condensed chromosomal domains elsewhere. The complete sequence of a heterochromatic island provides an opportunity to study sequence determinants of chromosome condensation. PMID:10676819

2000-02-01

18

Fuzzy Genome Sequence Assembly for Single and Environmental Genomes  

E-print Network

Fuzzy Genome Sequence Assembly for Single and Environmental Genomes Sara Nasser, Adrienne Breland for multiple genome sequence assembly of cultured genomes (single organism) and environmental genomes (multiple DNA is the building block of all life on this planet, from single cell micro- scopic bacteria to more

Nicolescu, Monica

19

Sequencing and mapping of the onion genome  

Technology Transfer Automated Retrieval System (TEKTRAN)

The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

20

Challenges of sequencing human genomes  

PubMed Central

Massively parallel sequencing technologies continue to alter the study of human genetics. As the cost of sequencing declines, next-generation sequencing (NGS) instruments and datasets will become increasingly accessible to the wider research community. Investigators are understandably eager to harness the power of these new technologies. Sequencing human genomes on these platforms, however, presents numerous production and bioinformatics challenges. Production issues like sample contamination, library chimaeras and variable run quality have become increasingly problematic in the transition from technology development lab to production floor. Analysis of NGS data, too, remains challenging, particularly given the short-read lengths (35–250 bp) and sheer volume of data. The development of streamlined, highly automated pipelines for data analysis is critical for transition from technology adoption to accelerated research and publication. This review aims to describe the state of current NGS technologies, as well as the strategies that enable NGS users to characterize the full spectrum of DNA sequence variation in humans. PMID:20519329

Ding, Li; Mardis, Elaine R.; Wilson, Richard K.

2010-01-01

21

The Sequence of the Human Genome  

Microsoft Academic Search

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome

J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

2001-01-01
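
The read and coverage figures quoted in the abstract of record 21 can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original record: the function and constant names are hypothetical, and using the 2.91-billion bp consensus length as the coverage denominator is an assumption, since the truncated abstract does not say which genome size underlies its 5.11-fold figure.

```python
# Figures quoted in the abstract of record 21.
TOTAL_SEQUENCED_BP = 14.8e9   # total bases across all sequence reads
NUM_READS = 27_271_853        # high-quality sequence reads
CONSENSUS_BP = 2.91e9         # length of the euchromatic consensus sequence
STATED_COVERAGE = 5.11        # fold coverage as stated in the abstract


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average number of bases per read."""
    return total_bp / num_reads


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Sequenced bases divided by genome length (assumed denominator)."""
    return total_bp / genome_bp


read_len = mean_read_length(TOTAL_SEQUENCED_BP, NUM_READS)
coverage = fold_coverage(TOTAL_SEQUENCED_BP, CONSENSUS_BP)
# Genome size that would reproduce the stated coverage exactly.
implied_genome_bp = TOTAL_SEQUENCED_BP / STATED_COVERAGE

print(f"mean read length: {read_len:.0f} bp")
print(f"coverage vs. consensus length: {coverage:.2f}x")
print(f"genome size implied by stated coverage: {implied_genome_bp / 1e9:.2f} billion bp")
```

Under these assumptions the mean read length comes out near 543 bp and the coverage near 5.09-fold, so the stated 5.11-fold figure corresponds to a slightly smaller denominator of roughly 2.90 billion bp.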

22

Plant genome sequencing - applications for crop improvement.  

PubMed

It is over 10 years since the genome sequence of the first crop was published. Since then, the number of crop genomes sequenced each year has increased steadily. The amazing pace at which genome sequences are becoming available is largely due to the improvement in sequencing technologies both in terms of cost and speed. Modern sequencing technologies allow the sequencing of multiple cultivars of smaller crop genomes at a reasonable cost. Though many of the published genomes are considered incomplete, they nevertheless have proved a valuable tool to understand important crop traits such as fruit ripening, grain traits and flowering time adaptation. PMID:24679255

Bolger, Marie E; Weisshaar, Bernd; Scholz, Uwe; Stein, Nils; Usadel, Björn; Mayer, Klaus F X

2014-04-01

23

Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant,  

E-print Network

COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant, Arabidopsis thaliana, was published ~6 years ago (Arabidopsis Genome Initiative, 2000). Since Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog

Purugganan, Michael D.

24

GENOME SEQUENCING AND ANALYSIS OF ASPERGILLUS ORYZAE  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of Aspergillus oryzae, an important industrial fungus used in the production of oriental fermented foods, such as soy sauce, miso, and sake, has been sequenced. The genome sequence reveals a wealth of genes encoding secreted enzymes. A comparison with the genome sequences of A. nidulans...

25

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

Hans-Werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

2002-01-01

26

Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project  

Microsoft Academic Search

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

1991-01-01

27

Genome sequencing and functional genomics approaches in tomato  

Microsoft Academic Search

Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

Daisuke Shibata

2005-01-01

28

Sequencing Intractable DNA to Close Microbial Genomes  

SciTech Connect

Advancement in high-throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation and provide a step-wise method that reduces the effort required for genome finishing.

Hurt, Jr., Richard Ashley [ORNL]; Brown, Steven D [ORNL]; Podar, Mircea [ORNL]; Palumbo, Anthony Vito [ORNL]; Elias, Dwayne A [ORNL]

2012-01-01

29

Draft Genome Sequence of Lactobacillus rhamnosus 2166.  

PubMed

In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains. PMID:24558254

Karlyshev, Andrey V; Melnikov, Vyacheslav G; Kosarev, Igor V; Abramov, Vyacheslav M

2014-01-01

30

The Human Genome Project: Sequencing the Future  

E-print Network

The Human Genome Project: Sequencing the Future In 1986, the U.S. Department of Energy (DOE), convinced that its mission would be well served by a comprehensive picture of the human genome, took a bold and unilateral step by announcing its Human Genome Initiative--forerunner of the Human Genome Project

31

The Center for integrative genomics  

E-print Network

The Center for integrative genomics Faculty of biology and medicine a new adventure The CIG, a new adventure · a new institute with state-of-the-art technologies and facilities · Cutting edge institute located in the Génopode building, situated on the spectacular Dorigny campus of the University

Fankhauser, Christian

32

Value of a newly sequenced bacterial genome.  

PubMed

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

Barbosa, Eudes Gv; Aburjaile, Flavia F; Ramos, Rommel Tj; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-05-26

33

Value of a newly sequenced bacterial genome  

PubMed Central

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-01-01

34

Sequencing Centers Panel at SFAF  

SciTech Connect

From left to right: Faye Schilkey of NCGR, Johar Ali of OICR, Darren Grafham of Wellcome Trust Sanger Institute, Donna Muzny of the Baylor College of Medicine, Bob Fulton of Washington University, Mike Fitzgerald of the Broad Institute, Jessica Hostetler of the J. Craig Venter Institute and Chris Daum of the DOE Joint Genome Institute discuss sequencing technologies, applications and pipelines on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

Schilkey, Faye [NCGR]; Ali, Johar [OICR]; Grafham, Darren [Wellcome Trust Sanger Institute]; Muzny, Donna [Baylor College of Medicine]; Fulton, Bob [Washington University]; Fitzgerald, Mike [Broad Institute]; Hostetler, Jessica [J. Craig Venter Institute]; Daum, Chris [DOE Joint Genome Institute]

2010-06-02

35

Human Whole-Genome Shotgun Sequencing James L. Weber1,3  

E-print Network

Project, many doubted the scientific value of sequencing the entire human genome, these doubts have ... Human Whole-Genome Shotgun Sequencing James L. Weber1,3 and Eugene W. Myers2 1 Center for Medical Science, University of Arizona, Tucson, Arizona 85721 Large-scale sequencing of the human genome is now

Batzoglou, Serafim

36

INTEGRATION OF THE RECOMBINATION AND PHYSICAL MAPS WITH THE GENOME SEQUENCE OF TRIBOLIUM CASTANEUM  

Technology Transfer Automated Retrieval System (TEKTRAN)

The final assembly of the Tribolium genome sequence and its integration with genetic and physical mapping data is nearing completion. Release 2 of the genome assembly by the Baylor College of Medicine’s Human Genome Sequencing Center consists of 420 sequence scaffolds which encompass >95% of the cl...

37

Accurate and comprehensive sequencing of personal genomes.  

PubMed

As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ~30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses of a clinical sample sequenced on two related Illumina platforms, GAII(x) and HiSeq 2000, to a very high depth (126×). We used these data to establish genotype-calling filters that dramatically increase accuracy. We also empirically determined how the callable portion of the genome varies as a function of the amount of sequence data used. These results help provide a "sequencing guide" for future whole-genome sequencing decisions and metrics by which coverage statistics should be reported. PMID:21771779

Ajay, Subramanian S; Parker, Stephen C J; Abaan, Hatice Ozel; Fajardo, Karin V Fuentes; Margulies, Elliott H

2011-09-01

38

Accurate and comprehensive sequencing of personal genomes  

PubMed Central

As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ~30× coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses of a clinical sample sequenced on two related Illumina platforms, GAIIx and HiSeq 2000, to a very high depth (126×). We used these data to establish genotype-calling filters that dramatically increase accuracy. We also empirically determined how the callable portion of the genome varies as a function of the amount of sequence data used. These results help provide a “sequencing guide” for future whole-genome sequencing decisions and metrics by which coverage statistics should be reported. PMID:21771779

Ajay, Subramanian S.; Parker, Stephen C.J.; Ozel Abaan, Hatice; Fuentes Fajardo, Karin V.; Margulies, Elliott H.

2011-01-01

39

Sequencing complete mitochondrial and plastid genomes.  

PubMed

Organelle genomics has become an increasingly important research field, with applications in molecular modeling, phylogeny, taxonomy, population genetics and biodiversity. Typically, research projects involve the determination and comparative analysis of complete mitochondrial and plastid genome sequences, either from closely related species or from a taxonomically broad range of organisms. Here, we describe two alternative organelle genome sequencing protocols. The "random genome sequencing" protocol is suited for the large majority of organelle genomes irrespective of their size. It involves DNA fragmentation by shearing (nebulization) and blunt-end cloning of the resulting fragments into pUC or BlueScript-type vectors. This protocol excels in randomness of clone libraries as well as in time and cost-effectiveness. The "long-PCR-based genome sequencing" protocol is specifically adapted for DNAs of low purity and quantity, and is particularly effective for small organelle genomes. Library construction by either protocol can be completed within 1 week. PMID:17406621

Burger, Gertraud; Lavrov, Dennis V; Forget, Lise; Lang, B Franz

2007-01-01

40

Update on the Maize Genome Sequencing Project  

E-print Network

Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project Vicki L. Chandler­3260 (V.B.) On September 20, 2002, the National Science Foundation (NSF) announced the launch of the Maize Genome Sequencing Project. The momentum for this endeavor has been building within the maize (Zea mays

Brendel, Volker

41

Progress in Arabidopsis genome sequencing and functional genomics  

Microsoft Academic Search

Arabidopsis thaliana has a relatively small genome of approximately 130 Mb containing about 10% repetitive DNA. Genome sequencing studies reveal a gene-rich genome, predicted to contain approximately 25,000 genes spaced on average every 4.5 kb. Between 10 and 20% of the predicted genes occur as clusters of related genes, indicating that local sequence duplication and subsequent divergence generates a significant

R. Wambutt; G. Murphy; G. Volckaert; T. Pohl; A Düsterhöft; W Stiekema; K.-D Entian; N Terryn; B Harris; W Ansorge; P Brandt; L Grivell; M Rieger; M Weichselgartner; V de Simone; B Obermaier; R Mache; M Müller; M Kreis; M Delseny; P Puigdomenech; M Watson; T Schmidtheini; B Reichert; D Portatelle; M Perez-Alonso; M Boutry; I Bancroft; P Vos; J Hoheisel; W Zimmermann; H Wedler; P Ridley; S.-A Langham; B McCullagh; L Bilham; J Robben; J Van der Schueren; B Grymonprez; Y.-J Chuang; F Vandenbussche; M Braeken; I Weltjens; M Voet; I Bastiaens; R Aert; E Defoor; T Weitzenegger; G Bothe; U Ramsperger; H Hilbert; M Braun; E Holzer; A Brandt; S Peters; M van Staveren; W Dirkse; P Mooijman; R Klein Lankhorst; M Rose; J Hauf; P Kötter; S Berneiser; S Hempel; M Feldpausch; S Lamberth; H Van den Daele; A De Keyser; C Buysshaert; J Gielen; R Villarroel; R De Clercq; M Van Montagu; J Rogers; A Cronin; M Quail; S Bray-Allen; L Clark; J Doggett; S Hall; M Kay; N Lennard; K McLay; R Mayes; A Pettett; M.-A Rajandream; M Lyne; V Benes; S Rechmann; D Borkova; H Blöcker; M Scharfe; M Grimm; T.-H Löhnert; S Dose; M de Haan; A Maarse; M Schäfer; S Müller-Auer; C Gabel; M Fuchs; B Fartmann; K Granderath; D Dauner; A Herzl; S Neumann; A Argiriou; D Vitale; R Liguori; E Piravandi; O Massenet; F Quigley; G Clabauld; A Mündlein; R Felber; S Schnabl; R Hiller; W Schmidt; A Lecharny; S Aubourg; I Gy; R Cooke; C Berger; A Monfort; E Casacuberta; T Gibbons; N Weber; M Vandenbol; M Bargues; J Terol; A Torres; A Perez-Perez; B Purnelle; E Bent; S Johnson; D Tacon; T Jesse; L Heijnen; S Schwarz; P Scholler; S Heber; C Bielke; D Frishmann; D Haase; K Lemcke; H. W Mewes; S Stocker; P Zaccaria; K Mayer; C Schüller; M Bevan

2000-01-01

42

Towards a reference pecan genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

The cost of generating DNA sequence data has declined dramatically over the previous 15 years as a result of the Human Genome Project and the potential applications of genome sequencing for human medicine. This cost reduction has generated renewed interest among crop breeding scientists in applying...

43

Next generation sequencing of viral RNA genomes  

PubMed Central

Background With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

2013-01-01

44

The sequence of the human genome.  

PubMed

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge. PMID:11181995

Venter, J C; Adams, M D; Myers, E W; Li, P W; Mural, R J; Sutton, G G; Smith, H O; Yandell, M; Evans, C A; Holt, R A; Gocayne, J D; Amanatides, P; Ballew, R M; Huson, D H; Wortman, J R; Zhang, Q; Kodira, C D; Zheng, X H; Chen, L; Skupski, M; Subramanian, G; Thomas, P D; Zhang, J; Gabor Miklos, G L; Nelson, C; Broder, S; Clark, A G; Nadeau, J; McKusick, V A; Zinder, N; Levine, A J; Roberts, R J; Simon, M; Slayman, C; Hunkapiller, M; Bolanos, R; Delcher, A; Dew, I; Fasulo, D; Flanigan, M; Florea, L; Halpern, A; Hannenhalli, S; Kravitz, S; Levy, S; Mobarry, C; Reinert, K; Remington, K; Abu-Threideh, J; Beasley, E; Biddick, K; Bonazzi, V; Brandon, R; Cargill, M; Chandramouliswaran, I; Charlab, R; Chaturvedi, K; Deng, Z; Di Francesco, V; Dunn, P; Eilbeck, K; Evangelista, C; Gabrielian, A E; Gan, W; Ge, W; Gong, F; Gu, Z; Guan, P; Heiman, T J; Higgins, M E; Ji, R R; Ke, Z; Ketchum, K A; Lai, Z; Lei, Y; Li, Z; Li, J; Liang, Y; Lin, X; Lu, F; Merkulov, G V; Milshina, N; Moore, H M; Naik, A K; Narayan, V A; Neelam, B; Nusskern, D; Rusch, D B; Salzberg, S; Shao, W; Shue, B; Sun, J; Wang, Z; Wang, A; Wang, X; Wang, J; Wei, M; Wides, R; Xiao, C; Yan, C; Yao, A; Ye, J; Zhan, M; Zhang, W; Zhang, H; Zhao, Q; Zheng, L; Zhong, F; Zhong, W; Zhu, S; Zhao, S; Gilbert, D; Baumhueter, S; Spier, G; Carter, C; Cravchik, A; Woodage, T; Ali, F; An, H; Awe, A; Baldwin, D; Baden, H; Barnstead, M; Barrow, I; Beeson, K; Busam, D; Carver, A; Center, A; Cheng, M L; Curry, L; Danaher, S; Davenport, L; Desilets, R; Dietz, S; Dodson, K; Doup, L; Ferriera, S; Garg, N; Gluecksmann, A; Hart, B; Haynes, J; Haynes, C; Heiner, C; Hladun, S; Hostin, D; Houck, J; Howland, T; Ibegwam, C; Johnson, J; Kalush, F; Kline, L; Koduru, S; Love, A; Mann, F; May, D; McCawley, S; McIntosh, T; McMullen, I; Moy, M; Moy, L; Murphy, B; Nelson, K; Pfannkoch, C; Pratts, E; Puri, V; Qureshi, H; Reardon, M; Rodriguez, R; Rogers, Y H; Romblad, D; Ruhfel, B; Scott, R; Sitter, C; Smallwood, M; Stewart, E; Strong, R; Suh, E; 
Thomas, R; Tint, N N; Tse, S; Vech, C; Wang, G; Wetter, J; Williams, S; Williams, M; Windsor, S; Winn-Deen, E; Wolfe, K; Zaveri, J; Zaveri, K; Abril, J F; Guigó, R; Campbell, M J; Sjolander, K V; Karlak, B; Kejariwal, A; Mi, H; Lazareva, B; Hatton, T; Narechania, A; Diemer, K; Muruganujan, A; Guo, N; Sato, S; Bafna, V; Istrail, S; Lippert, R; Schwartz, R; Walenz, B; Yooseph, S; Allen, D; Basu, A; Baxendale, J; Blick, L; Caminha, M; Carnes-Stine, J; Caulk, P; Chiang, Y H; Coyne, M; Dahlke, C; Mays, A; Dombroski, M; Donnelly, M; Ely, D; Esparham, S; Fosler, C; Gire, H; Glanowski, S; Glasser, K; Glodek, A; Gorokhov, M; Graham, K; Gropman, B; Harris, M; Heil, J; Henderson, S; Hoover, J; Jennings, D; Jordan, C; Jordan, J; Kasha, J; Kagan, L; Kraft, C; Levitsky, A; Lewis, M; Liu, X; Lopez, J; Ma, D; Majoros, W; McDaniel, J; Murphy, S; Newman, M; Nguyen, T; Nguyen, N; Nodell, M; Pan, S; Peck, J; Peterson, M; Rowe, W; Sanders, R; Scott, J; Simpson, M; Smith, T; Sprague, A; Stockwell, T; Turner, R; Venter, E; Wang, M; Wen, M; Wu, D; Wu, M; Xia, A; Zandieh, A; Zhu, X

2001-02-16
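
The figures quoted in the record above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the constants are copied from the abstract, while the function names and the idea of deriving an "implied" genome size from the reported fold coverage are assumptions made here, not part of the original analysis.

```python
# Values quoted in the abstract above.
TOTAL_BASES = 14.8e9        # bp of shotgun sequence generated
READS = 27_271_853          # high-quality sequence reads
CONSENSUS_BP = 2.91e9       # bp of euchromatic consensus sequence
REPORTED_COVERAGE = 5.11    # fold coverage stated for the reads
SNP_SPACING_BP = 1250       # one difference per 1250 bp between two haploid genomes


def mean_read_length(total_bases: float, reads: int) -> float:
    """Average number of bases per read."""
    return total_bases / reads


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome size that makes total_bases equal to the stated fold coverage."""
    return total_bases / coverage


def expected_pairwise_differences(genome_bp: float, spacing_bp: float) -> float:
    """Expected number of differing positions between two haploid genomes."""
    return genome_bp / spacing_bp


if __name__ == "__main__":
    print(mean_read_length(TOTAL_BASES, READS))
    print(implied_genome_size(TOTAL_BASES, REPORTED_COVERAGE))
    print(expected_pairwise_differences(CONSENSUS_BP, SNP_SPACING_BP))
```

These work out to a mean read length of about 543 bp, an implied genome size of about 2.90 billion bp (close to the 2.91-billion bp consensus), and roughly 2.3 million differences between a random pair of haploid genomes.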

45

The genome sequence of Drosophila melanogaster.  

SciTech Connect

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

NONE

2000-03-24

46

DNA sequence organization in the flax genome.  

PubMed

The complexity of the flax genome has been determined by reassociation kinetics. The total complexity of one constituent genome was 3.5 × 10^8 nucleotide pairs. The single copy sequences comprised 44% of the genome and showed a long period interspersion pattern with the repetitive sequences. The repetitive sequences occurred in clusters which stretched for at least 10,000 base pairs. Within these clusters the individual repetitive elements were about 650 base pairs. These elements themselves showed little interspersion of different frequency classes in lengths less than 3,000 base pairs. The repetitive sequence duplexes formed on reassociation, except for the satellite DNA, showed a high thermal stability. The fold-back DNA comprised 1% of the total genome, and was itself clustered in a small fraction of the genome. PMID:7213728

Cullis, C A

1981-01-29

47

Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes  

PubMed Central

Background Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807

Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

2011-01-01

48

Genome sequence of Coxiella burnetii strain Namibia  

PubMed Central

We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

2014-01-01

49

Genome Sequence of Serratia plymuthica V4  

PubMed Central

Serratia spp. are gammaproteobacteria and members of the family Enterobacteriaceae. Here, we announce the genome sequence of Serratia plymuthica strain V4, which produces the siderophore serratiochelin and antimicrobial compounds. PMID:24831138

Cleto, S.; Van der Auwera, G.; Almeida, C.; Vieira, M. J.; Vlamakis, H.

2014-01-01

50

Genomic Sequence Diversity and Population Structure of Saccharomyces cerevisiae  

E-print Network

... for biological research. The genetic diversity contained in the global population of yeast strains represents ... find diversity among these strains is principally organized by geography, with European, North American ...

Fay, Justin

51

Complete Genome Sequence of Equid Herpesvirus 3  

PubMed Central

Equid herpesvirus 3 (EHV-3) is a member of the subfamily Alphaherpesvirinae that causes equine coital exanthema. Here, we report the first complete genome sequence of EHV-3. The 151,601-nt genome encodes 76 distinct genes like other equine alphaherpesviruses, but genetically, EHV-3 is significantly more divergent. PMID:25278519

Vissani, Aldana; Tordoya, Maria Silva; Muylkens, Benoît; Thiry, Etienne; Maes, Piet; Matthijnssens, Jelle; Barrandeguy, Maria; Van Ranst, Marc

2014-01-01

52

Complete Genome Sequences of Nine Mycobacteriophages  

PubMed Central

Genome analyses of a large number of mycobacteriophages, bacterial viruses that infect members of the genus Mycobacterium, yielded novel enzymes and tools for the genetic manipulation of mycobacteria. We report here the complete genome sequences of nine mycobacteriophages, including a new singleton, isolated using Mycobacterium smegmatis mc²155 as a host strain. PMID:24874666

Franceschelli, Jorgelina Judith; Suarez, Cristian Alejandro; Terán, Lucrecia; Raya, Raúl Ricardo

2014-01-01

53

Complete genome sequence of Streptomyces fulvissimus.  

PubMed

The complete genome sequence of Streptomyces fulvissimus (DSM 40593), consisting of a linear chromosome with a size of 7.9 Mbp, is reported. Preliminary data indicate that the chromosome of S. fulvissimus contains 32 putative gene clusters involved in the biosynthesis of secondary metabolites, two of them showing very high similarity to the valinomycin and nonactin biosynthetic clusters. The availability of the genome sequence of S. fulvissimus will contribute to the evaluation of the full biosynthetic potential of streptomycetes. PMID:23965270

Myronovskyi, M; Tokovenko, B; Manderscheid, N; Petzke, L; Luzhetskyy, A

2013-10-10

54

Genome sequence and analysis of Lactobacillus helveticus  

PubMed Central

The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with those of other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to accumulate, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

2013-01-01

55

Complementary DNA sequencing: Expressed sequence tags and human genome project  

SciTech Connect

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C. (National Institutes of Health, Bethesda, MD (United States)); Polymeropoulos, M.H.; Hong Xiao; Merril, C.R. (National Inst. of Mental Health, Washington, DC (United States))

1991-06-21

56

Sequencing and comparing whole mitochondrial genomes of animals  

SciTech Connect

Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

2005-04-22

57

From sequence mapping to genome assemblies.  

PubMed

The development of "next-generation" high-throughput sequencing technologies has made it possible for many labs to undertake sequencing-based research projects that were unthinkable just a few years ago. Although the scientific applications are diverse (e.g., new genome projects, gene expression analysis, genome-wide functional screens, or epigenetics), the sequence data are usually processed in one of two ways: sequence reads are either mapped to an existing reference sequence, or they are built into a new sequence ("de novo assembly"). In this chapter, we first discuss some limitations of the mapping process and how these may be overcome through local sequence assembly. We then introduce the concept of de novo assembly and describe essential assembly improvement procedures such as scaffolding, contig ordering, gap closure, error evaluation, gene annotation transfer and ab initio gene annotation. The results are high-quality draft assemblies that will facilitate informative downstream analyses. PMID:25388106

Otto, Thomas D

2015-01-01

58

Complete genome sequence of arracacha mottle virus.  

PubMed

Arracacha mottle virus (AMoV) is the only potyvirus reported to infect arracacha (Arracacia xanthorrhiza) in Brazil. Here, the complete genome sequence of an isolate of AMoV was determined to be 9,630 nucleotides in length, excluding the 3' poly-A tail, and encoding a polyprotein of 3,135 amino acids and a putative P3N-PIPO protein. Its genomic organization is typical of a member of the genus Potyvirus, containing all conserved motifs. Its full genome sequence shared 56.2% nucleotide identity with sunflower chlorotic mottle virus and verbena virus Y, the most closely related viruses. PMID:23001696

Orílio, Anelise F; Lucinda, Natalia; Dusi, André N; Nagata, Tatsuya; Inoue-Nagata, Alice K

2013-01-01

59

Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements  

Microsoft Academic Search

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments

Aaron C. E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna

2004-01-01

60

Genomic Sequencing of Single Microbial Cells from Environmental Samples  

SciTech Connect

Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

2008-02-01

61

Complete Genome Sequences of 63 Mycobacteriophages  

PubMed Central

Mycobacteriophages are viruses that infect mycobacterial hosts. The current collection of sequenced mycobacteriophages, all isolated on a single host strain, Mycobacterium smegmatis mc²155, reveals substantial genetic diversity. The complete genome sequences of 63 newly isolated mycobacteriophages expand the resolution of our understanding of phage diversity. PMID:24285655

2013-01-01

62

Locus Reference Genomic: reference sequences for the reporting of clinically relevant sequence variants  

PubMed Central

Locus Reference Genomic (LRG; http://www.lrg-sequence.org/) records contain internationally recognized stable reference sequences designed specifically for reporting clinically relevant sequence variants. Each LRG is contained within a single file consisting of a stable ‘fixed’ section and a regularly updated ‘updatable’ section. The fixed section contains stable genomic DNA sequence for a genomic region, essential transcripts and proteins for variant reporting and an exon numbering system. The updatable section contains mapping information, annotation of all transcripts and overlapping genes in the region and legacy exon and amino acid numbering systems. LRGs provide a stable framework that is vital for reporting variants, according to Human Genome Variation Society (HGVS) conventions, in genomic DNA, transcript or protein coordinates. To enable translation of information between LRG and genomic coordinates, LRGs include mapping to the human genome assembly. LRGs are compiled and maintained by the National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI). LRG reference sequences are selected in collaboration with the diagnostic and research communities, locus-specific database curators and mutation consortia. Currently >700 LRGs have been created, of which >400 are publicly available. The aim is to create an LRG for every locus with clinical implications. PMID:24285302

MacArthur, Jacqueline A. L.; Morales, Joannella; Tully, Ray E.; Astashyn, Alex; Gil, Laurent; Bruford, Elspeth A.; Larsson, Pontus; Flicek, Paul; Dalgleish, Raymond; Maglott, Donna R.; Cunningham, Fiona

2014-01-01

63

Computational Genomics: From Genome Sequence To Global Gene Regulation  

NASA Astrophysics Data System (ADS)

As various genome projects are shifting to the post-sequencing phase, it becomes a big challenge to analyze the sequence data and extract biological information using computational tools. In the past, computational genomics has mainly focused on finding new genes and mapping out their biological functions. With the rapid accumulation of experimental data on genome-wide gene activities, it is now possible to understand how genes are regulated on a genomic scale. A major mechanism for gene regulation is to control the level of transcription, which is achieved by regulatory proteins that bind to short DNA sequences, the regulatory elements. We have developed a new approach to identifying regulatory elements in genomes. The approach formalizes how one would proceed to decipher a "text" consisting of a long string of letters written in an unknown language that did not delineate words. The algorithm is based on a statistical mechanics model in which the sequence is segmented probabilistically into "words" and a "dictionary" of "words" is built concurrently. For the control regions in the yeast genome, we built a "dictionary" of about one thousand words which includes many known as well as putative regulatory elements. I will discuss how we can use this dictionary to search for genes that are likely to be regulated in a similar fashion and to analyze gene expression data generated from DNA micro-array experiments.

Li, Hao

2000-03-01
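
The probabilistic segmentation idea in the record above can be illustrated with a small dynamic program. This is a minimal sketch under stated assumptions, not the author's algorithm: it takes a fixed, hypothetical dictionary with made-up word probabilities and only sums the probability of every way to split a sequence into dictionary words (the partition function of the segmentation model). It does not build the dictionary concurrently, as the abstract describes.

```python
def segmentation_likelihood(sequence: str, word_probs: dict) -> float:
    """Sum, over all segmentations into dictionary words, of the product of word probabilities."""
    max_len = max(len(word) for word in word_probs)
    # z[i] holds the total probability of all segmentations of sequence[:i].
    z = [0.0] * (len(sequence) + 1)
    z[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for end in range(1, len(sequence) + 1):
        for length in range(1, min(max_len, end) + 1):
            word = sequence[end - length:end]
            prob = word_probs.get(word)
            if prob:
                # Extend every segmentation of the shorter prefix by this word.
                z[end] += prob * z[end - length]
    return z[-1]


if __name__ == "__main__":
    # Hypothetical toy dictionary; the probabilities are illustrative only.
    toy_dictionary = {"A": 0.5, "T": 0.3, "AT": 0.2}
    print(segmentation_likelihood("ATA", toy_dictionary))
```

For the toy dictionary, "AT" can be read either as the two words "A" and "T" or as the single word "AT", and the function adds both contributions. A full method would also re-estimate the word probabilities and propose new words, which is outside this sketch.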

64

Next-generation sequencing applied to rare diseases genomics.  

PubMed

Genomics has revolutionized the study of rare diseases. In this review, we survey the latest technological developments, rare disease discoveries, implementation obstacles and bioethical challenges. First, we discuss the technology of genome and exome sequencing, including the different next-generation platforms and exome enrichment technologies. Second, we survey the pioneering centers and discoveries for rare diseases, including a few of the research institutions that have contributed to the field, as well as an overview of the types of rare diseases for which next-generation sequencing has led to new discoveries. Third, we discuss the obstacles and challenges to clinical implementation, including return of results, informed consent and privacy. Last, we discuss the possible outlook as clinical genomics gains wider adoption and third-generation sequencing appears on the horizon, along with some needs in informatics and software to further advance the field. PMID:24702023

Danielsson, Krissi; Mun, Liew Jun; Lordemann, Amanda; Mao, Jimmy; Lin, Cheng-Ho Jimmy

2014-05-01

65

Standardized metadata for human pathogen/vector genomic sequences.  

PubMed

High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium's minimal information (MIxS) and NCBI's BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. PMID:24936976

Dugan, Vivien G; Emrich, Scott J; Giraldo-Calderón, Gloria I; Harb, Omar S; Newman, Ruchi M; Pickett, Brett E; Schriml, Lynn M; Stockwell, Timothy B; Stoeckert, Christian J; Sullivan, Dan E; Singh, Indresh; Ward, Doyle V; Yao, Alison; Zheng, Jie; Barrett, Tanya; Birren, Bruce; Brinkac, Lauren; Bruno, Vincent M; Caler, Elizabet; Chapman, Sinéad; Collins, Frank H; Cuomo, Christina A; Di Francesco, Valentina; Durkin, Scott; Eppinger, Mark; Feldgarden, Michael; Fraser, Claire; Fricke, W Florian; Giovanni, Maria; Henn, Matthew R; Hine, Erin; Hotopp, Julie Dunning; Karsch-Mizrachi, Ilene; Kissinger, Jessica C; Lee, Eun Mi; Mathur, Punam; Mongodin, Emmanuel F; Murphy, Cheryl I; Myers, Garry; Neafsey, Daniel E; Nelson, Karen E; Nierman, William C; Puzak, Julia; Rasko, David; Roos, David S; Sadzewicz, Lisa; Silva, Joana C; Sobral, Bruno; Squires, R Burke; Stevens, Rick L; Tallon, Luke; Tettelin, Herve; Wentworth, David; White, Owen; Will, Rebecca; Wortman, Jennifer; Zhang, Yun; Scheuermann, Richard H

2014-01-01

66

Genome Sequence of the Palaeopolyploid soybean  

SciTech Connect

Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

2009-08-03

67

Viral genome sequencing by random priming methods  

PubMed Central

Background: Most emerging health threats are of zoonotic origin. For the overwhelming majority, their causative agents are RNA viruses which include but are not limited to HIV, Influenza, SARS, Ebola, Dengue, and Hantavirus. Of increasing importance therefore is a better understanding of global viral diversity to enable better surveillance and prediction of pandemic threats; this will require rapid and flexible methods for complete viral genome sequencing. Results: We have adapted the SISPA methodology [1-3] to genome sequencing of RNA and DNA viruses. We have demonstrated the utility of the method on various types and sources of viruses, obtaining near complete genome sequence of viruses ranging in size from 3,000–15,000 kb with a median depth of coverage of 14.33. We used this technique to generate full viral genome sequence in the presence of host contaminants, using viral preparations from cell culture supernatant, allantoic fluid and fecal matter. Conclusion: The method described is of great utility in generating whole genome assemblies for viruses with little or no available sequence information, viruses from greatly divergent families, previously uncharacterized viruses, or to more fully describe mixed viral infections. PMID:18179705

Djikeng, Appolinaire; Halpin, Rebecca; Kuzmickas, Ryan; DePasse, Jay; Feldblyum, Jeremy; Sengamalay, Naomi; Afonso, Claudio; Zhang, Xinsheng; Anderson, Norman G; Ghedin, Elodie; Spiro, David J

2008-01-01

68

Human Genome Project Sequencing, 3D animation with basic narration. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

DNAi Location: Genome>Project>putting it together>Mapping the genome. As represented by this huge stack of paper, the human genome contains more than three billion nucleotides or DNA letters. The first stage of the public Human Genome Project focused on identifying marker sequences or unique tags (shown here in yellow) at regular intervals throughout this "book of life." Once enough sequences were tagged, various blocks of the genome were allocated to different academic centers for sequencing.

2008-10-06

69

Sequencing and Utilization of the Gossypium Genomes  

Microsoft Academic Search

Revealing the genetic underpinnings of cotton productivity will require understanding both the prehistoric evolution of spinnable fibers, and the results of independent domestication processes in both the Old and New Worlds. Progress toward a reference sequence for the smallest Gossypium genome is a logical stepping-stone toward revealing diversity in the remaining seven genomes (A, B, C, E, F, G, K)

Andrew H. Paterson; Jun-kang Rong; Alan R. Gingle; Peng W. Chee; Elizabeth S. Dennis; Danny Llewellyn; Leon S. Dure; Candace Haigler; Gerald O. Myers; Daniel G. Peterson; Mehboob ur Rahman; Yusuf Zafar; Umesh Reddy; Yehoshua Saranga; James M. Stewart; Joshua A. Udall; Vijay N. Waghmare; Jonathan F. Wendel; Thea A. Wilkins; Robert J. Wright; Essam Zaki; Elsayed E. Hafez; Jun Zhu

2010-01-01

70

Sequencing and comparative analysis of the gorilla MHC genomic sequence.  

PubMed

Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

2013-01-01

71

Sequencing and comparative analysis of the gorilla MHC genomic sequence  

PubMed Central

Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC. PMID:23589541

Wilming, Laurens G.; Hart, Elizabeth A.; Coggill, Penny C.; Horton, Roger; Gilbert, James G. R.; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L.

2013-01-01

72

Noninvasive fetal genome sequencing: a primer  

PubMed Central

We recently demonstrated whole genome sequencing of a human fetus using only parental DNA samples and plasma from the pregnant mother. This proof-of-concept study demonstrated how samples obtained noninvasively in the first or second trimester can be analyzed to yield a highly accurate and substantially complete genetic profile of the fetus, including both inherited and de novo variation. Here, we revisit our original study from a clinical standpoint, provide an overview of the scientific approach, and describe opportunities and challenges along the path towards clinical adoption of noninvasive fetal whole genome sequencing (NIFWGS). PMID:23553552

Snyder, Matthew W.; Simmons, LaVone E.; Kitzman, Jacob O.; Santillan, Donna A.; Santillan, Mark K.; Gammill, Hilary S.; Shendure, Jay

2013-01-01

73

Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus  

E-print Network

Genome Sequence of Mercury-Methylating and Pleomorphic Desulfovibrio africanus Contact: Steven D. africanus genome sequence to allow us to gain insights into the physiological states genomics using the sequence information for D. africanus and the previously sequenced mercury methylator D

74

Mapping and sequencing the human genome  

SciTech Connect

Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

none,

1988-01-01

75

The complete genome sequence of Mycobacterium bovis  

PubMed Central

Mycobacterium bovis is the causative agent of tuberculosis in a range of animal species and man, with worldwide annual losses to agriculture of $3 billion. The human burden of tuberculosis caused by the bovine tubercle bacillus is still largely unknown. M. bovis was also the progenitor for the M. bovis bacillus Calmette–Guérin vaccine strain, the most widely used human vaccine. Here we describe the 4,345,492-bp genome sequence of M. bovis AF2122/97 and its comparison with the genomes of Mycobacterium tuberculosis and Mycobacterium leprae. Strikingly, the genome sequence of M. bovis is >99.95% identical to that of M. tuberculosis, but deletion of genetic information has led to a reduced genome size. Comparison with M. leprae reveals a number of common gene losses, suggesting the removal of functional redundancy. Cell wall components and secreted proteins show the greatest variation, indicating their potential role in host–bacillus interactions or immune evasion. Furthermore, there are no genes unique to M. bovis, implying that differential gene expression may be the key to the host tropisms of human and bovine bacilli. The genome sequence therefore offers major insight on the evolution, host preference, and pathobiology of M. bovis. PMID:12788972

Garnier, Thierry; Eiglmeier, Karin; Camus, Jean-Christophe; Medina, Nadine; Mansoor, Huma; Pryor, Melinda; Duthoy, Stephanie; Grondin, Sophie; Lacroix, Celine; Monsempe, Christel; Simon, Sylvie; Harris, Barbara; Atkin, Rebecca; Doggett, Jon; Mayes, Rebecca; Keating, Lisa; Wheeler, Paul R.; Parkhill, Julian; Barrell, Bart G.; Cole, Stewart T.; Gordon, Stephen V.; Hewinson, R. Glyn

2003-01-01

76

Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria  

PubMed Central

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442

Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

2012-01-01
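The final step described in the abstract above, determining the sequence type from the combination of alleles identified, amounts to a table lookup. The sketch below shows only that step, under stated assumptions: the locus names, allele numbers, and profile table are hypothetical placeholders, and the BLAST-based allele ranking that the published method uses to find the alleles is not reproduced here.

```python
def sequence_type(alleles, loci, profiles):
    """Look up the sequence type for one isolate's allele calls.

    alleles:  dict mapping locus name to the called allele number
    loci:     ordered list of locus names in the scheme
    profiles: dict mapping a tuple of allele numbers (in loci order) to a sequence type
    Returns None if a locus has no call or the combination is not in the table.
    """
    key = tuple(alleles.get(locus) for locus in loci)
    if None in key:
        return None
    return profiles.get(key)


# Hypothetical three-locus scheme and profile table, for illustration only.
example_loci = ["locusA", "locusB", "locusC"]
example_profiles = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 5): 7,
}

print(sequence_type({"locusA": 3, "locusB": 2, "locusC": 5}, example_loci, example_profiles))
```

Returning None for an unknown combination keeps a possibly novel profile distinct from a successful match, so a caller can flag it for follow-up.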

77

Draft Genome Sequence of Bacillus oceanisediminis 2691  

PubMed Central

Bacillus oceanisediminis 2691 is an aerobic, Gram-positive, spore-forming, and moderately halophilic bacterium that was isolated from marine sediment of the Yellow Sea coast of South Korea. Here, we report the draft genome sequence of B. oceanisediminis 2691 that may have an important role in the bioremediation of marine sediment. PMID:23105082

Lee, Yong-Jik; Lee, Sang-Jae; Jeong, Haeyoung; Kim, Hyun Ju; Ryu, Naeun; Kim, Byoung-Chan; Lee, Han-Seung

2012-01-01

78

Supplementary Information Genome Sequence and Assembly  

E-print Network

and transferred to darkness for 2 days prior to nuclei isolation to reduce starch levels. Nuclei were prepared 1 with an additional Percoll gradient purification of nuclei. High molecular weight DNA was extracted and purified input. The whole genome shotgun strategy involved end-sequencing different sized insert libraries

Green, Pamela

79

Genome Sequence of Corynebacterium ulcerans Strain 210932  

PubMed Central

In this work, we present the complete genome sequence of Corynebacterium ulcerans strain 210932, isolated from a human. The species is an emergent pathogen that infects a variety of wild and domesticated animals and humans. It is associated with a growing number of cases of a diphtheria-like disease around the world. PMID:25428977

Viana, Marcus Vinicius Canário; de Jesus Benevides, Leandro; Batista Mariano, Diego Cesar; de Souza Rocha, Flávia; Bagano Vilas Boas, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Alves Dorella, Fernanda; Gomes Leal, Carlos Augusto; Fiorini de Carvalho, Alex; Silva, Artur; de Castro Soares, Siomar; Pereira Figueiredo, Henrique Cesar; Guimarães, Luis Carlos

2014-01-01

80

Complete Genome Sequence of Treponema pallidum, the  

E-print Network

Complete Genome Sequence of Treponema pallidum, the Syphilis Spirochete Claire M. Fraser,* Steven J and substantiates the considerable diversity observed among pathogenic spirochetes. Venereal syphilis was first century with the age of exploration. Syphilis was ubiquitous by the 19th century and has been called

Salzberg, Steven

81

Genomic Sequencing George M. Church, Walter Gilbert  

E-print Network

combined with complete restriction enzyme digestion and separation by size on a denaturing gel preserves the 3' or 5' end of one restriction site in the genome. Numerous different sequences can be obtained and hybridized to a short single-stranded 32P-labeled probe specific for one end of one restriction fragment

Church, George M.

82

A draft sequence of the Neandertal genome.  

PubMed

Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other. PMID:20448178

Green, Richard E; Krause, Johannes; Briggs, Adrian W; Maricic, Tomislav; Stenzel, Udo; Kircher, Martin; Patterson, Nick; Li, Heng; Zhai, Weiwei; Fritz, Markus Hsi-Yang; Hansen, Nancy F; Durand, Eric Y; Malaspinas, Anna-Sapfo; Jensen, Jeffrey D; Marques-Bonet, Tomas; Alkan, Can; Prüfer, Kay; Meyer, Matthias; Burbano, Hernán A; Good, Jeffrey M; Schultz, Rigo; Aximu-Petri, Ayinuer; Butthof, Anne; Höber, Barbara; Höffner, Barbara; Siegemund, Madlen; Weihmann, Antje; Nusbaum, Chad; Lander, Eric S; Russ, Carsten; Novod, Nathaniel; Affourtit, Jason; Egholm, Michael; Verna, Christine; Rudan, Pavao; Brajkovic, Dejana; Kucan, Zeljko; Gusic, Ivan; Doronichev, Vladimir B; Golovanova, Liubov V; Lalueza-Fox, Carles; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Schmitz, Ralf W; Johnson, Philip L F; Eichler, Evan E; Falush, Daniel; Birney, Ewan; Mullikin, James C; Slatkin, Montgomery; Nielsen, Rasmus; Kelso, Janet; Lachmann, Michael; Reich, David; Pääbo, Svante

2010-05-01

83

Gambling on a shortcut to genome sequencing  

SciTech Connect

Almost from the start of the Human Genome Project, a debate has been raging over whether to sequence the entire human genome, all 3 billion bases, or just the genes - a mere 2% or 3% of the genome, and by far the most interesting part. In England, Sydney Brenner convinced the Medical Research Council (MRC) to start with the expressed genes, or complementary DNAs. But the US stance has been that the entire sequence is essential if we are to understand the blueprint of man. Craig Venter of the National Institute of Neurological Disorders and Stroke says that focusing on the expressed genes may be even more useful than expected. His strategy involves randomly selecting clones from cDNA libraries which theoretically contain all the genes that are switched on at a particular time in a particular tissue. Then the researchers sequence just a short stretch of each clone, about 400 to 500 bases, to create an expressed sequence tag, or EST. The sequences of these ESTs are then stored in a database. Using that information, other researchers can then recreate that EST by using polymerase chain reaction techniques.

Roberts, L.

1991-06-21

84

Genome, Epigenome and RNA sequences of Monozygotic Twins Discordant for Multiple Sclerosis  

SciTech Connect

Neil Miller, Deputy Director of Software Engineering at the National Center for Genome Resources, discusses a monozygotic twin study on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

Miller, Neil [National Center for Genome Resources

2010-06-02

85

Defining Genome Project Standards in a New Era of Sequencing  

SciTech Connect

Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished"

Chain, Patrick [DOE-JGI

2009-05-27

86

The Genome Sequence of Drosophila melanogaster  

NSDL National Science Digital Library

On Thursday, March 23, 2000, a historic milestone was marked as researchers announced they have completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing super-computers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

Ramanujan, Krishna.

87

Whole-genome sequencing in bacteriology: state of the art  

PubMed Central

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

Dark, Michael J

2013-01-01

88

Comparative Analysis of Genome Sequences with VISTA  

DOE Data Explorer

VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

Dubchak, Inna

89

Agaricus bisporus genome sequence: a commentary.  

PubMed

The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. PMID:23558250

Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

2013-06-01

90

Genome sequencing and analysis of the model grass Brachypodium distachyon  

E-print Network

ARTICLES Genome sequencing and analysis of the model grass Brachypodium distachyon The International Brachypodium Initiative* Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our

Green, Pamela

91

Initial sequencing and comparative analysis of the mouse genome  

Microsoft Academic Search

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. 
Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

2002-01-01

92

Sequence and comparative analysis of the chicken genome provide unique  

E-print Network

We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken

Edwards, Scott

93

Recurrence time statistics: Versatile tools for genomic DNA sequence analysis  

E-print Network

enables us to carry out sequence analysis on the whole genomic scale by a PC. Keywords: Genomic DNA ... Recurrence time statistics: Versatile tools for genomic DNA sequence analysis. Yinhe Cao, Wen ... from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often

Gao, Jianbo

94

Complete mitochondrial genome sequence of Procypris merus.  

PubMed

In this study, the complete mitochondrial genome of Procypris merus was sequenced. It was determined to be 16,581 bp long and included 13 protein-coding genes, 22 tRNA genes, 2 ribosomal RNA genes and 1 non-coding region (D-loop). The base composition of the heavy strand was 31.91% A, 24.95% T, 27.39% C and 15.75% G, which is similar to other cyprinid fish mitochondrial genomes. All protein-coding genes had ATG as the start codon, but the stop codons were of three types: nine genes end with TAA or TAG, and the COII, ND4, ND6 and Cytb genes terminate with an incomplete stop codon (T). The complete mitochondrial genome may provide important DNA molecular data for further phylogenetic analyses of higher taxa of Cyprinidae. PMID:25350737

Chen, Yuanhua; Wu, Ping; Chen, Dunxue; Li, Zongjun; Bin, Shi-Yu

2014-10-28
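
The base composition quoted in the Procypris merus record above is a per-base percentage of sequence length. A minimal Python sketch of that calculation follows; the function name `base_composition` and the toy sequence are illustrative assumptions, not data or code from the cited study.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Toy sequence for illustration only (not the P. merus mitogenome).
toy = "AAAATTTCCG"
composition = base_composition(toy)

# Rank bases from most to least frequent.
ranked = sorted(composition.items(), key=lambda item: item[1], reverse=True)
for base, percent in ranked:
    print(f"{base}: {percent:.2f}%")
```

Applied to a real heavy-strand sequence, the four percentages should sum to 100 (up to rounding), as the reported values 31.91, 24.95, 27.39 and 15.75 do.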

95

Cactus: Algorithms for genome multiple sequence alignment  

PubMed Central

Much attention has been given to the problem of creating reliable multiple sequence alignments in a model incorporating substitutions, insertions, and deletions. Far less attention has been paid to the problem of optimizing alignments in the presence of more general rearrangement and copy number variation. Using Cactus graphs, recently introduced for representing sequence alignments, we describe two complementary algorithms for creating genomic alignments. We have implemented these algorithms in the new “Cactus” alignment program. We test Cactus using the Evolver genome evolution simulator, a comprehensive new tool for simulation, and show using these and existing simulations that Cactus significantly outperforms all of its peers. Finally, we make an empirical assessment of Cactus's ability to properly align genes and find interesting cases of intra-gene duplication within the primates. PMID:21665927

Paten, Benedict; Earl, Dent; Nguyen, Ngan; Diekhans, Mark; Zerbino, Daniel; Haussler, David

2011-01-01

96

Bisulfite genomic sequencing of microdissected cells  

PubMed Central

Mapping of methylation patterns in CpG islands has become an important tool for understanding tissue-specific gene expression in both normal and pathological situations. However, the inherent cellular heterogeneity of any given tissue can affect the outcome and interpretation of molecular studies. In order to analyse genomic DNA methylation in a pure cell population from a tissue sample, we have developed a simple technique of single-cell microdissection from cryostat sections which can be combined with bisulfite-mediated sequencing of 5-methylcytosine. We report here our results on the methylation status of the androgen receptor gene studied by bisulfite genomic sequencing on purified cells isolated from human testis. PMID:11691943

Kerjean, Antoine; Vieillefond, Annick; Thiounn, Nicolas; Sibony, Mathilde; Jeanpierre, Marc; Jouannet, Pierre

2001-01-01
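
Bisulfite genomic sequencing, as used in the record above, relies on a simple read-out: bisulfite treatment converts unmethylated cytosines so that they are read as thymine after amplification, while 5-methylcytosine is protected and still reads as cytosine. The Python sketch below simulates that conversion and the subsequent methylation call on a toy sequence. The function names, the sequence, and the methylated position are hypothetical and are not taken from the cited study.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T; C at a position listed in
    `methylated` (0-based) is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference: str, converted: str) -> list[int]:
    """Return 0-based positions where a reference C is still read as C."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted.upper()))
        if ref == "C" and obs == "C"
    ]


# Toy example: cytosines at positions 1, 4 and 5; only position 1 is methylated.
reference = "ACGTCCGA"
converted = bisulfite_convert(reference, methylated={1})
methylated_sites = call_methylation(reference, converted)
print(converted, methylated_sites)
```

Real analyses also have to handle incomplete conversion and strand-specific alignment, which this sketch deliberately ignores.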

97

Complete genome sequence of Ikoma lyssavirus.  

PubMed

Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isolated from an African civet in Tanzania displaying clinical signs of rabies. Genetically, this virus is the most divergent within the genus Lyssavirus. Characterization of the genome will help to improve our understanding of lyssavirus diversity and enable investigation into vaccine-induced immunity and protection. PMID:22923801

Marston, Denise A; Ellis, Richard J; Horton, Daniel L; Kuzmin, Ivan V; Wise, Emma L; McElhinney, Lorraine M; Banyard, Ashley C; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E; Fooks, Anthony R

2012-09-01

98

Genome Sequence of Aerococcus viridans LL1  

PubMed Central

Aerococcus viridans is a catalase-negative Gram-positive bacterium and has been described as an airborne organism widely distributed in the hospital environment or in clinical specimens. We isolated A. viridans strain LL1 from indoor dust samples collected by a patient. Here, we prepared a genome sequence for this strain consisting of 31 contigs totaling 1,994,039 bases and a GC content of 39.42%. PMID:22815455

Qin, Nan; Zheng, Beiwen; Yang, Fengling; Chen, Yanfei; Guo, Jing; Hu, Xinjun

2012-01-01
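
The Aerococcus viridans record above summarises a draft assembly by contig count, total length and GC content. As a hedged illustration, these figures can be computed from a list of contig sequences as follows; `assembly_summary` and the toy contigs are assumptions made for this example and are unrelated to the LL1 data.

```python
def assembly_summary(contigs: list[str]) -> tuple[int, int, float]:
    """Return (number of contigs, total bases, GC content in percent)."""
    total = sum(len(contig) for contig in contigs)
    if total == 0:
        raise ValueError("assembly contains no bases")
    gc = sum(
        contig.upper().count("G") + contig.upper().count("C")
        for contig in contigs
    )
    return len(contigs), total, 100.0 * gc / total


# Three toy contigs for illustration only.
toy_contigs = ["ATGCGC", "TTAA", "GGCCATAT"]
n_contigs, total_bases, gc_percent = assembly_summary(toy_contigs)
print(f"{n_contigs} contigs, {total_bases} bases, GC = {gc_percent:.2f}%")
```

GC content is computed over the pooled bases rather than averaged per contig, so that long contigs carry proportionally more weight.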

99

The genome sequence of Schizosaccharomyces pombe  

Microsoft Academic Search

We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

2002-01-01

100

Registered report: Melanoma genome sequencing reveals frequent PREX2 mutations.  

PubMed

The Reproducibility Project: Cancer Biology seeks to address growing concerns about reproducibility in scientific research by conducting replications of 50 papers in the field of cancer biology published between 2010 and 2012. This Registered Report describes the proposed replication plan of key experiments from "Melanoma genome sequencing reveals frequent PREX2 mutations" by Berger and colleagues, published in Nature in 2012 (Berger et al., 2012). The key experiments that will be replicated are those reported in Figure 3B and Supplementary Figure S6. In these experiments, Berger and colleagues show that somatic PREX2 mutations identified through whole-genome sequencing of human melanoma can contribute to enhanced lethality of tumor xenografts in nude mice (Figure 3B, S6B, and S6C; Berger et al., 2012). The Reproducibility Project: Cancer Biology is a collaboration between the Center for Open Science and Science Exchange, and the results of the replications will be published by eLife. PMID:25490935

Chroscinski, Denise; Sampey, Darryl; Hewitt, Alex

2014-01-01

101

Why Assembling Plant Genome Sequences Is So Challenging  

PubMed Central

In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

2012-01-01

102

Selected Insights from Application of Whole Genome Sequencing for Outbreak Investigations  

PubMed Central

Purpose of review: The advent of high-throughput whole genome sequencing has the potential to revolutionize the conduct of outbreak investigation. Because of its ultimate pathogen strain resolution, whole genome sequencing could augment traditional epidemiologic investigations of infectious disease outbreaks. Recent findings: The combination of whole genome sequencing and intensive epidemiologic analysis provided new insights on the sources and transmission dynamics of large-scale epidemics caused by Escherichia coli and Vibrio cholerae, nosocomial outbreaks caused by methicillin-resistant Staphylococcus aureus, Klebsiella pneumoniae, and Mycobacterium abscessus, community-centered outbreaks caused by Mycobacterium tuberculosis, and a natural disaster-associated outbreak caused by environmentally acquired molds. Summary: When combined with traditional epidemiologic investigation, whole genome sequencing has proven useful for elucidating sources and transmission dynamics of disease outbreaks. Development of a fully automated bioinformatics pipeline for analysis of whole genome sequence data is much needed to make this powerful tool more widely accessible. PMID:23856896

Le, Vien Thi Minh; Diep, Binh An

2014-01-01

103

Mapping DNA-protein interactions in large genomes by sequence tag analysis of genomic enrichment  

E-print Network

analysis of gene expression (SAGE)12,13, but the template for STAGE consists of genomic loci enriched ... Mapping DNA-protein interactions in large genomes by sequence tag analysis of genomic enrichment ... gene expression programs. We have developed an unbiased genomic method called sequence tag analysis

Cai, Long

104

Ten years of bacterial genome sequencing: comparative-genomics-based discoveries  

Microsoft Academic Search

It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

2006-01-01

105

The Jackson Laboratory: The Mouse Genome Sequence Project  

NSDL National Science Digital Library

Part of the Mouse Genome Informatics program (last reported on in the NSDL Scout Report for the Life Sciences on March 19, 2004) at the Jackson Laboratory, this website presents The Mouse Genome Sequence (MGS) project. MGS is designed "to integrate emerging mouse genomic sequence data with the genetic and biological data available in MGD and GXD." The site links to Eukaryotic Genome Annotation Projects, as well as Sequence Analysis Tools including MouseBlast and Genome Analysis. The site also offers basic background information about the Mouse Genome Sequencing Initiative, and provides site users with access to groups involved in mouse genome sequencing, the BAC clone library, request forms for targeted sequencing, and more.

106

Porcine parvovirus: DNA sequence and genome organization.  

PubMed

We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV. PMID:2794971

Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

1989-10-01

107

Optimizing the BAC-End Strategy for Sequencing the Human Genome

E-print Network

University, Tel Aviv, 69978, Israel. ... 1 Introduction With the Human Genome Project moving from the map ... sequencing has become central. The classical strategy set forth by the founders of the Human Genome Project ... Optimizing the BAC-End Strategy for Sequencing the Human Genome. Richard M. Karp, Ron Shamir

Shamir, Ron

108

Ancient human genome sequence of an extinct Palaeo-Eskimo  

E-print Network

. Inconsistencies in neanderthal genomic DNA sequences. PLoS Genet. 3, 1862–1866 (2007). 12. Miller, W. et al. Sequencing the nuclear genome of the extinct woolly mammoth. Nature 456, 387–390 (2008). 13. Green, R. E. et al. The neandertal genome and ancient DNA...

Rasmussen, Morten; Li, Yingrui; Lindgreen, Stinus; Pedersen, Jakob Skou; Albrechtsen, Anders; Moltke, Ida; Metspalu, Mait; Metspalu, Ene; Kivisild, Toomas; Gupta, Ramneek; Bertalan, Marcelo; Nielsen, Kasper; Gilbert, M. Thomas P.; Wang, Yong; Raghavan, Maanasa; Campos, Paula F.; Kamp, Hanne Munkholm; Wilson, Andrew S.; Gledhill, Andrew; Tridico, Silvana; Bunce, Michael; Lorenzen, Eline D.; Binladen, Jonas; Guo, Xiaosen; Zhao, Jing; Zhang, Xiuqing; Zhang, Hao; Li, Zhuo; Chen, Minfeng; Orlando, Ludovic; Kristiansen, Karsten; Bak, Mads; Tommerup, Niels; Bendixen, Christian; Pierre, Tracey L.; Gronnow, Bjarne; Meldgaard, Morten; Andreasen, Claus; Fedorova, Sardana A.; Osipova, Ludmila P.; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Hansen, Thomas v. O.; Nielsen, Finn C.; Crawford, Michael H.; Brunak, Soren; Sicheritz-Ponten, Thomas; Villems, Richard; Nielsen, Rasmus; Krogh, Anders; Wang, Jun; Willerslev, Eske

2010-02-11

109

Draft Genome Sequence of Geotrichum candidum Strain 3C.  

PubMed

We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

Polev, Dmitrii E; Bobrov, Kirill S; Eneyskaya, Elena V; Kulminskaya, Anna A

2014-01-01

110

Initial sequencing and comparative analysis of the mouse genome  

E-print Network

and knockin techniques17–22. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

Eddy, Sean

111

Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis  

Microsoft Academic Search

The 4 202 353 bp genome of the alkaliphilic bacterium Bacillus halodurans C-125 contains 4066 predicted protein coding sequences (CDSs), 2141 (52.7%) of which have functional assignments, 1182 (29%) of which are conserved CDSs with unknown function and 743 (18.3%) of which have no match to any protein database. Among the total CDSs, 8.8% match sequences of proteins found only

Hideto Takami; Kaoru Nakasone; Yoshihiro Takaki; Go Maeno; Rumie Sasaki; Noriaki Masui; Fumie Fuji; Chie Hirama; Yuka Nakamura; Naotake Ogasawara; Satoru Kuhara; Koki Horikoshi

2000-01-01
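
The percentages in the Bacillus halodurans record above follow directly from the reported CDS counts. The short Python check below reproduces them. The category labels are paraphrased from the abstract; the variable names are illustrative.

```python
# CDS counts as reported in the abstract for Bacillus halodurans C-125.
total_cds = 4066
categories = {
    "functional assignment": 2141,
    "conserved, unknown function": 1182,
    "no database match": 743,
}

# The three categories should partition the full set of predicted CDSs.
partition_ok = sum(categories.values()) == total_cds

# Share of each category, rounded to one decimal place.
percentages = {
    name: round(100 * count / total_cds, 1)
    for name, count in categories.items()
}

for name, percent in percentages.items():
    print(f"{name}: {percent}%")
```

The three counts sum to 4066, and the middle category comes out at 29.1%, which the abstract reports rounded to 29%.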

112

Initial impact of the sequencing of the human genome  

E-print Network

The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.

113

Identification and annotation of repetitive sequences in fungal genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

Cheaper and faster sequencing technologies have fundamentally changed the pace of genome sequencing projects and have contributed to the ever-increasing volume of genomic data. This has been paralleled by an increase in computational power and resources to process and translate raw sequence data int...

114

Next Generation Sequencing at the University of Chicago Genomics Core  

SciTech Connect

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) with access to state-of-the-art genomics capabilities: next-generation sequencing, Sanger sequencing/genotyping and microarrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high-throughput sequencing analysis.

Faber, Pieter [University of Chicago

2013-04-24

115

Heterochromatic sequences in a Drosophila whole-genome shotgun assembly  

Microsoft Academic Search

BACKGROUND: Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly. RESULTS: WGS3, an improved whole-genome shotgun

Roger A Hoskins; Christopher D Smith; Joseph W Carlson; A Bernardo Carvalho; Aaron Halpern; Joshua S Kaminker; Cameron Kennedy; Chris J Mungall; Beth A Sullivan; Granger G Sutton; Jiro C Yasuhara; Barbara T Wakimoto; Eugene W Myers; Susan E Celniker; Gerald M Rubin; Gary H Karpen

2002-01-01

116

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

117

Whole genome sequencing in support of wellness and health maintenance  

PubMed Central

Background: Whole genome sequencing is poised to revolutionize personalized medicine, providing the capacity to classify individuals into risk categories for a wide range of diseases. Here we begin to explore how whole genome sequencing (WGS) might be incorporated alongside traditional clinical evaluation as a part of preventive medicine. The present study illustrates novel approaches for integrating genotypic and clinical information for assessment of generalized health risks and to assist individuals in the promotion of wellness and maintenance of good health. Methods: Whole genome sequences and longitudinal clinical profiles are described for eight middle-aged Caucasian participants (four men and four women) from the Center for Health Discovery and Well Being (CHDWB) at Emory University in Atlanta. We report multivariate genotypic risk assessments derived from common variants reported by genome-wide association studies (GWAS), as well as clinical measures in the domains of immune, metabolic, cardiovascular, musculoskeletal, respiratory, and mental health. Results: Polygenic risk is assessed for each participant for over 100 diseases and reported relative to baseline population prevalence. Two approaches for combining clinical and genetic profiles for the purposes of health assessment are then presented. First we propose conditioning individual disease risk assessments on observed clinical status for type 2 diabetes, coronary artery disease, hypertriglyceridemia and hypertension, and obesity. An approximate 2:1 ratio of concordance between genetic prediction and observed sub-clinical disease is observed. Subsequently, we show how more holistic combination of genetic, clinical and family history data can be achieved by visualizing risk in eight sub-classes of disease. 
Having identified where their profiles are broadly concordant or discordant, an individual can focus on individual clinical results or genotypes as they develop personalized health action plans in consultation with a health partner or coach. Conclusion: The CHDWB will facilitate longitudinal evaluation of wellness-focused medical care based on comprehensive self-knowledge of medical risks. PMID:23806097

2013-01-01

118

Volatiles from nineteen recently genome sequenced actinomycetes.  

PubMed

The volatiles released by agar plate cultures of nineteen actinomycetes whose genomes were recently sequenced were collected by use of a closed-loop stripping apparatus (CLSA) and analysed by GC/MS. In total, 178 compounds from various classes were identified. The most interesting findings were the detection of the insect pheromone frontalin in Streptomyces varsoviensis, and the emission of the unusual plant metabolite 1-nitro-2-phenylethane. Its biosynthesis from phenylalanine was investigated in isotopic labelling experiments. Furthermore, the identified terpenes were correlated to the information about terpene cyclase homologs encoded in the investigated strains. The analytical data were in line with functionally characterised bacterial terpene cyclases and particularly corroborated the recently suggested function of a terpene cyclase from Streptomyces violaceusniger by the identification of a functional homolog in Streptomyces rapamycinicus. PMID:25585196

Citron, Christian A; Barra, Lena; Wink, Joachim; Dickschat, Jeroen S

2015-02-18

119

Selection to sequence: opportunities in fungal genomics  

SciTech Connect

Selection is a biological force, causing genotypic and phenotypic change over time. Whether environmental or human induced, selective pressures shape the genotypes and the phenotypes of organisms both in nature and in the laboratory. In nature, selective pressure is highly dynamic and is the sum of the environment and other organisms. In the laboratory, selection is used in genetic studies and industrial strain development programs to isolate mutants affecting biological processes of interest to researchers. Selective pressures are important considerations for fungal biology. In the laboratory, a number of fungi are used as experimental systems to study a wide range of biological processes, and in nature fungi are important pathogens of plants and animals and play key roles in carbon and nitrogen cycling. The continued development of high-throughput sequencing technologies makes it possible to characterize, at the genomic level, the effect of selective pressures both in the lab and in nature for filamentous fungi as well as other organisms.

Baker, Scott E.

2009-12-01

120

Genome sequence of the repetitive-sequence-rich Mycoplasma fermentans strain M64.  

PubMed

Mycoplasma fermentans is a microorganism commonly found in the genitourinary and respiratory tracts of healthy individuals and AIDS patients. The complete genome of the repetitive-sequence-rich M. fermentans strain M64 is reported here. Comparative genomics analysis revealed dramatic differences in genome size between this strain and the recently completely sequenced JER strain. PMID:21642450

Shu, Hung-Wei; Liu, Tze-Tze; Chan, Huang-I; Liu, Yen-Ming; Wu, Keh-Ming; Shu, Hung-Yu; Tsai, Shih-Feng; Hsiao, Kwang-Jen; Hu, Wensi S; Ng, Wailap Victor

2011-08-01

121

Genome Project Standards in a New Era of Sequencing  

SciTech Connect

For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1); hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. 
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

2009-06-01

122

Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.  

PubMed

It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. 
The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this information. PMID:16773396

Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

2006-07-01

123

Rapid Genome Evolution Revealed by Comparative Sequence Analysis of Orthologous Regions from Four Triticeae Genomes  

PubMed Central

Bread wheat (Triticum aestivum) is an allohexaploid species, consisting of three subgenomes (A, B, and D). To study the molecular evolution of these closely related genomes, we compared the sequence of a 307-kb physical contig covering the high molecular weight (HMW)-glutenin locus from the A genome of durum wheat (Triticum turgidum, AABB) with the orthologous regions from the B genome of the same wheat and the D genome of the diploid wheat Aegilops tauschii (Anderson et al., 2003; Kong et al., 2004). Although gene colinearity appears to be retained, four out of six genes including the two paralogous HMW-glutenin genes are disrupted in the orthologous region of the A genome. Mechanisms involved in gene disruption in the A genome include retroelement insertions, sequence deletions, and mutations causing in-frame stop codons in the coding sequences. Comparative sequence analysis also revealed that sequences in the colinear intergenic regions of these different genomes were generally not conserved. The rapid genome evolution in these regions is attributable mainly to the large number of retrotransposon insertions that occurred after the divergence of the three wheat genomes. Our comparative studies indicate that the B genome diverged prior to the separation of the A and D genomes. Furthermore, sequence comparison of two distinct types of allelic variations at the HMW-glutenin loci in the A genomes of different hexaploid wheat cultivars with the A genome locus of durum wheat indicates that hexaploid wheat may have more than one tetraploid ancestor. PMID:15122014

Gu, Yong Qiang; Coleman-Derr, Devin; Kong, Xiuying; Anderson, Olin D.

2004-01-01

124

Finishing The Euchromatic Sequence Of The Human Genome  

SciTech Connect

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

2004-09-07

125

Finished Genome Sequence of Collimonas arenae Cal35.  

PubMed

We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

Wu, Je-Jia; de Jager, Victor C L; Deng, Wen-Ling; Leveau, Johan H J

2015-01-01

126

Finished Genome Sequence of Collimonas arenae Cal35  

PubMed Central

We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

Wu, Je-Jia; de Jager, Victor C. L.; Deng, Wen-Ling

2015-01-01

127

Complete Genome Sequence of the Mesoplasma florum W37 Strain  

PubMed Central

Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs. PMID:24285658

Baby, Vincent; Matteau, Dominick; Knight, Thomas F.

2013-01-01

128

Research ethics and the challenge of whole-genome sequencing  

Microsoft Academic Search

The recent completion of the first two individual whole-genome sequences is a research milestone. As personal genome research advances, investigators and international research bodies must ensure ethical research conduct. We identify three major ethical considerations that have been implicated in whole-genome research: the return of research results to participants; the obligations, if any, that are owed to participants' relatives; and

Amy L. McGuire; Mildred K. Cho; Timothy Caulfield

2007-01-01

129

Accurate whole human genome sequencing using reversible terminator chemistry  

Microsoft Academic Search

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

2008-01-01

130

Genome Wide Characterization of Simple Sequence Repeats in Cucumber  

Technology Transfer Automated Retrieval System (TEKTRAN)

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
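
As a rough illustration of what computationally exploring SSRs in assembled scaffolds can involve, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The function name `find_ssrs`, the motif-length range, and the minimum repeat count are assumptions made for illustration; the record does not state which tool or thresholds were used for Gy14.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Hypothetical helper: thresholds are illustrative, not taken from the record.
    Caveat: a long mononucleotide run (e.g. AAAAAAAA) is reported with a
    dinucleotide unit (AA), because min_unit defaults to 2.
    """
    # The lazy quantifier prefers the shortest repeating unit; \1 requires the
    # same unit to follow at least (min_repeats - 1) more times.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    # Toy scaffold fragment containing one (AT)5 microsatellite.
    print(find_ssrs("GGATATATATATCC"))
```

Only perfect repeats are reported here; imperfect or compound microsatellites would need a more tolerant search.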

131

Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis  

PubMed Central

Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.

2013-01-01

132

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii  

PubMed Central

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

133

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.  

PubMed

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

134

GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT  

E-print Network

GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR OFF-CAMPUS USERS). THIS USER AGREEMENT MUST BE RECEIVED BEFORE ANY BILLABLE SERVICES WILL BE PROVIDED. Users of Ion Torrent & _____ Initials of PI GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR OFF

Wurtele, Eve Syrkin

135

GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT  

E-print Network

GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR ON-CAMPUS USERS). THIS USER AGREEMENT MUST BE RECEIVED BEFORE ANY BILLABLE SERVICES WILL BE PROVIDED. Users of Ion Torrent & _____ Initials of PI GENOMIC TECHNOLOGIES FACILITY: Ion Torrent Sequencing USER/BILLING AGREEMENT FOR ON

Wurtele, Eve Syrkin

136

Draft Genome Sequence of Raoultella planticola, Isolated from River Water  

PubMed Central

We isolated Raoultella planticola from a river water sample, which was phenotypically indistinguishable from Escherichia coli on MI agar. The genome sequence of R. planticola was determined to gain information about its metabolic functions contributing to its false positive appearance of E. coli on MI agar. We report the first whole genome sequence of Raoultella planticola. PMID:25323725

Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

2014-01-01

137

Initial sequencing and analysis of the human genome  

Microsoft Academic Search

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. 
Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. 
Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

2001-01-01

138

Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.  

PubMed

Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

2013-01-01

139

Complete genome sequence of chinese strain of ‘Candidatus Liberibacter asiaticus’  

Technology Transfer Automated Retrieval System (TEKTRAN)

The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...

140

Draft Genome Sequence of Bordetella trematum Strain HR18.  

PubMed

The genus Bordetella is reportedly a human or animal pathogen and environmental microbe. We report the draft genome sequence of Bordetella trematum strain HR18, which was isolated from the rumen of Korean native cattle (Hanwoo; Bos taurus coreanae). It is the first genome sequence of a Bordetella sp. isolated from the rumen of cattle. PMID:25573930

Chang, Dong-Ho; Jin, Tae-Eun; Rhee, Moon-Soo; Jeong, Haeyoung; Kim, Seil; Kim, Byoung-Chan

2015-01-01

141

Draft Genome Sequence of Tolypothrix boutellei Strain VB521301  

PubMed Central

We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes.

Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

2015-01-01

142

Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages  

PubMed Central

Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

Sheflo, Michael A.; Gardner, Adam V.; Merrill, Bryan D.; Fisher, Joshua N. B.; Lunt, Bryce L.; Breakwell, Donald P.; Grose, Julianne H.

2013-01-01

143

Sequencing the chimpanzee genome: insights into human evolution and disease  

Microsoft Academic Search

Large-scale sequencing of the chimpanzee genome is now imminent. Beyond the inherent fascination of comparing the sequence of the human genome with that of our closest living relative, this project is likely to yield tangible scientific benefits in two areas. First, the discovery of functionally important mutations that are specific to the human lineage offers a new path towards medical

Ajit Varki; Maynard V. Olson

2003-01-01

144

Draft Genome Sequence of Bordetella trematum Strain HR18  

PubMed Central

The genus Bordetella is reportedly a human or animal pathogen and environmental microbe. We report the draft genome sequence of Bordetella trematum strain HR18, which was isolated from the rumen of Korean native cattle (Hanwoo; Bos taurus coreanae). It is the first genome sequence of a Bordetella sp. isolated from the rumen of cattle. PMID:25573930

Chang, Dong-Ho; Jin, Tae-Eun; Rhee, Moon-Soo; Jeong, Haeyoung; Kim, Seil

2015-01-01

145

Genome Sequence of the Nonpathogenic Pseudomonas aeruginosa Strain ATCC 15442  

PubMed Central

Pseudomonas aeruginosa ATCC 15442 is an environmental strain of the Pseudomonas genus. Here, we present a 6.77-Mb assembly of its genome sequence. Besides giving insights into characteristics associated with the pathogenicity of P. aeruginosa, such as virulence, drug resistance, and biofilm formation, the genome sequence may provide some information related to biotechnological utilization of the strain. PMID:24786961

Wang, Yujiao; Li, Chao; Ma, Cuiqing; Xu, Ping

2014-01-01

146

Implications of the Plastid Genome Sequence of Typha (Typhaceae, Poales) for Understanding Genome Evolution in Poaceae  

Microsoft Academic Search

Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the

Mary M. Guisinger; Timothy W. Chumley; Jennifer V. Kuehl; Jeffrey L. Boore; Robert K. Jansen

2010-01-01

147

Genome sequence of the human malaria parasite Plasmodium falciparum  

Microsoft Academic Search

The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

2002-01-01

148

Characterization of the complete genome sequence of pike fry rhabdovirus  

Microsoft Academic Search

The complete genome sequence of pike fry rhabdovirus (PFRV), consisting of 11,097 nucleotides, was determined. The genome contains five genes, encoding the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G), and RNA-dependent RNA polymerase (L) protein in the order 3′-N-P-M-G-L-5′. The 3′ leader and 5′ trailer sequences in the PFRV genome show inverse complementarity. The PFRV proteins share the highest

Hong-Lian Chen; Hong Liu; Zong-Xiao Liu; Jun-Qiang He; Long-Ying Gao; Xiu-Jie Shi; Yu-Lin Jiang

2009-01-01

149

Draft sequences of the radish (Raphanus sativus L.) genome.  

PubMed

Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

2014-10-01

150

Draft Sequences of the Radish (Raphanus sativus L.) Genome  

PubMed Central

Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

2014-01-01

151

THE RICE GENOME: The Cereal of the World's Poor Takes Center Stage  

NSDL National Science Digital Library

Access to the article is free; however, registration and sign-in are required. The milestone publication of not one, but two, draft genome sequences of rice (Oryza sativa) brought the cereal crop of the world's poor to center stage. In their Perspectives, Cantrell and Reeves discuss the potential impacts of these sequences for humankind from the standpoints of food security and combating malnutrition.

Ronald P. Cantrell (International Rice Research Institute (IRRI))

2002-04-05

152

End-sequence profiling: Sequence-based analysis of aberrant genomes  

PubMed Central

Genome rearrangements are important in evolution, cancer, and other diseases. Precise mapping of the rearrangements is essential for identification of the involved genes, and many techniques have been developed for this purpose. We show here that end-sequence profiling (ESP) is particularly well suited to this purpose. ESP is accomplished by constructing a bacterial artificial chromosome (BAC) library from a test genome, measuring BAC end sequences, and mapping end-sequence pairs onto the normal genome sequence. Plots of BAC end-sequence density identify copy number abnormalities at high resolution. BACs spanning structural aberrations have end pairs that map abnormally far apart on the normal genome sequence. These pairs can then be sequenced to determine the involved genes and breakpoint sequences. ESP analysis of the breast cancer cell line MCF-7 demonstrated its utility for analysis of complex genomes. End sequencing of ~8,000 clones (0.37-fold haploid genome clonal coverage) produced a comprehensive genome copy number map of the MCF-7 genome at better than 300-kb resolution and identified 381 genome breakpoints, a subset of which was verified by fluorescence in situ hybridization mapping and sequencing. PMID:12788976

Volik, Stanislav; Zhao, Shaying; Chin, Koei; Brebner, John H.; Herndon, David R.; Tao, Quanzhou; Kowbel, David; Huang, Guiqing; Lapuk, Anna; Kuo, Wen-Lin; Magrane, Gregg; de Jong, Pieter; Gray, Joe W.; Collins, Colin

2003-01-01
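
The core test described in the end-sequence profiling record above (BAC end pairs that map abnormally far apart on the normal genome) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `flag_discordant_pairs`, the tuple layout, and the 250-kb `max_span` threshold are all assumptions chosen for the example.

```python
def flag_discordant_pairs(pairs, max_span=250_000):
    """Flag BAC clones whose two end sequences map implausibly far apart.

    pairs: iterable of (clone_id, chrom1, pos1, chrom2, pos2), where each
    position is the mapped coordinate of one end on the reference genome.
    max_span: hypothetical upper bound on a normal BAC insert, in bp.
    """
    flagged = []
    for clone_id, chrom1, pos1, chrom2, pos2 in pairs:
        if chrom1 != chrom2:
            # Ends on different chromosomes suggest a translocation-like event.
            flagged.append((clone_id, "interchromosomal"))
        elif abs(pos2 - pos1) > max_span:
            # Same chromosome, but farther apart than any normal insert.
            flagged.append((clone_id, "span"))
    return flagged


if __name__ == "__main__":
    # Toy coordinates, invented for the example.
    example = [
        ("bac1", "chr1", 1_000, "chr1", 151_000),   # normal insert size
        ("bac2", "chr1", 1_000, "chr1", 900_000),   # ends too far apart
        ("bac3", "chr1", 5_000, "chr20", 7_000),    # ends on two chromosomes
    ]
    print(flag_discordant_pairs(example))
```

A fuller analysis would also check end orientation and cluster nearby discordant clones before calling a breakpoint; both are omitted here for brevity.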

153

Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes  

SciTech Connect

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

2005-08-26

154

Generation of Physical Map Contig-Specific Sequences Useful for Whole Genome Sequence Scaffolding  

PubMed Central

Along with the rapid advances of the nextgen sequencing technologies, more and more species are added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. Two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, in which the cost was dramatically reduced. A total of 94,111,841 100-bp reads and 315,277 assembled contigs are identified containing physical map contig-specific tags. The physical map contig-specific sequences along with the currently available BAC end sequences were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link physical map, genetic linkage map and draft whole genome sequences, consequently have the capability to improve the whole genome sequences assembly and scaffolding, and improve the genome-wide comparative analysis as well. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge. PMID:24205335

Jiang, Yanliang; Ninwichian, Parichart; Liu, Shikai; Zhang, Jiaren; Kucuktas, Huseyin; Sun, Fanyue; Kaltenboeck, Ludmilla; Sun, Luyang; Bao, Lisui; Liu, Zhanjiang

2013-01-01

155

Pairwise end sequencing: a unified approach to genomic mapping and sequencing  

Microsoft Academic Search

Strategies for large-scale genomic DNA sequencing currently require physical mapping, followed by detailed mapping, and finally sequencing. The level of mapping detail determines the amount of effort, or sequence redundancy, required to finish a project. Current strategies attempt to find a balance between mapping and sequencing efforts. One such approach is to employ strategies that use sequence data to build

Jared C. Roach; Cecilie Boysen; Kai Wang; Leroy Hood

1995-01-01

156

Sequencing genomes from single cells by polymerase cloning.  

PubMed

Genome sequencing currently requires DNA from pools of numerous nearly identical cells (clones), leaving the genome sequences of many difficult-to-culture microorganisms unattainable. We report a sequencing strategy that eliminates culturing of microorganisms by using real-time isothermal amplification to form polymerase clones (plones) from the DNA of single cells. Two Escherichia coli plones, analyzed by Affymetrix chip hybridization, demonstrate that plonal amplification is specific and the bias is randomly distributed. Whole-genome shotgun sequencing of Prochlorococcus MIT9312 plones showed 62% coverage of the genome from one plone at a sequencing depth of 3.5x, and 66% coverage from a second plone at a depth of 4.7x. Genomic regions not revealed in the initial round of sequencing are recovered by sequencing PCR amplicons derived from plonal DNA. The mutation rate in single-cell amplification is <2 x 10(-5), better than that of current genome sequencing standards. Polymerase cloning should provide a critical tool for systematic characterization of genome diversity in the biosphere. PMID:16732271

Zhang, Kun; Martiny, Adam C; Reppas, Nikos B; Barry, Kerrie W; Malek, Joel; Chisholm, Sallie W; Church, George M

2006-06-01
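
The coverage figures in the abstract above (62% at 3.5x, 66% at 4.7x) can be put in context with a small calculation. The sketch below assumes the idealized Lander-Waterman model, in which reads land uniformly at random and the expected covered fraction at depth c is 1 - e^(-c). That model and the function name `expected_coverage` are assumptions made for illustration; they are not taken from the record.

```python
import math

def expected_coverage(depth: float) -> float:
    """Expected fraction of a genome covered at a given sequencing depth,
    assuming uniformly random read placement (Lander-Waterman model)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - math.exp(-depth)

# Observed values are those reported in the abstract; the expected values
# come from the idealized model, not from the study itself.
for depth, observed in [(3.5, 0.62), (4.7, 0.66)]:
    ideal = expected_coverage(depth)
    print(f"depth {depth}x: observed {observed:.0%}, idealized model {ideal:.1%}")
```

Under this assumption the model predicts far more coverage than was observed at both depths, which is consistent with the abstract's point that the remaining regions had to be recovered by sequencing PCR amplicons.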

157

Registered report: Melanoma genome sequencing reveals frequent PREX2 mutations  

PubMed Central

The Reproducibility Project: Cancer Biology seeks to address growing concerns about reproducibility in scientific research by conducting replications of 50 papers in the field of cancer biology published between 2010 and 2012. This Registered Report describes the proposed replication plan of key experiments from ‘Melanoma genome sequencing reveals frequent PREX2 mutations’ by Berger and colleagues, published in Nature in 2012 (Berger et al., 2012). The key experiments that will be replicated are those reported in Figure 3B and Supplementary Figure S6. In these experiments, Berger and colleagues show that somatic PREX2 mutations identified through whole-genome sequencing of human melanoma can contribute to enhanced lethality of tumor xenografts in nude mice (Figure 3B, S6B, and S6C; Berger et al., 2012). The Reproducibility Project: Cancer Biology is a collaboration between the Center for Open Science and Science Exchange, and the results of the replications will be published by eLife. DOI: http://dx.doi.org/10.7554/eLife.04180.001 PMID:25490935

Chroscinski, Denise; Sampey, Darryl; Hewitt, Alex

2014-01-01

158

Complete Genome Sequence of Mycoplasma haemofelis, a Hemotropic Mycoplasma  

PubMed Central

Here, we present the genome sequence of Mycoplasma haemofelis strain Langford 1, representing the first hemotropic mycoplasma (hemoplasma) species to be completely sequenced and annotated. Originally isolated from a cat with hemolytic anemia, this strain induces severe hemolytic anemia when inoculated into specific-pathogen-free-derived cats. The genome sequence has provided insights into the biology of this uncultivatable hemoplasma and has identified potential molecular mechanisms underlying its pathogenicity. PMID:21317334

Barker, Emily N.; Helps, Chris R.; Peters, Iain R.; Darby, Alistair C.; Radford, Alan D.; Tasker, Séverine

2011-01-01

159

Sequencing genomes from single cells by polymerase cloning  

Microsoft Academic Search

Genome sequencing currently requires DNA from pools of numerous nearly identical cells (clones), leaving the genome sequences of many difficult-to-culture microorganisms unattainable. We report a sequencing strategy that eliminates culturing of microorganisms by using real-time isothermal amplification to form polymerase clones (plones) from the DNA of single cells. Two Escherichia coli plones, analyzed by Affymetrix chip hybridization, demonstrate that plonal

Adam C Martiny; Nikos B Reppas; Kerrie W Barry; Joel Malek; Sallie W Chisholm; Kun Zhang; George M Church

2006-01-01

160

Center for Cell and Genome Sciences, Crocker Science Building  

E-print Network

chemistry Center for Cell and Genome Sciences genetic engineering building artificial life brain engineering photodiodes the Cell engineering the genome, imaging proteins Invitrogen at the intersection of chemistry

Tipple, Brett

161

Identification of Optimum Sequencing Depth Especially for De Novo Genome Assembly of Small Genomes Using Next Generation Sequencing Data  

PubMed Central

Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources. PMID:23593174

Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

2013-01-01
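
As a rough illustration of what the 50X and 100X depths in the abstract above mean for experiment design, the sketch below converts a target depth into a read count using depth = (number of reads x read length) / genome size. The genome sizes are those quoted in the abstract; the 100-bp read length and the function name `reads_for_depth` are assumptions made for illustration, since the record does not state the read length used.

```python
def reads_for_depth(genome_size_bp: int, depth: int, read_length_bp: int) -> int:
    """Smallest number of reads whose total bases reach the target depth.

    Uses integer ceiling division so the result is never one read short.
    """
    if genome_size_bp <= 0 or depth <= 0 or read_length_bp <= 0:
        raise ValueError("all arguments must be positive")
    total_bases = genome_size_bp * depth
    return (total_bases + read_length_bp - 1) // read_length_bp

# Genome sizes as quoted in the abstract (in base pairs).
GENOMES = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}
READ_LENGTH_BP = 100  # assumed read length, not given in the record

for name, size in GENOMES.items():
    for depth in (50, 100):
        n = reads_for_depth(size, depth, READ_LENGTH_BP)
        print(f"{name}: {depth}X needs about {n:,} reads of {READ_LENGTH_BP} bp")
```

The same arithmetic can be run in reverse to estimate how many samples of a given genome size fit on one sequencing run when multiplexing.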

162

Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.  

PubMed

Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes. PMID:25388107

Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

2015-01-01

163

Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo electric organ  

E-print Network

Evolution and comparative genomics of subcellular specializations: EST sequencing of Torpedo discovery Open reading frame (ORF) Uncharacterized open reading frames (ORFs) in human genomic sequence Elsevier B.V. All rights reserved. 1. Introduction The availability of complete genomic sequences

Vertes, Akos

164

Reference genome sequence of the model plant Setaria.  

PubMed

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum). PMID:22580951

Bennetzen, Jeffrey L; Schmutz, Jeremy; Wang, Hao; Percifield, Ryan; Hawkins, Jennifer; Pontaroli, Ana C; Estep, Matt; Feng, Liang; Vaughn, Justin N; Grimwood, Jane; Jenkins, Jerry; Barry, Kerrie; Lindquist, Erika; Hellsten, Uffe; Deshpande, Shweta; Wang, Xuewen; Wu, Xiaomei; Mitros, Therese; Triplett, Jimmy; Yang, Xiaohan; Ye, Chu-Yu; Mauro-Herrera, Margarita; Wang, Lin; Li, Pinghua; Sharma, Manoj; Sharma, Rita; Ronald, Pamela C; Panaud, Olivier; Kellogg, Elizabeth A; Brutnell, Thomas P; Doust, Andrew N; Tuskan, Gerald A; Rokhsar, Daniel; Devos, Katrien M

2012-06-01

165

Reference genome sequence of the model plant Setaria  

SciTech Connect

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

Bennetzen, Jeffrey L [ORNL; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Tuskan, Gerald A [ORNL

2012-01-01

166

Reference genome sequence of the model plant Setaria  

SciTech Connect

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

Bennetzen, Jeffrey L [ORNL; Schmutz, Jeremy [Hudson Alpha Institute of Biotechnology; Wang, Hao [University of Georgia, Athens, GA; Percifield, Ryan [University of Georgia, Athens, GA; Hawkins, Jennifer [University of Georgia, Athens, GA; Pontaroli, Ana C. [University of Georgia, Athens, GA; Estep, Matt [University of Georgia, Athens, GA; Feng, Liang [University of Georgia, Athens, GA; Vaughn, Justin N [ORNL; Grimwood, Jane [Hudson Alpha Institute of Biotechnology; Jenkins, Jerry [Hudson Alpha Institute of Biotechnology; Barry, Kerrie [U.S. Department of Energy, Joint Genome Institute; Lindquist, Erika [U.S. Department of Energy, Joint Genome Institute; Hellsten, Uffe [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Wang, Xuewen [University of Georgia, Athens, GA; Wu, Xiaomei [University of Georgia, Athens, GA; Mitros, Therese [University of California, Berkeley; Triplett, Jimmy [University of Missouri, St. Louis; Yang, Xiaohan [ORNL; Ye, Chuyu [ORNL; Mauro-Herrera, Margarita [Oklahoma State University; Wang, Lin [Cornell University; Li, Pinghua [Cornell University; Sharma, Manoj [University of California, Davis; Sharma, Rita [University of California, Davis; Ronald, Pamela [University of California, Davis; Panaud, Olivier [Universite de Perpignan, Perpignan, France; Kellogg, Elizabeth A. [University of Missouri, St. Louis; Brutnell, Thomas P. [Cornell University; Doust, Andrew N. [Oklahoma State University; Tuskan, Gerald A [ORNL; Rokhsar, Daniel [U.S. Department of Energy, Joint Genome Institute; Devos, Katrien M [ORNL

2012-01-01

167

Complete genome sequence of Thermomonospora curvata type strain (B9)  

SciTech Connect

Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Chertkov, Olga [Los Alamos National Laboratory (LANL); Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Nolan, Matt [Joint Genome Institute, Walnut Creek, California; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California; Lucas, Susan [Joint Genome Institute, Walnut Creek, California; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California; Tice, Hope [Joint Genome Institute, Walnut Creek, California; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [Joint Genome Institute, Walnut Creek, California; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [Joint Genome Institute, Walnut Creek, California; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Brettin, Thomas S [ORNL; Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [Joint Genome Institute, Walnut Creek, California; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California; Bristow, James [Joint Genome Institute, Walnut Creek, California; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California

2011-01-01

168

Complete genome sequence of Gordonia bronchialis type strain (3410T)  

PubMed Central

Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304674

Ivanova, Natalia; Sikorski, Johannes; Jando, Marlen; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Del Rio, Tijana Glavina; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Saunders, Elizabeth; Han, Cliff; Detter, John C.; Brettin, Thomas; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

2010-01-01

169

Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)  

SciTech Connect

Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2,038 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

2009-05-20

170

Recurrent insertion and duplication generate networks of transposable element sequences in the Drosophila melanogaster genome  

Microsoft Academic Search

BACKGROUND: The recent availability of genome sequences has provided unparalleled insights into the broad-scale patterns of transposable element (TE) sequences in eukaryotic genomes. Nevertheless, the difficulties that TEs pose for genome assembly and annotation have prevented detailed, quantitative inferences about the contribution of TEs to genomes sequences. RESULTS: Using a high-resolution annotation of TEs in Release 4 genome sequence, we

Casey M Bergman; Hadi Quesneville; Dominique Anxolabéhère; Michael Ashburner

2006-01-01

171

Complete genome sequence of Thioalkalivibrio sp. K90mix  

PubMed Central

Thioalkalivibrio sp. K90mix is an obligately chemolithoautotrophic, natronophilic sulfur-oxidizing bacterium (SOxB) belonging to the family Ectothiorhodospiraceae within the Gammaproteobacteria. The strain was isolated from a mixture of sediment samples obtained from different soda lakes located in the Kulunda Steppe (Altai, Russia) based on its extreme potassium carbonate tolerance as an enrichment method. Here we report the complete genome sequence of strain K90mix and its annotation. The genome was sequenced within the Joint Genome Institute Community Sequencing Program, because of its relevance to the sustainable removal of sulfide from wastewater and gas streams. PMID:22675584

Muyzer, Gerard; Sorokin, Dimitry Y.; Mavromatis, Konstantinos; Lapidus, Alla; Foster, Brian; Sun, Hui; Ivanova, Natalia; Pati, Amrita; D'haeseleer, Patrik; Woyke, Tanja; Kyrpides, Nikos C.

2011-01-01

172

Complete genome sequence of Staphylothermus hellenicus P8T  

SciTech Connect

Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Davenport, Karen W. [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

2011-01-01

173

Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens  

PubMed Central

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

2013-01-01

174

Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.  

PubMed

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

2013-01-01

175

De Novo Whole-Genome Sequence and Genome Annotation of Lichtheimia ramosa  

PubMed Central

We report the annotated draft genome sequence of Lichtheimia ramosa (JMRC FSU:6197). It has been reported to be a causative organism of mucormycosis, a rare but rapidly progressive infection in immunocompromised humans. The functionally annotated genomic sequence consists of 74 scaffolds with a total number of 11,510 genes. PMID:25212617

Linde, Jörg; Schwartze, Volker; Binder, Ulrike; Lass-Flörl, Cornelia

2014-01-01

176

Sequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations  

PubMed Central

Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross. PMID:24531727

Hahn, Matthew W.; Zhang, Simo V.; Moyle, Leonie C.

2014-01-01

177

Assembly of large genomes using second-generation sequencing  

PubMed Central

Second-generation sequencing technology can now be used to sequence an entire human genome in a matter of days and at low cost. Sequence read lengths, initially very short, have rapidly increased since the technology first appeared, and we now are seeing a growing number of efforts to sequence large genomes de novo from these short reads. In this Perspective, we describe the issues associated with short-read assembly, the different types of data produced by second-gen sequencers, and the latest assembly algorithms designed for these data. We also review the genomes that have been assembled recently from short reads and make recommendations for sequencing strategies that will yield a high-quality assembly. PMID:20508146

Schatz, Michael C.; Delcher, Arthur L.; Salzberg, Steven L.

2010-01-01

178

Genome sequencing and analysis of the model grass Brachypodium distachyon  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

Yang, Xiaohan [ORNL]; Kalluri, Udaya C [ORNL]; Tuskan, Gerald A [ORNL]

2010-01-01

179

Genome sequencing and analysis of the model grass Brachypodium distachyon.  

PubMed

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops. PMID:20148030

2010-02-11

180

BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...

181

Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species  

PubMed Central

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

2014-01-01
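
The abstract above reports an N50 length for the FANhybrid_r1.2 reference. As a reminder of what that statistic measures, here is a minimal sketch using the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Common definition (assumed here): sort lengths from longest to shortest
    and return the length at which the running total first reaches half of
    the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up contig lengths (not data from the study).
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
```

An assembly made of many short sequences has a low N50 even when its total length is large, which is why the statistic is usually quoted alongside total assembly length.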

182

Complete Genome Sequence of Pseudomonas denitrificans ATCC 13867  

PubMed Central

Pseudomonas denitrificans ATCC 13867, a Gram-negative facultative anaerobic bacterium, is known to produce vitamin B12 under aerobic conditions. This paper reports the annotated whole-genome sequence of the circular chromosome of this organism. PMID:23723394

Ainala, Satish Kumar; Somasundar, Ashok

2013-01-01

183

Draft Genome Sequence of Coprobacter fastidiosus NSB1T  

PubMed Central

Coprobacter fastidiosus is a Gram-negative obligate anaerobic bacterium belonging to the phylum Bacteroidetes. In this work, we report the draft genome sequence of C. fastidiosus strain NSB1T isolated from human infant feces. PMID:24604645

Chaplin, A. V.; Efimov, B. A.; Khokhlova, E. V.; Kafarskaia, L. I.; Tupikin, A. E.; Kabilov, M. R.

2014-01-01

184

Bacterial epidemiology and biology - lessons from genome sequencing  

PubMed Central

Next-generation sequencing has ushered in a new era of microbial genomics, enabling the detailed historical and geographical tracing of bacteria. This is helping to shape our understanding of bacterial evolution. PMID:22027015

2011-01-01

185

Initial genome sequencing and analysis of multiple myeloma  

E-print Network

Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

Lander, Eric S.

186

Complete Genome Sequence of Mycobacterium phlei Type Strain RIVM601174  

PubMed Central

Mycobacterium phlei is a rapidly growing nontuberculous Mycobacterium species that is typically nonpathogenic, with few reported cases of human disease. Here we report the whole genome sequence of M. phlei type strain RIVM601174. PMID:22628511

Rashid, Mamoon; Adroub, Sabir A.; Arnoux, Marc; Ali, Shahjahan; van Soolingen, Dick; Bitter, Wilbert

2012-01-01

187

Complete Genome Sequences of Six Strains of the Genus Methylobacterium  

PubMed Central

The complete and assembled genome sequences were determined for six strains of the alphaproteobacterial genus Methylobacterium, chosen for their key adaptations to different plant-associated niches and environmental constraints. PMID:22887658

Bringel, Françoise; Chistoserdova, Ludmila; Moulin, Lionel; Farhan Ul Haque, Muhammad; Fleischman, Darrell E.; Gruffaz, Christelle; Jourand, Philippe; Knief, Claudia; Lee, Ming-Chun; Muller, Emilie E. L.; Nadalig, Thierry; Peyraud, Rémi; Roselli, Sandro; Russ, Lina; Goodwin, Lynne A.; Ivanova, Natalia; Kyrpides, Nikos; Lajus, Aurélie; Land, Miriam L.; Médigue, Claudine; Mikhailova, Natalia; Nolan, Matt; Woyke, Tanja; Stolyar, Sergey; Vorholt, Julia A.

2012-01-01

188

Complete genome sequence of Allochromatium vinosum DSM 180T  

PubMed Central

Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium currently contains five species. All members were isolated from freshwater, brackish water or marine habitats and are predominantly obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program. PMID:22675582

Weissgerber, Thomas; Zigann, Renate; Bruce, David; Chang, Yun-juan; Detter, John C.; Han, Cliff; Hauser, Loren; Jeffries, Cynthia D.; Land, Miriam; Munk, A. Christine; Tapia, Roxanne; Dahl, Christiane

2011-01-01

189

Complete Genome Sequence of Rahnella aquatilis CIP 78.65  

SciTech Connect

Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

Martinez, Robert J [University of Alabama, Tuscaloosa]; Bruce, David [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Han, James [U.S. Department of Energy, Joint Genome Institute]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Held, Brittany [Los Alamos National Laboratory (LANL)]; Land, Miriam L [ORNL]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Sobecky, Patricia A. [University of Alabama, Tuscaloosa]

2012-01-01

190

Melanoma genome sequencing reveals frequent PREX2 mutations  

E-print Network

Melanoma is notable for its metastatic propensity, lethality in the advanced setting and association with ultraviolet exposure early in life. To obtain a comprehensive genomic view of melanoma in humans, we sequenced the ...

Lander, Eric S.

191

Science Originals: Sequencing Cancer Genomes: Targeted Cancer Therapies  

NSDL National Science Digital Library

Applying DNA sequencing to cancer genomes is providing insights that have allowed researchers to turn some cancers into chronic diseases rather than deadly ones. Still, the ultimate goal is to kill the cancer.

Robert Frederick (AAAS;)

2011-03-25

192

Complete genome sequence of the myxobacterium Sorangium cellulosum  

Microsoft Academic Search

The genus Sorangium synthesizes approximately half of the secondary metabolites isolated from myxobacteria, including the anti-cancer metabolite epothilone. We report the complete genome sequence of the model Sorangium strain S. cellulosum So ce56, which produces several natural products and has morphological and physiological properties typical of the genus. The circular genome, comprising 13,033,779 base pairs, is the largest bacterial genome

Susanne Schneiker; Olena Perlova; Olaf Kaiser; Klaus Gerth; Aysel Alici; Matthias O Altmeyer; Daniela Bartels; Thomas Bekel; Stefan Beyer; Edna Bode; Helge B Bode; Christoph J Bolten; Jomuna V Choudhuri; Sabrina Doss; Yasser A Elnakady; Bettina Frank; Lars Gaigalat; Alexander Goesmann; Carolin Groeger; Frank Gross; Lars Jelsbak; Lotte Jelsbak; Jörn Kalinowski; Carsten Kegler; Tina Knauber; Sebastian Konietzny; Maren Kopp; Lutz Krause; Daniel Krug; Bukhard Linke; Taifo Mahmud; Rosa Martinez-Arias; Alice C McHardy; Michelle Merai; Folker Meyer; Sascha Mormann; Jose Muñoz-Dorado; Juana Perez; Silke Pradella; Shwan Rachid; Günter Raddatz; Frank Rosenau; Christian Rückert; Florenz Sasse; Maren Scharfe; Stephan C Schuster; Garret Suen; Anke Treuner-Lange; Gregory J Velicer; Frank-Jörg Vorhölter; Kira J Weissman; Roy D Welch; Silke C Wenzel; David E Whitworth; Susanne Wilhelm; Christoph Wittmann; Helmut Blöcker; Alfred Pühler; Rolf Müller

2007-01-01

193

Complete genome sequence of Treponema pallidum strain DAL-1  

PubMed Central

Treponema pallidum strain DAL-1 is a human uncultivable pathogen causing the sexually transmitted disease syphilis. Strain DAL-1 was isolated from the amniotic fluid of a pregnant woman in the secondary stage of syphilis. Here we describe the 1,139,971 bp long genome of T. pallidum strain DAL-1 which was sequenced using two independent sequencing methods (454 pyrosequencing and Illumina). In rabbits, strain DAL-1 replicated better than the T. pallidum strain Nichols. The comparison of the complete DAL-1 genome sequence with the Nichols sequence revealed a list of genetic differences that are potentially responsible for the increased rabbit virulence of the DAL-1 strain. PMID:23449808

Zobaníková, Marie; Mikolka, Pavol; Čejková, Darina; Pospíšilová, Petra; Chen, Lei; Strouhal, Michal; Qin, Xiang; Weinstock, George M.; Šmajs, David

2012-01-01

194

The complete sequence of the mitochondrial genome of Saccharomyces cerevisiae  

Microsoft Academic Search

The currently available yeast mitochondrial DNA (mtDNA) sequence is incomplete, contains many errors and is derived from several polymorphic strains. Here, we report that the mtDNA sequence of the strain used for nuclear genome sequencing assembles into a circular map of 85,779 bp which includes 10 kb of new sequence. We give a list of seven small hypothetical open reading

Françoise Foury; Tiziana Roganti; Nicolas Lecrenier; Bénédicte Purnelle

1998-01-01

195

Genome sequence of the cultivated cotton Gossypium arboreum  

Technology Transfer Automated Retrieval System (TEKTRAN)

Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

196

Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria  

PubMed Central

Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp. PMID:25125646

Neave, Matthew J.; Michell, Craig T.

2014-01-01

197

Genome Sequence of the Asiatic Species Borrelia persica  

PubMed Central

We report the complete genome sequence of Borrelia persica, the causative agent of tick-borne relapsing fever borreliosis on the Asian continent. Its genome of 1,784,979 bp contains 1,850 open reading frames, three ribosomal RNAs, and 32 tRNAs. One clustered regularly interspaced short palindromic repeat (CRISPR) was detected. PMID:24407639

Elbir, Haitham; Larsson, Pär; Normark, Johan; Upreti, Mukunda; Korenberg, Edward; Larsson, Christer

2014-01-01

198

Genome Sequence of Chinese Porcine Parvovirus Strain PPV2010  

PubMed Central

Porcine parvovirus (PPV) isolate PPV2010 has recently emerged in China. Herein, we analyze the complete genome sequence of PPV2010. Our results indicate that the genome of PPV2010 bears mixed characteristics of virulent PPV and vaccine strains. Importantly, PPV2010 has the potential to be a naturally attenuated candidate vaccine strain. PMID:22282333

Cui, Jin; Wang, Xin; Ren, Yudong; Cui, Shangjin; Li, Guangxing

2012-01-01

199

RESEARCH Open Access Genomic and small RNA sequencing of  

E-print Network

of sorghum as a reference genome sequence for Andropogoneae grasses Kankshita Swaminathan1,2 , Magdy origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome. Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane

Green, Pamela

200

A snapshot of the emerging tomato genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

201

Trypanosoma cruzi Clone Dm28c Draft Genome Sequence  

PubMed Central

Trypanosoma cruzi affects millions of people worldwide. Clinical variability of Chagas disease can be due to the genetic variability of this parasite, requiring further genome studies. Here we report the genome sequence of the T. cruzi Dm28c clone (TcI), a strain related to the sylvatic cycle of the parasite. PMID:24482508

Grisard, Edmundo Carlos; Teixeira, Santuza Maria Ribeiro; de Almeida, Luiz Gonzaga Paula; Stoco, Patricia Hermes; Gerber, Alexandra Lehmkuhl; Talavera-López, Carlos; Lima, Oberdan Cunha; Andersson, Björn

2014-01-01

202

Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium  

PubMed Central

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

2014-01-01

203

Draft Genome Sequence of Enterobacter cloacae Strain JD6301.  

PubMed

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

Wilson, Jessica G; French, William T; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W; Champlin, Franklin R; Donaldson, Janet R

2014-01-01

204

Human Computer Interaction Approaches in Genomic Sequencing Restriction  

E-print Network

Human Computer Interaction Approaches in Genomic Sequencing Restriction Digest Microscopy set of data for genomic analysis. Enter the field of human-computer interaction. Currently the state- cess. Let the computer run the large analyses and then generate visual maps that humans can observe

Wurtele, Eve Syrkin

205

Draft Genome Sequence of Mycobacterium cosmeticum DSM 44829  

PubMed Central

We announce the draft genome sequence of Mycobacterium cosmeticum strain DSM 44829, a nontuberculous species responsible for opportunistic infection. The genome described here is composed of 6,462,090 bp, with a G+C content of 68.24%. It contains 6,281 protein-coding genes and 75 predicted RNA genes. PMID:24723727

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

206

Genome Sequence of Fusarium graminearum Isolate CS3005  

PubMed Central

Fusarium graminearum is one of the most important fungal pathogens of wheat, barley, and maize worldwide. This announcement reports the genome sequence of a highly virulent Australian isolate of this species to supplement the existing genome of the North American F. graminearum isolate Ph1. PMID:24744326

Stiller, Jiri; Kazan, Kemal

2014-01-01

207

Draft Genome Sequence of Corynebacterium pseudodiphtheriticum Strain 090104 "Sokolov".  

PubMed

This report describes the first draft genome sequence of a Corynebacterium pseudodiphtheriticum strain. The information on the genome organization and putative gene products will assist in better understanding of the molecular mechanisms involved in the beneficial probiotic effects of this bacterium. PMID:24201200

Karlyshev, Andrey V; Melnikov, Vyacheslav G

2013-01-01

208

Draft Genome Sequence of Necropsobacter rosorum Strain P709T  

PubMed Central

Necropsobacter is a recently described genus that contains a single species, N. rosorum, and belongs to the family Pasteurellaceae. Here, we present the draft genome of N. rosorum strain P709T, which is the first genome sequence from this species. PMID:25301642

Padmanabhan, Roshan; Robert, Catherine; Fenollar, Florence; Raoult, Didier

2014-01-01

209

Draft genome sequences of 10 strains of the genus Exiguobacterium.  

PubMed

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

2014-01-01

210

Complete genome sequence of pronghorn virus, a pestivirus.  

PubMed

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01
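
The open reading frame figures in the abstract above can be checked with simple codon arithmetic. The sketch below assumes that the reported 11,694 bases include the terminating stop codon; the abstract does not say so, and the helper name `orf_length_nt` is hypothetical.

```python
def orf_length_nt(amino_acids, include_stop=True):
    """Nucleotide length of an ORF encoding the given number of amino acids.

    Each amino acid is encoded by one 3-nucleotide codon. When include_stop
    is True, one extra codon is counted for the stop codon (an assumption
    about how the ORF length was reported).
    """
    codons = amino_acids + (1 if include_stop else 0)
    return codons * 3


# 3,897 amino acids, as reported in the abstract.
print(orf_length_nt(3897))                      # coding codons plus stop codon
print(orf_length_nt(3897, include_stop=False))  # coding codons only
```

Under that assumption, 3,897 codons plus one stop codon give 3,898 x 3 = 11,694 nucleotides, which matches the reported ORF length.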

211

Complete Genome Sequence of Pronghorn Virus, a Pestivirus  

PubMed Central

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

Ridpath, Julia F.; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01

212

The tomato genome sequence provides insight into fleshy fruit evolution  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

213

Draft Genome Sequence of Highly Nematicidal Bacillus thuringiensis DB27  

PubMed Central

Here, we report the genome sequence of nematicidal Bacillus thuringiensis DB27, which provides first insights into the genetic determinants of its pathogenicity to nematodes. The genome consists of a 5.7-Mb chromosome and seven plasmids, three of which contain genes encoding nematicidal proteins. PMID:24558243

Corton, Craig; Pickard, Derek J.; Dougan, Gordon

2014-01-01

214

Genome Sequences of Five B1 Subcluster Mycobacteriophages  

PubMed Central

Mycobacteriophages infect members of the Mycobacterium genus in the phylum Actinobacteria and exhibit remarkable diversity. Genome analysis groups the thousands of known mycobacteriophages into clusters, of which the B1 subcluster is currently the third most populous. We report the complete genome sequences of five additional members of the B1 subcluster. PMID:24285667

Barrus, E. Zane; Benedict, Alex B.; Brighton, Alicia K.; Fisher, Joshua N. B.; Gardner, Adam V.; Kartchner, Brittany J.; Ladle, Kara C.; Lunt, Bryce L.; Merrill, Bryan D.; Morrell, John D.; Burnett, Sandra H.

2013-01-01

215

Genome Sequence of a Thermophilic Bacillus, Geobacillus thermodenitrificans DSM465  

PubMed Central

Geobacillus thermodenitrificans NG80-2 encodes a LadA-mediated alkane degradation pathway, while G. thermodenitrificans DSM465 cannot utilize alkanes. Here, we report the draft genome sequence of G. thermodenitrificans DSM465, which may help reveal the genomic differences between these two strains with regard to the biodegradation of alkanes. PMID:24336381

Yao, Nana; Ren, Yi

2013-01-01

216

Draft Genome Sequence of Enterobacter cloacae Strain JD6301  

PubMed Central

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

2014-01-01

217

Genome Sequence of Pectobacterium sp. Strain SCC3193  

PubMed Central

We report the complete and annotated genome sequence of the plant-pathogenic enterobacterium Pectobacterium sp. strain SCC3193, a model strain isolated from potato in Finland. The Pectobacterium sp. SCC3193 genome consists of a 516,411-bp chromosome, with no plasmids. PMID:23045508

Koskinen, J. Patrik; Laine, Pia; Niemi, Outi; Nykyri, Johanna; Harjunpää, Heidi; Auvinen, Petri; Paulin, Lars; Pirhonen, Minna; Palva, Tapio

2012-01-01

218

Genome sequence of the milbemycin-producing bacterium Streptomyces bingchenggensis.  

PubMed

Streptomyces bingchenggensis is a soil-dwelling bacterium producing the commercially important anthelmintic macrolide milbemycins. Besides milbemycins, the insecticidal polyether antibiotic nanchangmycin and some other antibiotics have also been isolated from this strain. Here we report the complete genome sequence of S. bingchenggensis. The availability of the genome sequence of S. bingchenggensis should enable us to understand the biosynthesis of these structurally intricate antibiotics better and facilitate rational improvement of this strain to increase their titers. PMID:20581206

Wang, Xiang-Jing; Yan, Yi-Jun; Zhang, Bo; An, Jing; Wang, Ji-Jia; Tian, Jun; Jiang, Ling; Chen, Yi-Hua; Huang, Sheng-Xiong; Yin, Min; Zhang, Ji; Gao, Ai-Li; Liu, Chong-Xi; Zhu, Zhao-Xiang; Xiang, Wen-Sheng

2010-09-01

219

Whole Genome and Transcriptome Sequencing of a B3 Thymoma  

PubMed Central

Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma. PMID:23577124

Petrini, Iacopo; Rajan, Arun; Pham, Trung; Voeller, Donna; Davis, Sean; Gao, James; Wang, Yisong; Giaccone, Giuseppe

2013-01-01

220

Draft genome sequence of Gluconobacter thailandicus NBRC 3257  

PubMed Central

Gluconobacter thailandicus strain NBRC 3257, isolated from downy cherry (Prunus tomentosa), is a strict aerobic rod-shaped Gram-negative bacterium. Here, we report the features of this organism, together with the draft genome sequence and annotation. The draft genome sequence is composed of 107 contigs for 3,446,046 bp with 56.17% G+C content and contains 3,360 protein-coding genes and 54 RNA genes. PMID:25197448

Matsutani, Minenosuke; Yakushi, Toshiharu

2014-01-01

221

Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii  

Microsoft Academic Search

The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

1996-01-01

222

The Complete Nucleic Acid Sequence of the Adenovirus Type 5 Reference Material (ARM) Genome  

Microsoft Academic Search

development, assay validation, and product characterization.7 FDA's Center for Biologics Evaluation and Research (CBER) revised their requirements for pre-Phase I sequence analysis of viral vectors used for gene transfer trials such that it is now necessary to sequence and analyze the entire genome of viral vectors ≤ 40 kilobases (kb) in length (e.g., adenovirus, adeno-associated virus, and retro/lentivirus),

BARRY J. SUGARMAN; BETH M. HUTCHINS; DIANE L. MCALLISTER; FEI LU; KENNETH B. THOMAS

223

Large-Scale Sequencing: The Future of Genomic Sciences Colloquium  

SciTech Connect

Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and archiving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to be able to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them. 
Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

Margaret Riley; Merry Buckley

2009-01-01

224

Mitochondrial Genome Sequence of the Legume Vicia faba  

PubMed Central

The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found: 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs; 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376

Negruk, Valentine

2013-01-01

225

Complete Genome Sequence of Methanobacterium thermoautotrophicum ΔH: Functional Analysis and Comparative Genomics  

Microsoft Academic Search

The complete 1,751,377-bp sequence of the genome of the thermophilic archaeon Methanobacterium thermoautotrophicum ΔH has been determined by a whole-genome shotgun sequencing approach. A total of 1,855 open reading frames (ORFs) have been identified that appear to encode polypeptides, 844 (46%) of which have been assigned putative functions based on their similarities to database sequences with assigned functions. A

DOUGLAS R. SMITH; LYNN A. DOUCETTE-STAMM; CRAIG DELOUGHERY; HONGMEI LEE; JOANN DUBOIS; TYLER ALDREDGE; ROMINA BASHIRZADEH; DERRON BLAKELY; ROBIN COOK; KATIE GILBERT; DAWN HARRISON; LIEU HOANG; PAMELA KEAGLE; WENDY LUMM; BRYAN POTHIER; DAYONG QIU; ROB SPADAFORA; RITA VICAIRE; YING WANG; JAMEY WIERZBOWSKI; RENE GIBSON; NILOFER JIWANI; ANTHONY CARUSO; DAVID BUSH; HERSHEL SAFER; DONIVAN PATWELL; SHASHI PRABHAKAR; STEVE MCDOUGALL; GEORGE SHIMER; ANIL GOYAL; SHMUEL PIETROKOVSKI; GEORGE M. CHURCH; CHARLES J. DANIELS; JEN-I MAO; PHIL RICE; JORK NOLLING; JOHN N. REEVE

1997-01-01

226

Characterizing the walnut genome through analyses of BAC end sequences.  

PubMed

Persian walnut (Juglans regia L.) is an economically important tree for its nut crop and timber. To gain insight into the structure and evolution of the walnut genome, we constructed two bacterial artificial chromosome (BAC) libraries, containing a total of 129,024 clones, from in vitro-grown shoots of J. regia cv. Chandler using the HindIII and MboI cloning sites. A total of 48,218 high-quality BAC end sequences (BESs) were generated, with an accumulated sequence length of 31.2 Mb, representing approximately 5.1% of the walnut genome. Analysis of repeat DNA content in BESs revealed that approximately 15.42% of the genome consists of known repetitive DNA, while walnut-unique repetitive DNA identified in this study constitutes 13.5% of the genome. Among the walnut-unique repetitive DNA, Julia SINE and JrTRIM elements represent the first identified walnut short interspersed element (SINE) and terminal-repeat retrotransposon in miniature (TRIM) element, respectively; both types of elements are abundant in the genome. As in other species, these SINEs and TRIM elements could be exploited for developing repeat DNA-based molecular markers in walnut. Simple sequence repeats (SSR) from BESs were analyzed and found to be more abundant in BESs than in expressed sequence tags. The density of SSR in the walnut genome analyzed was also slightly higher than that in poplar and papaya. Sequence analysis of BESs indicated that approximately 11.5% of the walnut genome represents a coding sequence. This study is an initial characterization of the walnut genome and provides the largest genomic resource currently available; as such, it will be a valuable tool in studies aimed at genetically improving walnut. PMID:22101470

Wu, Jiajie; Gu, Yong Q; Hu, Yuqin; You, Frank M; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Dvorak, Jan; Luo, Ming-Cheng

2012-01-01

227

Draft Genome Sequence of Campylobacter ureolyticus Strain CIT007, the First Whole-Genome Sequence of a Clinical Isolate  

PubMed Central

Herein, we present the draft genome sequence of Campylobacter ureolyticus. Strain CIT007 was isolated from a stool sample from an elderly female presenting with diarrheal illness and end-stage chronic renal disease. PMID:24723712

Lucid, Alan; Bullman, Susan; Koziel, Monika; Corcoran, Gerard D.; Cotter, Paul D.; Lucey, Brigid

2014-01-01

228

Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

Schulman, Al

2009-08-09

229

Complete mitochondrial genome sequence of Aoluguya reindeer (Rangifer tarandus).  

PubMed

The complete mitochondrial genome of the reindeer, Rangifer tarandus, was determined by accurate polymerase chain reaction. The entire genome is 16,357 bp in length and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a D-loop region, all of which are arranged in a typical vertebrate manner. The overall base composition of the reindeer's mitochondrial genome is 33.7% A, 23.1% C, 30.1% T and 13.2% G. A termination associated sequence and several conserved central sequence block domains were discovered within the control region. PMID:25469816

Ju, Yan; Liu, Huamiao; Rong, Min; Yang, Yifeng; Wei, Haijun; Shao, Yuanchen; Chen, Xiumin; Xing, Xiumei

2014-12-01

230

Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology  

PubMed Central

Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

2014-01-01

231

Short reads, circular genome: skimming solid sequence to construct the bighorn sheep mitochondrial genome.  

PubMed

As sequencing technology improves, an increasing number of projects aim to generate full genome sequence, even for nonmodel taxa. These projects may be feasibly conducted at lower read depths if the alignment can be aided by previously developed genomic resources from a closely related species. We investigated the feasibility of constructing a complete mitochondrial (mt) genome without preamplification or other targeting of the sequence. Here we present a full mt genome sequence (16,463 nucleotides) for the bighorn sheep (Ovis canadensis) generated through alignment of SOLiD short-read sequences to a reference genome. Average read depth was 1240, and each base was covered by at least 36 reads. We then conducted a phylogenomic analysis with 27 other bovid mitogenomes, which placed bighorn sheep firmly in the Ovis clade. These results show that it is possible to generate a complete mitogenome by skimming a low-coverage genomic sequencing library. This technique will become increasingly applicable as the number of taxa with some level of genome sequence rises. PMID:21948953

Miller, Joshua M; Malenfant, René M; Moore, Stephen S; Coltman, David W

2012-01-01

232

Draft genome sequence of adzuki bean, Vigna angularis.  

PubMed

Adzuki bean (Vigna angularis var. angularis) is a dietary legume crop in East Asia. The presumed progenitor (Vigna angularis var. nipponensis) is widely found in East Asia, suggesting speciation and domestication in these temperate climate regions. Here, we report a draft genome sequence of adzuki bean. The genome assembly covers 75% of the estimated genome and was mapped to 11 pseudo-chromosomes. Gene prediction revealed 26,857 high confidence protein-coding genes evidenced by RNAseq of different tissues. Comparative gene expression analysis with V. radiata showed that the tissue specificity of orthologous genes was highly conserved. Additional re-sequencing of wild adzuki bean, V. angularis var. nipponensis, and V. nepalensis, was performed to analyze the variations between cultivated and wild adzuki bean. The determined divergence time of adzuki bean and the wild species predated archaeology-based domestication time. The present genome assembly will accelerate the genomics-assisted breeding of adzuki bean. PMID:25626881

Kang, Yang Jae; Satyawan, Dani; Shim, Sangrea; Lee, Taeyoung; Lee, Jayern; Hwang, Won Joo; Kim, Sue K; Lestari, Puji; Laosatit, Kularb; Kim, Kil Hyun; Ha, Tae Joung; Chitikineni, Annapurna; Kim, Moon Young; Ko, Jong-Min; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Yeong-Ho; Park, Beom-Seok; Varshney, Rajeev K; Lee, Suk-Ha

2015-01-01

233

Complete genome sequence of Kangiella koreensis type strain (SW-125).  

PubMed

Kangiella koreensis (Yoon et al. 2004) is the type species of the genus and is of phylogenetic interest because of the very isolated location of the genus Kangiella in the gammaproteobacterial order Oceanospirillales. K. koreensis SW-125(T) is a Gram-negative, non-motile, non-spore-forming bacterium isolated from tidal flat sediments at Daepo Beach, Yellow Sea, Korea. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence from the genus Kangiella and only the fourth genome from the order Oceanospirillales. This 2,852,073 bp long single replicon genome with its 2647 protein-coding and 48 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304661

Han, Cliff; Sikorski, Johannes; Lapidus, Alla; Nolan, Matt; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Copeland, Alex; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Saunders, Elizabeth; Brettin, Thomas; Göker, Markus; Tindall, Brian J; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Detter, John C

2009-01-01

234

Complete genome sequence of Kribbella flavida type strain (IFO 14399).  

PubMed

The genus Kribbella consists of 15 species, with Kribbella flavida (Park et al. 1999) as the type species. The name Kribbella was formed from the acronym of the Korea Research Institute of Bioscience and Biotechnology, KRIBB. Strains of the various Kribbella species were originally isolated from soil, potato, alum slate mine, patinas of catacombs or from horse racecourses. Here we describe the features of K. flavida together with the complete genome sequence and annotation. In addition to the 5.3 Mbp genome of Nocardioides sp. JS614, this is only the second completed genome sequence of the family Nocardioidaceae. The 7,579,488 bp long genome with its 7,086 protein-coding and 60 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304701

Pukall, Rüdiger; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Labutti, Kurt; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pitluck, Sam; Bruce, David; Goodwin, Lynne; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chen, Amy; Palaniappan, Krishna; Chain, Patrick; Rohde, Manfred; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Brettin, Thomas

2010-01-01

235

Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences  

PubMed Central

Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Results: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of the common carp genome. The first survey of the common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of the closely related zebrafish, BES of common carp were aligned against the zebrafish genome. A total of 39,335 BES of common carp have conserved homologs in the zebrafish genome, which demonstrates the high similarity between zebrafish and common carp genomes and indicates the feasibility of comparative mapping between zebrafish and common carp once a physical map of common carp is available. Conclusion: BAC end sequences are great resources for the first genome-wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of the common carp genome, indicating the higher complexity of the genome. 
Comparative analysis mapped around 40,000 BES to the zebrafish genome and established over 3,100 microsyntenies, covering over 50% of the zebrafish genome. BES of common carp are tremendous tools for comparative mapping between the two closely related species, zebrafish and common carp, which should facilitate both structural and functional genome analysis in common carp. PMID:21492448

2011-01-01
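
The sampling figures quoted in the common carp record above (65,720 BES, 647 bp average length, 2.5% of the genome) can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the function names are hypothetical and chosen for illustration only, and the genome size it derives is merely what the quoted 2.5% implies, not a value stated in the record.

```python
def sampled_bases(n_reads: int, mean_read_len: int) -> int:
    """Total bases sampled, estimated as read count times mean read length."""
    return n_reads * mean_read_len


def implied_genome_size(sampled_bp: int, fraction_of_genome: float) -> float:
    """Genome size implied by a sampled base count and the fraction it represents."""
    if not 0 < fraction_of_genome <= 1:
        raise ValueError("fraction_of_genome must be in (0, 1]")
    return sampled_bp / fraction_of_genome


# Figures taken from the record: 65,720 clean BES averaging 647 bp.
estimate = sampled_bases(65_720, 647)

# The record reports 42,522,168 bp as 2.5% of the genome.
genome_bp = implied_genome_size(42_522_168, 0.025)

print(f"estimated sampled bases: {estimate:,}")
print(f"implied genome size: {genome_bp / 1e9:.2f} Gb")
```

The product of the rounded average length and the read count differs slightly from the reported 42,522,168 bp, which is expected when the average has been rounded to a whole number of base pairs.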

236

Complete Genome Sequence of Equine Herpesvirus Type 9  

PubMed Central

Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

Yamaguchi, Tsuyoshi; Yamada, Souichi

2012-01-01

237

Complete genome sequence of equine herpesvirus type 9.  

PubMed

Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

Fukushi, Hideto; Yamaguchi, Tsuyoshi; Yamada, Souichi

2012-12-01

238

Transcriptome and genome sequencing uncovers functional variation in humans  

PubMed Central

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

2013-01-01

239

Tandem Clusters of Membrane Proteins in Complete Genome Sequences  

E-print Network

genome sequences. Membrane proteins play important roles in living cells, such as for transport, energy sequences were concerned mostly with the estimation of the number of membrane proteins (Arkin et al. 1997; Boyd et al. 1998; Jones 1998; Wallin and von Heijne 1998). Paulsen et al. (1998) analyzed a specific

Kihara, Daisuke

240

Genome Sequences of Vibrio navarrensis, a Potential Human Pathogen  

PubMed Central

Vibrio navarrensis is an aquatic bacterium recently shown to be associated with human illness. We report the first genome sequences of three V. navarrensis strains obtained from clinical and environmental sources. Preliminary analyses of the sequences reveal that V. navarrensis contains genes commonly associated with virulence in other human pathogens. PMID:25414502

Gladney, Lori M.; Katz, Lee S.; Knipe, Kristen M.; Rowe, Lori A.; Conley, Andrew B.; Rishishwar, Lavanya; Mariño-Ramírez, Leonardo

2014-01-01

241

Environmental Genome Shotgun Sequencing of the Sargasso Sea  

Microsoft Academic Search

We have applied "whole-genome shotgun sequencing" to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These

J. Craig Venter; Karin Remington; John F. Heidelberg; Aaron L. Halpern; Doug Rusch; Dongying Wu; Ian Paulsen; Karen E. Nelson; William Nelson; Derrick E. Fouts; Samuel Levy; Anthony H. Knap; Michael W. Lomas; Ken Nealson; Owen White; Jeremy Peterson; Jeff Hoffman; Rachel Parsons; Holly Baden-Tillson; Cynthia Pfannkoch; Yu-Hui Rogers; Hamilton O. Smith

2004-01-01

242

Genome sequencing and analysis of the model grass Brachypodium distachyon  

Microsoft Academic Search

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum

David F. Garvin; Todd C. Mockler; Jeremy Schmutz; Dan Rokhsar; Kerrie Barry; Susan Lucas; Miranda Harmon-Smith; Kathleen Lail; Hope Tice; Jane Grimwood; Neil McKenzie; Naxin Huo; Yong Q. Gu; Gerard R. Lazo; Olin D. Anderson; Frank M. You; Ming-Cheng Luo; Jan Dvorak; Jonathan Wright; Melanie Febrer; Michael W. Bevan; Dominika Idziak; Robert Hasterok; Erika Lindquist; Mei Wang; Samuel E. Fox; Henry D. Priest; Sergei A. Filichkin; Scott A. Givan; Douglas W. Bryant; Jeff H. Chang; Haiyan Wu; Wei Wu; An-Ping Hsia; Patrick S. Schnable; Anantharaman Kalyanaraman; Brad Barbazuk; Todd P. Michael; Samuel P. Hazen; Jennifer N. Bragg; Debbie Laudencia-Chingcuanco; Yiqun Weng; Georg Haberer; Manuel Spannagl; Klaus Mayer; Thomas Rattei; Therese Mitros; Sang-Jik Lee; Jocelyn K. C. Rose; Lukas A. Mueller; Jan P. Buchmann; Jaakko Tanskanen; Heidrun Gundlach; Antonio Costa de Oliveira; Luciano da C. Maia; William Belknap; Ning Jiang; Jinsheng Lai; Liucun Zhu; Jianxin Ma; Cheng Sun; Florent Murat; Michael Abrouk; Remy Bruggmann; Joachim Messing; Noah Fahlgren; Christopher M. Sullivan; James C. Carrington; Elisabeth J. Chapman; Greg D. May; Jixian Zhai; Matthias Ganssmann; Sai Guna Ranjan Gurazada; Marcelo German; Ludmila Tyler; Jiajie Wu; James Thomson; Shan Chen; Henrik V. Scheller; Jesper Harholt; Peter Ulvskov; Jeffrey A. Kimbrel; Laura E. Bartley; Peijian Cao; Ki-Hong Jung; Manoj K. Sharma; Miguel Vega-Sanchez; Pamela Ronald; Christopher D. Dardick; Stefanie de Bodt; Wim Verelst; Dirk Inzé; Maren Heese; Arp Schnittger; Xiaohan Yang; Udaya C. Kalluri; Gerald A. Tuskan; Zhihua Hua; Richard D. Vierstra; Yu Cui; Shuhong Ouyang; Qixin Sun; Zhiyong Liu; Alper Yilmaz; Erich Grotewold; Richard Sibout; Kian Hematy; Gregory Mouille; Herman Höfte; Jérome Pelloux; Devin O'Connor; James Schnable; Scott Rowe; Frank Harmon; Cynthia L. Cass; John C. Sedbrook; Mary E. 
Byrne; Sean Walsh; Janet Higgins; Pinghua Li; Thomas Brutnell; Turgay Unver; Hikmet Budak; Harry Belcram; Mathieu Charles; Boulos Chalhoub; Ivan Baxter

2010-01-01

243

PHYTOPHTHORA GENOME SEQUENCES UNCOVER EVOLUTIONARY ORIGINS AND MECHANISMS OF PATHOGENESIS  

Technology Transfer Automated Retrieval System (TEKTRAN)

Draft genome sequences of the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopiles with photosynthetic algae such as diatoms, and the Phytophthora sequences sugges...

244

Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes  

Microsoft Academic Search

The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary

Einat Hazkani-Covo; Raymond M. Zeller; William Martin

2010-01-01

245

Sequencing the Genome of the Heirloom Watermelon Cultivar Charleston Gray  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the watermelon cultivar Charleston Gray, a major heirloom which has been used in breeding programs of many watermelon cultivars, was sequenced. Our strategy involved a hybrid approach using the Illumina and 454/Titanium next-generation sequencing technologies. For Illumina, shotgun g...

246

Complete genome sequencing and variant analysis of a Pakistani individual.  

PubMed

We sequenced the genome of a Pakistani male at 25.5x coverage using massively parallel sequencing technology. More than 90% of the sequence reads were mapped to the human reference genome. In subsequent analysis, we identified 3,224,311 single-nucleotide polymorphisms (SNPs), of which 388,532 (12% of the total SNPs) had not been previously recorded in the single nucleotide polymorphism database (dbSNP) or the 1000 Genomes Project database. The 5991 non-synonymous coding variants were screened for deleterious or disease-associated SNPs. Analysis of genes with deleterious SNPs identified 'retinoic acid signaling' and 'regulation of transcription' as the enriched Gene Ontology terms. Scanning of non-synonymous SNPs against the OMIM revealed several disease and phenotype-associated variants in the Pakistani genome. Comparative analysis with the Indian genome sequence revealed >1.8 million shared SNPs, 32% of which were annotated in ~14,000 genes. Gene Ontology (GO) term analysis of these genes identified 'response to jasmonic acid stimulus', 'aminoglycoside antibiotic metabolic process' and 'glycoside metabolic process' with considerable enrichment. A total of 59,558 small indels (1-5 bp) and 16,063 large structural variations were found, 54% of which were novel. The substantial number of novel structural variations discovered in the Pakistani genome reinforced previous inferences that (a) structural variations are a major type of variation in the genome and (b) compared with SNPs, they putatively exhibit equivalent or superior functional roles. This genome sequence information will be an important reference for population-wide genomics studies of the ethnically diverse South Asian subcontinent. PMID:23842039

Azim, Muhammad Kamran; Yang, Chuanchun; Yan, Zhixiang; Choudhary, Muhammad Iqbal; Khan, Asifullah; Sun, Xiao; Li, Ran; Asif, Huma; Sharif, Sana; Zhang, Yong

2013-09-01

247

Sequence Analysis of the Genome of Carnation (Dianthus caryophyllus L.)  

PubMed Central

The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172

Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

2014-01-01
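
The carnation record above reports N50 values for contigs and scaffolds. As a rough illustration of that statistic, and not the authors' own pipeline, the sketch below computes N50 as the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly length. The function name `n50` and the toy lengths are assumptions made for this example.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Hypothetical scaffold lengths, not data from the carnation assembly.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80: 100 + 80 reaches half of the 300 bp total
```

Under this definition, half of the assembled bases lie in sequences at least as long as the N50 value, which is why it is commonly quoted alongside the longest scaffold.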

248

Initial sequence and comparative analysis of the cat genome  

PubMed Central

The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ~65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

2007-01-01

249

Conserved terminal sequences of rice ragged stunt virus genomic RNA.  

PubMed

The 5'- and 3'-terminal nucleotide sequences of the dsRNA genome segments of rice ragged stunt virus (RRSV), a member of the plant Reoviridae, were determined and compared with those published for other viruses in this family. The 5'- and 3'-terminal regions of the RRSV plus strand RNA from all genome segments were found to have the same conserved hexanucleotide (5' GAUAAA---) and tetranucleotide (---GUGC 3') sequences, respectively. These conserved terminal sequences were different from those found in viruses in the Phytoreovirus and Fijivirus genera. This result confirms that, as already suggested, RRSV should be placed in a third genus. PMID:1634874

Yan, J; Kudo, H; Uyeda, I; Lee, S Y; Shikata, E

1992-04-01

250

Genome Sequence of Luminous Piezophile Photobacterium phosphoreum ANT-2200.  

PubMed

Bacteria of the genus Photobacterium thrive worldwide in oceans and show substantially varied lifestyles, including free-living, commensal, pathogenic, symbiotic, and piezophilic. Here, we present the genome sequence of a luminous, piezophilic Photobacterium phosphoreum strain, ANT-2200, isolated from a water column at 2,200 m depth in the Mediterranean Sea. It is the first genomic sequence of the P. phosphoreum group. An analysis of the sequence provides insight into the adaptation of bacteria to the deep-sea habitat. PMID:24744322

Zhang, Sheng-Da; Barbe, Valérie; Garel, Marc; Zhang, Wei-Jia; Chen, Haitao; Santini, Claire-Lise; Murat, Dorothée; Jing, Hongmei; Zhao, Yuan; Lajus, Aurélie; Martini, Séverine; Pradel, Nathalie; Tamburini, Christian; Wu, Long-Fei

2014-01-01

251

The complete mitochondrial genome sequence of Xenocypris davidi (Bleeker).  

PubMed

Xenocypris davidi is a member of Cyprinidae and is widely distributed in China. To understand the systematic status of this species, we sequenced the whole mitochondrial genome of Xenocypris davidi. The complete mitochondrial genome is 16,630 bp in length and has the typical structure of 22 tRNA, 2 rRNA and 13 protein-coding genes plus the non-coding region. The major non-coding sequence is the control region, which contains 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F). The second non-coding sequence is the origin of light-strand replication (OL). This region has the potential to fold into a stem-loop secondary structure. The mitochondrial genomic sequence will help us to study the conservation genetics and evolution of Xenocypris. PMID:23815323

Liu, Yu

2014-10-01

252

LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report  

SciTech Connect

Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

2010-01-26

253

Sequence-Based Mapping of the Polyploid Wheat Genome  

PubMed Central

The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied the next-generation sequencing of a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatical pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome we showed that mapped markers are located within 40-100 kb from genes providing a possibility for high-resolution mapping at the level of a single gene. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and linking physical and genetic maps of the wheat genome. PMID:23665877

Saintenac, Cyrille; Jiang, Dayou; Wang, Shichen; Akhunov, Eduard

2013-01-01

254

Pervasive sequence patents cover the entire human genome  

PubMed Central

The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association for Molecular Pathology v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays. PMID:23522065

2013-01-01
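
The record above counts, for each gene, how many other genes share at least one 15mer with it. A minimal sketch of that kind of tally follows; it is an illustration under stated assumptions, not the method used in the study. The function names, the toy sequences and the use of k=4 in the usage example are all hypothetical, and only the forward strand is considered (reverse complements are ignored for brevity).

```python
from collections import defaultdict


def kmers(seq, k=15):
    """Return the set of all length-k substrings of seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genes_sharing_kmers(genes, k=15):
    """Map each gene name to the set of other genes sharing at least one k-mer."""
    owners = defaultdict(set)                 # k-mer -> names of genes containing it
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    shared = {name: set() for name in genes}
    for names in owners.values():
        for name in names:
            shared[name] |= names - {name}    # every co-owner except the gene itself
    return shared


# Hypothetical toy sequences; k=4 keeps the example readable.
toy_genes = {
    "geneA": "ACGTACGTTT",
    "geneB": "GGACGTAC",
    "geneC": "TTTTTTTT",
}
for gene, partners in genes_sharing_kmers(toy_genes, k=4).items():
    print(gene, len(partners))
```

Indexing k-mers by owner keeps the comparison close to linear in total sequence length, rather than comparing every pair of genes directly.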

255

Genome sequencing highlights the dynamic early history of dogs.  

PubMed

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11-16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary. PMID:24453982

Freedman, Adam H; Gronau, Ilan; Schweizer, Rena M; Ortega-Del Vecchyo, Diego; Han, Eunjung; Silva, Pedro M; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Beale, Holly; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R; Parker, Heidi G; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D; Harkins, Timothy T; Nelson, Stanley F; Ostrander, Elaine A; Marques-Bonet, Tomas; Wayne, Robert K; Novembre, John

2014-01-01

256

Genome Sequencing Highlights the Dynamic Early History of Dogs  

PubMed Central

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11–16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary. PMID:24453982

Freedman, Adam H.; Gronau, Ilan; Schweizer, Rena M.; Ortega-Del Vecchyo, Diego; Han, Eunjung; Silva, Pedro M.; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Beale, Holly; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R.; Parker, Heidi G.; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D.; Harkins, Timothy T.; Nelson, Stanley F.; Ostrander, Elaine A.; Marques-Bonet, Tomas; Wayne, Robert K.; Novembre, John

2014-01-01

257

A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)  

ScienceCinema

Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

FitzGerald, Michael [Broad Institute

2013-02-12

258

Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)  

DOE Data Explorer

Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]

Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.
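
The GST record above describes releasing 21-bp tags at fixed positions relative to the fragmenting enzyme's sites and then profiling tag sequences and abundances. The sketch below is a simplified in-silico analogue, not the laboratory protocol. The recognition site "CATG", the choice to take the 21 bases immediately downstream of each site, the forward-strand-only scan, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def extract_tags(dna, site="CATG", tag_len=21):
    """Collect the tag_len bases immediately downstream of each site occurrence."""
    dna = dna.upper()
    tags = []
    start = dna.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = dna[tag_start:tag_start + tag_len]
        if len(tag) == tag_len:               # skip sites too close to the end
            tags.append(tag)
        start = dna.find(site, start + 1)     # allow overlapping site matches
    return tags


def tag_profile(samples, site="CATG", tag_len=21):
    """Count tag abundances across a list of DNA sequences."""
    counts = Counter()
    for dna in samples:
        counts.update(extract_tags(dna, site, tag_len))
    return counts


# Hypothetical input: one site with a full-length tag, one site too near the end.
toy_dna = "AACATG" + "A" * 21 + "GGCATGTT"
print(tag_profile([toy_dna, toy_dna]).most_common(1))
```

A `Counter` keyed by tag sequence is one simple way to hold the "sequences and abundances" the record mentions; a real profile would also need to handle both strands and sequencing errors.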

259

Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner  

PubMed Central

We define a “threaded blockset,” which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for “threaded blockset aligner”) builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser. PMID:15060014

Blanchette, Mathieu; Kent, W. James; Riemer, Cathy; Elnitski, Laura; Smit, Arian F.A.; Roskin, Krishna M.; Baertsch, Robert; Rosenbloom, Kate; Clawson, Hiram; Green, Eric D.; Haussler, David; Miller, Webb

2004-01-01

260

The Genome of Polymorphonuclear Neutrophils Maintains Normal Coding Sequences  

PubMed Central

Genetic studies often use genomic DNA from whole blood cells, of which the majority are the polymorphonuclear myeloid cells. Those cells undergo a dramatic change in nuclear morphology following cellular differentiation. It remains unclear whether the nuclear morphological change is accompanied by sequence alterations from the intact genome. If such an event exists, it will cause a serious problem in using this type of genomic DNA for genetic study, as the sequences will not represent the intact genome in the host individuals. Using exome sequencing, we compared the coding regions between neutrophil, which is the major type of polymorphonuclear cells, and CD4+ T cell, which has an intact genome, from the same individual. The results show that exon sequences between the two cell types are essentially the same. The minor differences represented by the missed exons and base changes between the two cell types were validated to be mainly caused by experimental errors. Our study concludes that genomic DNA from whole blood cells can be safely used for genetic studies. PMID:24250807

Wen, Hongxiu; Luo, Jiangtao; Chen, Peixian; Cowan, Kenneth; Wang, San Ming

2013-01-01

261

Inconsistencies in Neanderthal Genomic DNA Sequences  

Microsoft Academic Search

Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing

Jeffrey D. Wall; Sung K. Kim

2007-01-01

262

Comprehensive Genome Sequence Analysis of a Breast Cancer Amplicon  

PubMed Central

Gene amplification occurs in most solid tumors and is associated with poor prognosis. Amplification of 20q13.2 is common to several tumor types including breast cancer. The 1 Mb of sequence spanning the 20q13.2 breast cancer amplicon is one of the most exhaustively studied segments of the human genome. These studies have included amplicon mapping by comparative genomic hybridization (CGH), fluorescent in-situ hybridization (FISH), array-CGH, quantitative microsatellite analysis (QUMA), and functional genomic studies. Together these studies revealed a complex amplicon structure suggesting the presence of at least two driver genes in some tumors. One of these, ZNF217, is capable of immortalizing human mammary epithelial cells (HMEC) when overexpressed. In addition, we now report on the sequencing of this region in human and mouse, and on quantitative expression studies in tumors. Amplicon localization now is straightforward and the availability of human and mouse genomic sequence facilitates their functional analysis. However, comprehensive annotation of megabase-scale regions requires integration of vast amounts of information. We present a system for integrative analysis and demonstrate its utility on 1.2 Mb of sequence spanning the 20q13.2 breast cancer amplicon and 865 kb of syntenic murine sequence. We integrate tumor genome copy number measurements with exhaustive genome landscape mapping, showing that amplicon boundaries are associated with maxima in repetitive element density and a region of evolutionary instability. This integration of comprehensive sequence annotation, quantitative expression analysis, and tumor amplicon boundaries provides evidence for an additional driver gene prefoldin 4 (PFDN4), coregulated genes, and conserved noncoding regions, and associates repetitive elements with regions of genomic instability at this locus. PMID:11381030

Collins, Colin; Volik, Stanislav; Kowbel, David; Ginzinger, David; Ylstra, Bauke; Cloutier, Thomas; Hawkins, Trevor; Predki, Paul; Martin, Christopher; Wernick, Meredith; Kuo, Wen-Lin; Alberts, Arthur; Gray, Joe W.

2001-01-01

263

Sequencing the nuclear genome of the extinct woolly mammoth.  

PubMed

In 1994, two independent groups extracted DNA from several Pleistocene epoch mammoths and noted differences among individual specimens. Subsequently, DNA sequences have been published for a number of extinct species. However, such ancient DNA is often fragmented and damaged, and studies to date have typically focused on short mitochondrial sequences, never yielding more than a fraction of a per cent of any nuclear genome. Here we describe 4.17 billion bases (Gb) of sequence from several mammoth specimens, 3.3 billion (80%) of which are from the woolly mammoth (Mammuthus primigenius) genome and thus comprise an extensive set of genome-wide sequence from an extinct species. Our data support earlier reports that elephantid genomes exceed 4 Gb. The estimated divergence rate between mammoth and African elephant is half of that between human and chimpanzee. The observed number of nucleotide differences between two particular mammoths was approximately one-eighth of that between one of them and the African elephant, corresponding to a separation between the mammoths of 1.5-2.0 Myr. The estimated probability that orthologous elephant and mammoth amino acids differ is 0.002, corresponding to about one residue per protein. Differences were discovered between mammoth and African elephant in amino-acid positions that are otherwise invariant over several billion years of combined mammalian evolution. This study shows that nuclear genome sequencing of extinct species can reveal population differences not evident from the fossil record, and perhaps even discover genetic factors that affect extinction. PMID:19020620

Miller, Webb; Drautz, Daniela I; Ratan, Aakrosh; Pusey, Barbara; Qi, Ji; Lesk, Arthur M; Tomsho, Lynn P; Packard, Michael D; Zhao, Fangqing; Sher, Andrei; Tikhonov, Alexei; Raney, Brian; Patterson, Nick; Lindblad-Toh, Kerstin; Lander, Eric S; Knight, James R; Irzyk, Gerard P; Fredrikson, Karin M; Harkins, Timothy T; Sheridan, Sharon; Pringle, Tom; Schuster, Stephan C

2008-11-20

264

A draft sequence of the Neandertal Genome  

E-print Network

Where were the Neandertals? How would you tell (if there was gene flow)? Look for parts of the genome

Borenstein, Elhanan

265

Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)  

SciTech Connect

Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family Alicyclobacillaceae. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Alicyclobacillaceae. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Meincke, Linda [Los Alamos National Laboratory (LANL); Sims, David [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Wahrenburg, Claudia [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

2010-01-01

266

Complete genome sequence of Arthrobacter sp. strain FB24  

SciTech Connect

Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), encoding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

2013-09-30

267

Complete genome sequence of Desulfohalobium retbaense type strain (HR(100)).  

PubMed

Desulfohalobium retbaense (Ollivier et al. 1991) is the type species of the polyphyletic genus Desulfohalobium, which comprises, at the time of writing, two species and represents the family Desulfohalobiaceae within the Deltaproteobacteria. D. retbaense is a moderately halophilic sulfate-reducing bacterium, which can utilize H(2) and a limited range of organic substrates, which are incompletely oxidized to acetate and CO(2), for growth. The type strain HR(100) (T) was isolated from sediments of the hypersaline Retba Lake in Senegal. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Desulfohalobiaceae. The 2,909,567 bp genome (one chromosome and a 45,263 bp plasmid) with its 2,552 protein-coding and 57 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304676

Spring, Stefan; Nolan, Matt; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Land, Miriam; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Munk, Christine; Kiss, Hajnalka; Chain, Patrick; Han, Cliff; Brettin, Thomas; Detter, John C; Schüler, Esther; Göker, Markus; Rohde, Manfred; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter

2010-01-01

268

Genome rearrangements caused by interstitial telomeric sequences in yeast  

PubMed Central

Interstitial telomeric sequences (ITSs) are present in many eukaryotic genomes and are linked to genome instabilities and disease in humans. The mechanisms responsible for ITS-mediated genome instability are not understood in molecular detail. Here, we use a model Saccharomyces cerevisiae system to characterize genome instability mediated by yeast telomeric (Ytel) repeats embedded within an intron of a reporter gene inside a yeast chromosome. We observed a very high rate of small insertions and deletions within the repeats. We also found frequent gross chromosome rearrangements, including deletions, duplications, inversions, translocations, and formation of acentric minichromosomes. The inversions are a unique class of chromosome rearrangement involving an interaction between the ITS and the true telomere of the chromosome. Because we previously found that Ytel repeats cause strong replication fork stalling, we suggest that formation of double-stranded DNA breaks within the Ytel sequences might be responsible for these gross chromosome rearrangements. PMID:24191060

Aksenova, Anna Y.; Greenwell, Patricia W.; Dominska, Margaret; Shishkin, Alexander A.; Kim, Jane C.; Petes, Thomas D.; Mirkin, Sergei M.

2013-01-01

269

Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)  

PubMed Central

Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family ‘Alicyclobacillaceae’. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family ‘Alicyclobacillaceae’. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304673

Mavromatis, Konstantinos; Sikorski, Johannes; Lapidus, Alla; Glavina Del Rio, Tijana; Copeland, Alex; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Chain, Patrick; Meincke, Linda; Sims, David; Chertkov, Olga; Han, Cliff; Brettin, Thomas; Detter, John C.; Wahrenburg, Claudia; Rohde, Manfred; Pukall, Rüdiger; Göker, Markus; Bristow, Jim; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C.

2010-01-01

270

Sequencing and analysis of an Irish human genome  

PubMed Central

Background: Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results: Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions: Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge. PMID:20822512

2010-01-01

271

The genome sequence of the filamentous fungus Neurospora crassa  

Microsoft Academic Search

Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes, more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis

James E. Galagan; Sarah E. Calvo; Katherine A. Borkovich; Eric U. Selker; Nick D. Read; David Jaffe; William FitzHugh; Li-Jun Ma; Serge Smirnov; Seth Purcell; Bushra Rehman; Timothy Elkins; Reinhard Engels; Shunguang Wang; Cydney B. Nielsen; Jonathan Butler; Matthew Endrizzi; Dayong Qui; Peter Ianakiev; Deborah Bell-Pedersen; Mary Anne Nelson; Margaret Werner-Washburne; Claude P. Selitrennikoff; John A. Kinsey; Edward L. Braun; Alex Zelter; Ulrich Schulte; Gregory O. Kothe; Gregory Jedd; Werner Mewes; Chuck Staben; Edward Marcotte; David Greenberg; Alice Roy; Karen Foley; Jerome Naylor; Nicole Stange-Thomann; Robert Barrett; Sante Gnerre; Michael Kamal; Manolis Kamvysselis; Evan Mauceli; Cord Bielke; Stephen Rudd; Dmitrij Frishman; Svetlana Krystofova; Carolyn Rasmussen; Robert L. Metzenberg; David D. Perkins; Scott Kroken; Carlo Cogoni; Giuseppe Macino; David Catcheside; Weixi Li; Robert J. Pratt; Stephen A. Osmani; Colin P. C. DeSouza; Louise Glass; Marc J. Orbach; J. Andrew Berglund; Rodger Voelker; Oded Yarden; Michael Plamann; Stephan Seiler; Jay Dunlap; Alan Radford; Rodolfo Aramayo; Donald O. Natvig; Lisa A. Alex; Gertrud Mannhaupt; Daniel J. Ebbole; Michael Freitag; Ian Paulsen; Matthew S. Sachs; Eric S. Lander; Chad Nusbaum; Bruce Birren

2003-01-01

272

Structure and sequence of the saimiriine herpesvirus 1 genome  

Microsoft Academic Search

We report here the complete genome sequence of the squirrel monkey α-herpesvirus saimiriine herpesvirus 1 (HVS1). Unlike the simplexviruses of other primate species, only the unique short region of the HVS1 genome is bounded by inverted repeats. While all Old World simian simplexviruses characterized to date lack the herpes simplex virus RL1 (γ34.5) gene, HVS1 has an RL1 gene. HVS1

Shaun Tyler; Alberto Severini; Darla Black; Matthew Walker; R. Eberle

2011-01-01

273

Genome sequence of the plant pathogen Ralstonia solanacearum  

Microsoft Academic Search

Ralstonia solanacearum is a devastating, soil-borne plant pathogen with a global distribution and an unusually wide host range. It is a model system for the dissection of molecular determinants governing pathogenicity. We present here the complete genome sequence and its analysis of strain GMI1000. The 5.8-megabase (Mb) genome is organized into two replicons: a 3.7-Mb chromosome and a 2.1-Mb megaplasmid.

M. Salanoubat; S. Genin; F. Artiguenave; J. Gouzy; S. Mangenot; M. Arlat; A. Billault; P. Brottier; J. C. Camus; L. Cattolico; M. Chandler; N. Choisne; C. Claudel-Renard; S. Cunnac; N. Demange; C. Gaspin; M. Lavie; A. Moisan; C. Robert; W. Saurin; T. Schiex; P. Siguier; P. Thébault; M. Whalen; P. Wincker; M. Levy; J. Weissenbach; C. A. Boucher

2002-01-01

274

Mitochondrial genome sequence of the bluegill sunfish (Lepomis macrochirus).  

PubMed

The bluegill sunfish (Lepomis macrochirus) belongs to the genus Lepomis of the family Centrarchidae and is an economically important freshwater species in China. This study presents the complete mitochondrial genome of L. macrochirus, which is the first complete sequence from a sunfish species. L. macrochirus mitochondrial DNA is 16,489 bp long, with the genome organization and gene order being identical to those of typical vertebrates. PMID:22165836

Li, Sheng-Jie; Cai, Lei; Bai, Jun-Jie

2011-10-01

275

The genome sequence of the plant pathogen Xylella fastidiosa  

Microsoft Academic Search

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis—a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign

A. J. G. Simpson; F. C. Reinach; P. Arruda; F. A. Abreu; M. Acencio; R. Alvarenga; L. M. C. Alves; J. E. Araya; G. S. Baia; C. S. Baptista; M. H. Barros; E. D. Bonaccorsi; S. Bordin; J. M. Bové; M. R. S. Briones; A. A. Camargo; L. E. A. Camargo; D. M. Carraro; H. Carrer; N. B. Colauto; C. Colombo; F. F. Costa; M. C. R. Costa; C. M. Costa-Neto; L. L. Coutinho; M. Cristofani; E. Dias-Neto; C. Docena; H. El-Dorry; A. P. Facincani; A. J. S. Ferreira; V. C. A. Ferreira; J. A. Ferro; J. S. Fraga; S. C. França; M. C. Franco; L. R. Furlan; M. Garnier; G. H. Goldman; M. H. S. Goldman; S. L. Gomes; A. Gruber; P. L. Ho; J. D. Hoheisel; M. L. Junqueira; E. L. Kemper; J. P. Kitajima; E. E. Kuramae; F. Laigret; M. R. Lambais; L. C. C. Leite; E. G. M. Lemos; M. V. F. Lemos; S. A. Lopes; C. R. Lopes; J. A. Machado; M. A. Machado; A. M. B. N. Madeira; H. M. F. Madeira; C. L. Marino; M. V. Marques; E. A. L. Martins; E. M. F. Martins; A. Y. Matsukuma; C. F. M. Menck; E. C. Miracca; C. Y. Miyaki; C. B. Monteiro-Vitorello; D. H. Moon; M. A. Nagai; A. L. T. O. Nascimento; L. E. S. Netto; A. Nhani; F. G. Nobrega; L. R. Nunes; M. A. Oliveira; M. C. de Oliveira; R. C. de Oliveira; D. A. Palmieri; B. R. Peixoto; G. A. G. Pereira; H. A. Pereira; J. B. Pesquero; R. B. Quaggio; P. G. Roberto; V. Rodrigues; A. J. de M. Rosa; V. E. de Rosa; R. G. de Sá; R. V. Santelli; H. E. Sawasaki; A. C. R. da Silva; F. R. da Silva; W. A. Silva; J. F. da Silveira; M. L. Z. Silvestri; W. J. Siqueira; A. A. de Souza; A. P. de Souza; M. F. Terenzi; D. Truffi; S. M. Tsai; M. H. Tsuhako; H. Vallada; M. A. Van Sluys; S. Verjovski-Almeida; A. L. Vettore; M. A. Zago; J. Meidanis; J. C. Setubal

2000-01-01
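
Illustrative note on the record above: the abstract reports a 52.7% GC-rich chromosome. As a minimal sketch of how such a figure is commonly computed (this is not code from the paper, and the function name gc_fraction is a hypothetical choice), GC content can be taken as the share of G and C among the unambiguous bases of a sequence:

```python
def gc_fraction(seq):
    """Return the fraction of G/C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored. This is an assumption of
    this sketch; other tools may count them differently.
    """
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / acgt


if __name__ == "__main__":
    # Toy sequence for illustration only, not X. fastidiosa data.
    print(f"{gc_fraction('GGCCAATT'):.1%}")
```

Applied to a full chromosome sequence read from a FASTA file, the same function would give a value comparable to the percentage quoted in the abstract.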

276

The genome sequence of the plant pathogen Xylella fastidiosa  

Microsoft Academic Search

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis, a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign

A. J. G. Simpson; F. C. Reinach; P. Arruda; F. A. Abreu; M. Acencio; R. Alvarenga; L. M. C. Alves; J. E. Araya; G. S. Baia; C. S. Baptista; M. H. Barros; E. D. Bonaccorsi; S. Bordin; J. M. Bové; M. R. S. Briones; M. R. P. Bueno; A. A. Camargo; L. E. A. Camargo; D. M. Carraro; H. Carrer; N. B. Colauto; C. Colombo; F. F. Costa; M. C. R. Costa; C. M. Costa-Neto; L. L. Coutinho; M. Cristofani; E. Dias-Neto; C. Docena; H. El-Dorry; A. P. Facincani; A. J. S. Ferreira; V. C. A. Ferreira; J. A. Ferro; J. S. Fraga; S. C. França; M. C. Franco; M. Frohme; L. R. Furlan; M. Garnier; G. H. Goldman; M. H. S. Goldman; S. L. Gomes; A. Gruber; P. L. Ho; J. D. Hoheisel; M. L. Junqueira; E. L. Kemper; J. P. Kitajima; J. E. Krieger; E. E. Kuramae; F. Laigret; M. R. Lambais; L. C. C. Leite; E. G. M. Lemos; M. V. F. Lemos; S. A. Lopes; C. R. Lopes; J. A. Machado; M. A. Machado; A. M. B. N. Madeira; H. M. F. Madeira; C. L. Marino; M. V. Marques; E. A. L. Martins; E. M. F. Martins; A. Y. Matsukuma; C. F. M. Menck; E. C. Miracca; C. Y. Miyaki; C. B. Monteiro-Vitorello; D. H. Moon; M. A. Nagai; A. L. T. O. Nascimento; L. E. S. Netto; A. Nhani; F. G. Nobrega; L. R. Nunes; M. A. Oliveira; M. C. de Oliveira; R. C. de Oliveira; D. A. Palmieri; B. R. Peixoto; G. A. G. Pereira; H. A. Pereira; J. B. Pesquero; R. B. Quaggio; P. G. Roberto; V. Rodrigues; A. J. de M. Rosa; V. E. de Rosa; R. G. de Sá; R. V. Santelli; H. E. Sawasaki; A. C. R. da Silva; A. M. da Silva; F. R. da Silva; W. A. Silva; J. F. da Silveira; M. L. Z. Silvestri; W. J. Siqueira; A. A. de Souza; A. P. de Souza; M. F. Terenzi; D. Truffi; S. M. Tsai; M. H. Tsuhako; H. Vallada; M. A. Van Sluys; S. Verjovski-Almeida; A. L. Vettore; M. A. Zago; M. Zatz; J. Meidanis; J. C. Setubal

2000-01-01

277

Draft genome sequence of the Tibetan antelope  

PubMed Central

The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation. PMID:23673643

Ge, Ri-Li; Cai, Qingle; Shen, Yong-Yi; San, A; Ma, Lan; Zhang, Yong; Yi, Xin; Chen, Yan; Yang, Lingfeng; Huang, Ying; He, Rongjun; Hui, Yuanyuan; Hao, Meirong; Li, Yue; Wang, Bo; Ou, Xiaohua; Xu, Jiaohui; Zhang, Yongfen; Wu, Kui; Geng, Chunyu; Zhou, Weiping; Zhou, Taicheng; Irwin, David M.; Yang, Yingzhong; Ying, Liu; Bao, Haihua; Kim, Jaebum; Larkin, Denis M.; Ma, Jian; Lewin, Harris A.; Xing, Jinchuan; Platt, Roy N.; Ray, David A.; Auvil, Loretta; Capitanu, Boris; Zhang, Xiufeng; Zhang, Guojie; Murphy, Robert W.; Wang, Jun; Zhang, Ya-Ping; Wang, Jian

2013-01-01

278

Exploring genome characteristics and sequence quality without a reference  

PubMed Central

Motivation: The de novo assembly of large, complex genomes is a significant challenge with currently available DNA sequencing technology. While many de novo assembly software packages are available, comparatively little attention has been paid to assisting the user with the assembly. Results: This article addresses the practical aspects of de novo assembly by introducing new ways to perform quality assessment on a collection of sequence reads. The software implementation calculates per-base error rates, paired-end fragment-size distributions and coverage metrics in the absence of a reference genome. Additionally, the software will estimate characteristics of the sequenced genome, such as repeat content and heterozygosity that are key determinants of assembly difficulty. Availability: The software described is freely available online (https://github.com/jts/sga) and open source under the GNU Public License. Contact: jared.simpson@oicr.on.ca Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:24443382

2014-01-01
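
Illustrative note on the record above: the abstract describes reference-free quality assessment of a read collection. One widely used building block for this kind of analysis is the k-mer count spectrum. The sketch below is an assumption-laden illustration of that general idea and is not the SGA implementation; the function names count_kmers and kmer_spectrum are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer across a collection of reads.

    k-mers containing the ambiguous base N are skipped (an assumption
    of this sketch).
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


if __name__ == "__main__":
    # Toy reads for illustration only.
    reads = ["ACGTAC", "CGTACG"]
    spectrum = kmer_spectrum(count_kmers(reads, k=3))
    for multiplicity, n_kmers in sorted(spectrum.items()):
        print(multiplicity, n_kmers)
```

In a real data set, k-mers seen only once or twice are often sequencing errors, while the position of the main peak in the spectrum reflects sequencing depth. Real tools use much more memory-efficient data structures than a Python Counter.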

279

Diagnostic cancer genome sequencing and the contribution of germline variants.  

PubMed

Whole-genome sequencing (WGS) is revolutionizing medical research and has the potential to serve as a powerful and cost-effective diagnostic tool in the management of cancer. We review the progress to date in the use of WGS to reveal how germline variants and mutations may be associated with cancer. We use colorectal cancer as an example of how the current level of knowledge can be translated into predictions of predisposition. We also address challenges in the clinical implementation of the variants in germline DNA identified through cancer genome sequencing. We call for the international development of standards to facilitate the clinical use of germline information arising from diagnostic cancer genome sequencing. PMID:23539595

Kilpivaara, O; Aaltonen, L A

2013-03-29

280

Molecular poltergeists: mitochondrial DNA copies (numts) in sequenced nuclear genomes.  

PubMed

The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary time. In the laboratory and in nature, numts enter the nuclear DNA via non-homologous end joining (NHEJ) at double-strand breaks (DSBs). The frequency of numt insertions among 85 sequenced eukaryotic genomes reveals that numt content is strongly correlated with genome size, suggesting that the numt insertion rate might be limited by DSB frequency. Polymorphic numts in humans link maternally inherited mitochondrial genotypes to nuclear DNA haplotypes during the past, offering new opportunities to associate nuclear markers with mitochondrial markers back in time. PMID:20168995

Hazkani-Covo, Einat; Zeller, Raymond M; Martin, William

2010-02-01

281

Nanopore Sequencing of the phi X 174 genome  

E-print Network

Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost, and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length that can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. All methods and data are made fully available.

Laszlo, Andrew H; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

2014-01-01
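
Illustrative note on the record above: the abstract states that about four nucleotides shape each ion current level, giving 256 four-nucleotide combinations (quadromers), and that a quadromer map predicts the levels of unmeasured sequences. The sketch below shows only the lookup structure this implies. The current values are placeholders rather than measured data, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def all_quadromers():
    """Enumerate all 4**4 = 256 four-nucleotide combinations."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def quadromer_windows(seq):
    """Slide a 4-base window along the sequence, one base per step."""
    seq = seq.upper()
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


def predict_levels(seq, level_map):
    """Look up one predicted current level per quadromer window."""
    return [level_map[q] for q in quadromer_windows(seq)]


# Placeholder map: the index of each quadromer stands in for a measured
# current. A real map would hold experimentally determined values.
PLACEHOLDER_MAP = {q: float(i) for i, q in enumerate(all_quadromers())}

if __name__ == "__main__":
    print(len(all_quadromers()))
    print(predict_levels("ACGTACGT", PLACEHOLDER_MAP))
```

A sequence of n bases yields n - 3 windows, so the predicted trace is slightly shorter than the sequence itself.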

282

The genome sequence of the colonial chordate, Botryllus schlosseri  

PubMed Central

Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly is comprised of nearly 14,000 intron-containing predicted genes, and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001 PMID:23840927

Voskoboynik, Ayelet; Neff, Norma F; Sahoo, Debashis; Newman, Aaron M; Pushkarev, Dmitry; Koh, Winston; Passarelli, Benedetto; Fan, H Christina; Mantalas, Gary L; Palmeri, Karla J; Ishizuka, Katherine J; Gissi, Carmela; Griggio, Francesca; Ben-Shlomo, Rachel; Corey, Daniel M; Penland, Lolita; White, Richard A; Weissman, Irving L; Quake, Stephen R

2013-01-01

283

Genomics:GTL Bioenergy Research Centers White Paper  

SciTech Connect

In his Advanced Energy Initiative announced in January 2006, President George W. Bush committed the nation to new efforts to develop alternative sources of energy to replace imported oil and fossil fuels. Developing cost-effective and energy-efficient methods of producing renewable alternative fuels such as cellulosic ethanol from biomass and solar-derived biofuels will require transformational breakthroughs in science and technology. Incremental improvements in current bioenergy production methods will not suffice. The Genomics:GTL Bioenergy Research Centers will be dedicated to fundamental research on microbe and plant systems with the goal of developing knowledge that will advance biotechnology-based strategies for biofuels production. The aim is to spur substantial progress toward cost-effective production of biologically based renewable energy sources. This document describes the rationale for the establishment of the centers and their objectives in light of the U.S. Department of Energy's mission and goals. The focus on microbes (for cellular mechanisms) and plants (for source biomass) fundamentally exploits capabilities well known to exist in the microbial world. Thus 'proof of concept' is not required, but considerable basic research into these capabilities remains an urgent priority. Several developments have converged in recent years to suggest that systems biology research into microbes and plants promises solutions that will overcome critical roadblocks on the path to cost-effective, large-scale production of cellulosic ethanol and other renewable energy from biomass.
The ability to rapidly sequence the DNA of any organism is a critical part of these new capabilities, but it is only a first step. Other advances include the growing number of high-throughput techniques for protein production and characterization; a range of new instrumentation for observing proteins and other cell constituents; the rapid growth of commercially available reagents for protein production; a new generation of high-intensity light sources that provide precision imaging on the nanoscale and allow observation of molecular interactions in ultrafast time intervals; major advances in computational capability; and the continually increasing numbers of these instruments and technologies within the national laboratory infrastructure, at universities, and in private industry. All these developments expand our ability to elucidate mechanisms present in living cells, but much more remains to be done. The Centers are designed to accomplish GTL program objectives more rapidly, more effectively, and at reduced cost by concentrating appropriate technologies and scientific expertise, from genome sequence to an integrated systems understanding of the pathways and internal structures of microbes and plants most relevant to developing bioenergy compounds. The Centers will seek to understand the principles underlying the structural and functional design of selected microbial, plant, and molecular systems. This will be accomplished by building technological pathways linking the genome-determined components in an organism with bioenergy-relevant cellular systems that can be characterized sufficiently to generate realistic options for biofuel development. 
In addition, especially in addressing what are believed to be nearer-term approaches to renewable energy (e.g., producing cellulosic ethanol cost-effectively and energy-efficiently), the Center research team must understand in depth the current industrial-level roadblocks and bottlenecks (see section, GTL's Vision for Biological Energy Alternatives, below). For the Centers, and indeed the entire BER effort, to be successful, Center research must be integrated with individual investigator research, and coordination of activities,

Mansfield, Betty Kay [ORNL; Alton, Anita Jean [ORNL; Andrews, Shirley H [ORNL; Bownas, Jennifer Lynn [ORNL; Casey, Denise [ORNL; Martin, Sheryl A [ORNL; Mills, Marissa [ORNL; Nylander, Kim [ORNL; Wyrick, Judy M [ORNL; Drell, Dr. Daniel [Office of Science, Department of Energy; Weatherwax, Sharlene [U.S. Department of Energy; Carruthers, Julie [U.S. Department of Energy

2006-08-01

284

Complete genome sequence of Haliscomenobacter hydrossis type strain (OT)  

SciTech Connect

Haliscomenobacter hydrossis van Veen et al. 1973 is the type species of the genus Haliscomenobacter, which belongs to order 'Sphingobacteriales'. The species is of interest because of its isolated phylogenetic location in the tree of life, especially the so far genomically uncharted part of it, and because the organism grows in a thin, hardly visible hyaline sheath. Members of the species were isolated from fresh water of lakes and from ditch water. The genome of H. hydrossis is the first completed genome sequence reported from a member of the family 'Saprospiraceae'. The 8,771,651 bp long genome with its three plasmids of 92 kbp, 144 kbp and 164 kbp length contains 6,848 protein-coding and 60 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Daligault, Hajnalka E. [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Zeytun, Ahmet [Los Alamos National Laboratory (LANL); Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Verbarg, Susanne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

2011-01-01

285

Complete genome sequence of Pyrolobus fumarii type strain (1AT)  

SciTech Connect

Pyrolobus fumarii Blöchl et al. 1997 is the type species of the genus Pyrolobus, which belongs to the crenarchaeal family Pyrodictiaceae. The species is a facultatively microaerophilic non-motile crenarchaeon. It is of interest because of its isolated phylogenetic location in the tree of life and because it is a hyperthermophilic chemolithoautotroph known as the primary producer of organic matter at deep-sea hydrothermal vents. P. fumarii exhibits currently the highest optimal growth temperature of all life forms on earth (106 °C). This is the first completed genome sequence of a member of the genus Pyrolobus to be published and only the second genome sequence from a member of the family Pyrodictiaceae. Although Diversa Corporation announced the completion of sequencing of the P. fumarii genome on September 25, 2001, this sequence was never released to the public. The 1,843,267 bp long genome with its 1,986 protein-coding and 52 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Huber, Harald [Universitat Regensburg, Regensburg, Germany; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute

2011-01-01

286

Genome sequence of a polydnavirus: insights into symbiotic virus evolution.  

PubMed

Little is known of the fate of viruses involved in long-term obligatory associations with eukaryotes. For example, many species of parasitoid wasps have symbiotic viruses to manipulate host defenses and to allow development of parasitoid larvae. The complete nucleotide sequence of the DNA enclosed in the virus particles injected by a parasitoid wasp revealed a complex organization, resembling a eukaryote genomic region more than a viral genome. Although endocellular symbiont genomes have undergone a dramatic loss of genes, the evolution of symbiotic viruses appears to be characterized by extensive duplication of virulence genes coding for truncated versions of cellular proteins. PMID:15472078

Espagne, Eric; Dupuy, Catherine; Huguet, Elisabeth; Cattolico, Laurence; Provost, Bertille; Martins, Nathalie; Poirié, Marylène; Periquet, Georges; Drezen, Jean Michel

2004-10-01

287

Draft genome sequence of Arthrospira platensis C1 (PCC9438)  

PubMed Central

Arthrospira platensis is a cyanobacterium that is extensively cultivated outdoors on a large commercial scale for consumption as a food for humans and animals. It can be grown in monoculture under highly alkaline conditions, making it attractive for industrial production. Here we describe the complete genome sequence of A. platensis C1 strain and its annotation. The A. platensis C1 genome contains 6,089,210 bp including 6,108 protein-coding genes and 45 RNA genes, and no plasmids. The genome information has been used for further comparative analysis, particularly of metabolic pathways, photosynthetic efficiency and barriers to gene transfer. PMID:22675597

Cheevadhanarak, Supapon; Paithoonrangsarid, Kalyanee; Prommeenate, Peerada; Kaewngam, Warunee; Musigkain, Apiluck; Tragoonrung, Somvong; Tabata, Satoshi; Kaneko, Takakazu; Chaijaruwanich, Jeerayut; Sangsrakru, Duangjai; Tangphatsornruang, Sithichoke; Chanprasert, Juntima; Tongsima, Sissades; Kusonmano, Kanthida; Jeamton, Wattana; Dulsawat, Sudarat; Klanchui, Amornpan; Vorapreeda, Tayvich; Chumchua, Vasunun; Khannapho, Chiraphan; Thammarongtham, Chinae; Plengvidhya, Vethachai; Subudhi, Sanjukta; Hongsthong, Apiradee; Ruengjitchatchawalya, Marasri; Meechai, Asawin; Senachak, Jittisak; Tanticharoen, Morakot

2012-01-01

288

Draft Genome Sequence of Daldinia eschscholzii Isolated from Blood Culture  

PubMed Central

Daldinia eschscholzii is an invasive endophyte that is most commonly found in plant tissues rich in secondary metabolites. We report the draft genome sequence of D. eschscholzii isolated from blood culture. The draft genome is 35,494,957 bp in length, with 42,898,665 reads, 61,449 contigs, and a G+C content of 46.8%. The genome was found to contain a high abundance of genes associated with plant cell wall degradation enzymes, mycotoxin production, and antifungal drug resistance. PMID:22544898

Ngeow, Yun Fong; Yew, Su Mei; Hassan, Hamimah; Soo-Hoo, Tuck Soon; Na, Shiang Ling; Chan, Chai Ling; Hoh, Chee-Choong; Lee, Kok-Wei; Yee, Wai-Yan

2012-01-01

289

The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.  

PubMed

Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as for determining its phylogenetic relatedness to other parrots/Psittaciformes. The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein and tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature unique to only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

Guan, Xiaojing; Xu, Jun; Smith, Edward J

2014-03-24

290

Closed Genome Sequence of Noninvasive Streptococcus pyogenes M/emm3 Strain STAB902  

E-print Network

Closed Genome Sequence of Noninvasive Streptococcus pyogenes M/emm3 Strain STAB902 Nicolas Soriano Rennes 1, Rennes, France We report a closed genome sequence of group A Streptococcus genotype emm3 (GAS. Closed genome sequence of noninvasive Streptococcus pyogenes M/emm3 strain STAB902. Genome Announc. 2

Paris-Sud XI, Université de

291

The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus).  

PubMed

We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the thylacine's basal position in Dasyuromorphia, aided by mitochondrial genome sequence that we generated from the extant numbat (Myrmecobius fasciatus). Surprisingly, both of our thylacine sequences differ by 11%-15% from putative thylacine mitochondrial genes in GenBank, with one of our samples originating from a direct offspring of the previously sequenced individual. Our data sample each mitochondrial nucleotide an average of 50 times, thereby providing the first high-fidelity reference sequence for thylacine population genetics. Our two sequences differ in only five nucleotides out of 15,452, hinting at a very low genetic diversity shortly before extinction. Despite the samples' heavy contamination with bacterial and human DNA and their temperate storage history, we estimate that as much as one-third of the total DNA in each sample is from the thylacine. The microbial content of the two thylacine samples was subjected to metagenomic analysis, and showed striking differences between a wild-captured individual and a born-in-captivity one. This study therefore adds to the growing evidence that extensive sequencing of museum collections is both feasible and desirable, and can yield complete genomes. PMID:19139089

Miller, Webb; Drautz, Daniela I; Janecka, Jan E; Lesk, Arthur M; Ratan, Aakrosh; Tomsho, Lynn P; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R; Qi, Ji; Zhao, Fangqing; Gilbert, M Thomas P; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G P; Huson, Daniel H; Helgen, Kristofer M; Murphy, William J; Götherström, Anders; Schuster, Stephan C

2009-02-01

292

Simple sequences are ubiquitous repetitive components of eukaryotic genomes.  

PubMed Central

Simple sequences are stretches of DNA which consist of only one, or a few, tandemly repeated nucleotides, for example poly(dA)·poly(dT) or poly(dG-dT)·poly(dC-dA). These two types of simple sequence have been shown to be repetitive and interspersed in many eukaryotic genomes. Several other types have been found by sequencing eukaryotic DNA. In this report we have undertaken a systematic survey for simple sequences. We hybridized synthetic simple sequence DNA to genome blots of phylogenetically different organisms. We found that many, probably even all possible types of simple sequence are repetitive components of eukaryotic genomes. We propose therefore that they arise by common mechanisms, namely slippage replication and unequal crossover, and that they might have no general function with regard to gene expression. This latter inference is supported by the fact that we have detected simple sequences only in the metabolically inactive micronucleus of the protozoan Stylonychia, but not in the metabolically active macronucleus which is derived from the micronucleus by chromosome diminution. PMID:6328411

Tautz, D; Renz, M

1984-01-01
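
Note on record 292: the survey above was done experimentally, by hybridizing synthetic probes to genome blots. As a purely illustrative in-silico analogue (not the authors' method), the definition of a simple sequence as one or a few nucleotides repeated in tandem can be sketched as a regular-expression scan. The function name, the two-nucleotide motif limit and the six-copy threshold are assumptions chosen for this example.

```python
import re


def find_simple_sequences(dna, max_unit=2, min_repeats=6):
    """Return (start, unit, copies) for each tandem run of a short motif.

    max_unit and min_repeats are illustrative thresholds, not values
    taken from the record above.
    """
    dna = dna.upper()
    # Group 1 is the shortest motif (1..max_unit nt) that is then repeated
    # at least (min_repeats - 1) more times immediately after itself.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(dna):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    # Toy sequence with a poly(dA) run and a poly(dG-dT) run.
    toy = "TTC" + "A" * 10 + "CC" + "GT" * 8 + "CA"
    for start, unit, copies in find_simple_sequences(toy):
        print(start, unit, copies)
```

Only one strand is scanned here; a run of poly(dC-dA) on the opposite strand would appear as poly(dG-dT) on the scanned strand.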

293

The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)  

PubMed Central

We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the thylacine's basal position in Dasyuromorphia, aided by mitochondrial genome sequence that we generated from the extant numbat (Myrmecobius fasciatus). Surprisingly, both of our thylacine sequences differ by 11%–15% from putative thylacine mitochondrial genes in GenBank, with one of our samples originating from a direct offspring of the previously sequenced individual. Our data sample each mitochondrial nucleotide an average of 50 times, thereby providing the first high-fidelity reference sequence for thylacine population genetics. Our two sequences differ in only five nucleotides out of 15,452, hinting at a very low genetic diversity shortly before extinction. Despite the samples’ heavy contamination with bacterial and human DNA and their temperate storage history, we estimate that as much as one-third of the total DNA in each sample is from the thylacine. The microbial content of the two thylacine samples was subjected to metagenomic analysis, and showed striking differences between a wild-captured individual and a born-in-captivity one. This study therefore adds to the growing evidence that extensive sequencing of museum collections is both feasible and desirable, and can yield complete genomes. PMID:19139089

Miller, Webb; Drautz, Daniela I.; Janecka, Jan E.; Lesk, Arthur M.; Ratan, Aakrosh; Tomsho, Lynn P.; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R.; Qi, Ji; Zhao, Fangqing; Gilbert, M. Thomas P.; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G.P.; Huson, Daniel H.; Helgen, Kristofer M.; Murphy, William J.; Götherström, Anders; Schuster, Stephan C.

2009-01-01

294

Establishing a framework for comparative analysis of genome sequences  

SciTech Connect

This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

Bansal, A.K.

1995-06-01

295

Complete Sequence and Genomic Analysis of Rhesus Cytomegalovirus  

PubMed Central

The complete DNA sequence of rhesus cytomegalovirus (RhCMV) strain 68-1 was determined with the whole-genome shotgun approach on virion DNA. The RhCMV genome is 221,459 bp in length and possesses a 49% G+C base composition. The genome contains 230 potential open reading frames (ORFs) of 100 or more codons that are arranged colinearly with counterparts of previously sequenced betaherpesviruses such as human cytomegalovirus (HCMV). Of the 230 RhCMV ORFs, 138 (60%) are homologous to known HCMV proteins. The conserved ORFs include the structural, replicative, and transcriptional regulatory proteins, immune evasion elements, G protein-coupled receptors, and immunoglobulin homologues. Interestingly, the RhCMV genome also contains sequences with homology to cyclooxygenase-2, an enzyme associated with inflammatory processes. Closer examination identified a series of candidate exons with the capacity to encode a full-length cyclooxygenase-2 protein. Counterparts of cyclooxygenase-2 have not been found in other sequenced herpesviruses. The availability of the complete RhCMV sequence along with the ability to grow RhCMV in vitro will facilitate the construction of recombinant viral strains for identifying viral determinants of CMV pathogenicity in the experimentally infected rhesus macaque and to the development of CMV as a vaccine vector. PMID:12767982

Hansen, Scott G.; Strelow, Lisa I.; Franchi, David C.; Anders, David G.; Wong, Scott W.

2003-01-01

296

Sequence modelling and an extensible data model for genomic database  

SciTech Connect

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

Li, Peter Wei-Der (California Univ., San Francisco, CA (United States) Lawrence Berkeley Lab., CA (United States))

1992-01-01

297

Sequence modelling and an extensible data model for genomic database  

SciTech Connect

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.

Li, Peter Wei-Der [California Univ., San Francisco, CA (United States); [Lawrence Berkeley Lab., CA (United States)

1992-01-01

298

Comparative analyses of multi-species sequences from targeted genomic regions  

Microsoft Academic Search

The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple,

J. W. Thomas; J. W. Touchman; R. W. Blakesley; G. G. Bouffard; S. M. Beckstrom-Sternberg; E. H. Margulies; M. Blanchette; A. C. Siepel; P. J. Thomas; J. C. McDowell; B. Maskeri; N. F. Hansen; M. S. Schwartz; R. J. Weber; W. J. Kent; D. Karolchik; T. C. Bruen; R. Bevan; D. J. Cutler; S. Schwartz; L. Elnitski; J. R. Idol; A. B. Prasad; S.-Q. Lee-Lin; V. V. B. Maduro; T. J. Summers; M. E. Portnoy; N. L. Dietrich; N. Akhter; K. Ayele; B. Benjamin; K. Cariaga; C. P. Brinkley; S. Y. Brooks; S. Granite; X. Guan; J. Gupta; P. Haghighi; S.-L. Ho; M. C. Huang; E. Karlins; P. L. Laric; R. Legaspi; M. J. Lim; Q. L. Maduro; C. A. Masiello; S. D. Mastrian; J. C. McCloskey; R. Pearson; S. Stantripop; E. E. Tiongson; J. T. Tran; C. Tsurgeon; J. L. Vogt; M. A. Walker; K. D. Wetherby; L. S. Wiggins; A. C. Young; L.-H. Zhang; K. Osoegawa; B. Zhu; B. Zhao; C. L. Shu; P. J. De Jong; C. E. Lawrence; A. F. Smit; A. Chakravarti; D. Haussler; P. Green; W. Miller; E. D. Green

2003-01-01

299

Final progress report, Construction of a genome-wide highly characterized clone resource for genome sequencing  

SciTech Connect

At TIGR, the human Bacterial Artificial Chromosome (BAC) end sequencing and trimming were performed with an overall sequencing success rate of 65%. CalTech human BAC libraries A, B, C and D as well as Roswell Park Cancer Institute's library RPCI-11 were used. To date, we have generated >300,000 end sequences from >186,000 human BAC clones with an average read length of ~460 bp for a total of 141 Mb covering ~4.7% of the genome. Over sixty percent of the clones have BAC end sequences (BESs) from both ends, representing over five-fold coverage of the genome by the paired-end clones. The average phred Q20 length is ~400 bp. This high accuracy makes our BESs match the human finished sequences with an average identity of 99% and a match length of 450 bp, and a frequency of one match per 12.8 kb contig sequence. Our sample tracking has ensured a clone tracking accuracy of >90%, which gives researchers a high confidence in (1) retrieving the right clone from the BAC libraries based on the sequence matches; and (2) building a minimum tiling path of sequence-ready clones across the genome and genome assembly scaffolds.

Nierman, William C.

2000-02-14
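
Note on record 299: the figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check, not part of the report. The 3,000 Mb genome size is an assumption (it is the value implied by 141 Mb being about 4.7% of the genome), and the function name is hypothetical.

```python
def percent_of_genome(total_bases, genome_size):
    """Sequenced bases expressed as a percentage of the genome size."""
    return 100.0 * total_bases / genome_size


# Figures quoted in the abstract (lower bounds and approximate values).
END_SEQUENCES = 300_000       # ">300,000 end sequences"
MEAN_READ_LENGTH = 460        # "~460 bp" average read length
TOTAL_BASES = 141_000_000     # "a total of 141 Mb"

# Assumption: a haploid human genome of about 3,000 Mb.
GENOME_SIZE = 3_000_000_000

if __name__ == "__main__":
    # Lower-bound estimate from read count times mean length; it falls just
    # under the reported total because the read count is given as ">300,000".
    print(END_SEQUENCES * MEAN_READ_LENGTH)
    # Fraction of the genome represented by the reported 141 Mb.
    print(round(percent_of_genome(TOTAL_BASES, GENOME_SIZE), 2))
```

The five-fold paired-end clone coverage quoted in the abstract cannot be reproduced this way because the abstract does not state the clone insert sizes.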

300

Rapid bacterial genome sequencing: methods and applications in clinical microbiology.  

PubMed

The recent advances in sequencing technologies have given all microbiology laboratories access to whole genome sequencing. Providing that tools for the automated analysis of sequence data and databases for associated meta-data are developed, whole genome sequencing will become a routine tool for large clinical microbiology laboratories. Indeed, the continuing reduction in sequencing costs and the shortening of the 'time to result' makes it an attractive strategy in both research and diagnostics. Here, we review how high-throughput sequencing is revolutionizing clinical microbiology and the promise that it still holds. We discuss major applications, which include: (i) identification of target DNA sequences and antigens to rapidly develop diagnostic tools; (ii) precise strain identification for epidemiological typing and pathogen monitoring during outbreaks; and (iii) investigation of strain properties, such as the presence of antibiotic resistance or virulence factors. In addition, recent developments in comparative metagenomics and single-cell sequencing offer the prospect of a better understanding of complex microbial communities at the global and individual levels, providing a new perspective for understanding host-pathogen interactions. Being a high-resolution tool, high-throughput sequencing will increasingly influence diagnostics, epidemiology, risk management, and patient care. PMID:23601179

Bertelli, C; Greub, G

2013-09-01

301

Easy quantitative assessment of genome editing by sequence trace decomposition.  

PubMed

The efficacy and the mutation spectrum of genome editing methods can vary substantially depending on the targeted sequence. A simple, quick assay to accurately characterize and quantify the induced mutations is therefore needed. Here we present TIDE, a method for this purpose that requires only a pair of PCR reactions and two standard capillary sequencing runs. The sequence traces are then analyzed by a specially developed decomposition algorithm that identifies the major induced mutations in the projected editing site and accurately determines their frequency in a cell population. This method is cost-effective and quick, and it provides much more detailed information than current enzyme-based assays. An interactive web tool for automated decomposition of the sequence traces is available. TIDE greatly facilitates the testing and rational design of genome editing strategies. PMID:25300484

Brinkman, Eva K; Chen, Tao; Amendola, Mario; van Steensel, Bas

2014-12-16

302

Complete genome sequence of Allochromatium vinosum DSM 180T  

SciTech Connect

Allochromatium vinosum, formerly Chromatium vinosum, is a mesophilic purple sulfur bacterium belonging to the family Chromatiaceae in the bacterial class Gammaproteobacteria. The genus Allochromatium contains currently five species. All members were isolated from freshwater, brackish water or marine habitats and are predominately obligate phototrophs. Here we describe the features of the organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the Chromatiaceae within the purple sulfur bacteria thriving in globally occurring habitats. The 3,669,074 bp genome with its 3,302 protein-coding and 64 RNA genes was sequenced within the Joint Genome Institute Community Sequencing Program.

Weissgerber, Thomas [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Zigann, Renate [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Bruce, David [Los Alamos National Laboratory (LANL); Chang, Yun-Juan [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Land, Miriam L [ORNL; Munk, Christine [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Dahl, Christiane [Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany

2011-01-01

303

Easy quantitative assessment of genome editing by sequence trace decomposition  

PubMed Central

The efficacy and the mutation spectrum of genome editing methods can vary substantially depending on the targeted sequence. A simple, quick assay to accurately characterize and quantify the induced mutations is therefore needed. Here we present TIDE, a method for this purpose that requires only a pair of PCR reactions and two standard capillary sequencing runs. The sequence traces are then analyzed by a specially developed decomposition algorithm that identifies the major induced mutations in the projected editing site and accurately determines their frequency in a cell population. This method is cost-effective and quick, and it provides much more detailed information than current enzyme-based assays. An interactive web tool for automated decomposition of the sequence traces is available. TIDE greatly facilitates the testing and rational design of genome editing strategies. PMID:25300484

Brinkman, Eva K.; Chen, Tao; Amendola, Mario; van Steensel, Bas

2014-01-01

304

Sequencing and Analysis of Neanderthal Genomic DNA  

Microsoft Academic Search

Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence

Noonan James P; Coop Graham; Kudaravalli Sridhar; Smith Doug; Krause Johannes; Alessi Joe; Chen Feng; Platt Darren; Pääbo Svante; Pritchard Jonathan K; Edward M. Rubin

2006-01-01

305

Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species  

PubMed Central

Background: Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings: We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins was found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid, which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii but none of them were known specific virulence-related genes. Conclusions/Significance: Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of genomic content. Differences in gene content likely contribute to differences in the clinical and environmental distribution of species and sequence types. PMID:23166675

Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

2012-01-01

306

A complete sequence of the T. tengcongensis genome.  

PubMed

Thermoanaerobacter tengcongensis is a rod-shaped, gram-negative, anaerobic eubacterium that was isolated from a freshwater hot spring in Tengchong, China. Using a whole-genome-shotgun method, we sequenced its 2,689,445-bp genome from an isolate, MB4(T) (GenBank accession no. AE008691). The genome encodes 2588 predicted coding sequences (CDS). Among them, 1764 (68.2%) are classified according to homology to other documented proteins, and the rest, 824 CDS (31.8%), are functionally unknown. One of the interesting features of the T. tengcongensis genome is that 86.7% of its genes are encoded on the leading strand of DNA replication. Based on protein sequence similarity, the T. tengcongensis genome is most similar to that of Bacillus halodurans, a mesophilic eubacterium, among all prokaryotic genomes fully sequenced to date. Computational analysis of genes involved in basic metabolic pathways supports the experimental discovery that T. tengcongensis metabolizes sugars as principal energy and carbon source and utilizes thiosulfate and elemental sulfur, but not sulfate, as electron acceptors. T. tengcongensis, as a gram-negative rod by empirical definitions (such as staining), shares many genes that are characteristic of gram-positive bacteria whereas it is missing molecular components unique to gram-negative bacteria. A strong correlation between the G + C content of tDNA and rDNA genes and the optimal growth temperature is found among the sequenced thermophiles. It is concluded that thermophiles are a biologically and phylogenetically divergent group of prokaryotes that have converged to sustain extreme environmental conditions over evolutionary timescale. PMID:11997336

Bao, Qiyu; Tian, Yuqing; Li, Wei; Xu, Zuyuan; Xuan, Zhenyu; Hu, Songnian; Dong, Wei; Yang, Jian; Chen, Yanjiong; Xue, Yanfen; Xu, Yi; Lai, Xiaoqin; Huang, Li; Dong, Xiuzhu; Ma, Yanhe; Ling, Lunjiang; Tan, Huarong; Chen, Runsheng; Wang, Jian; Yu, Jun; Yang, Huanming

2002-05-01

307

Draft genome sequence of Corynebacterium diphtheriae biovar intermedius NCTC 5011.  

PubMed

We report an annotated draft genome of the human pathogen Corynebacterium diphtheriae bv. intermedius NCTC 5011. This strain is the first C. diphtheriae bv. intermedius strain to be sequenced, and our results provide a useful comparison to the other primary disease-causing biovars, C. diphtheriae bv. gravis and C. diphtheriae bv. mitis. The sequence has been deposited at DDBJ/EMBL/GenBank with the accession number AJVH01000000. PMID:22887653

Sangal, Vartul; Tucker, Nicholas P; Burkovski, Andreas; Hoskisson, Paul A

2012-09-01

308

Sequence and organization of the human mitochondrial genome  

Microsoft Academic Search

The complete sequence of the 16,569-base pair human mitochondrial genome is presented. The genes for the 12S and 16S rRNAs, 22 tRNAs, cytochrome c oxidase subunits I, II and III, ATPase subunit 6, cytochrome b and eight other predicted protein coding genes have been located. The sequence shows extreme economy in that the genes have none or only a few

S. Anderson; A. T. Bankier; B. G. Barrell; M. H. L. de Bruijn; A. R. Coulson; J. Drouin; I. C. Eperon; D. P. Nierlich; B. A. Roe; F. Sanger; P. H. Schreier; A. J. H. Smith; R. Staden; I. G. Young

1981-01-01

309

The complete genome sequence of the gastric pathogen Helicobacter pylori  

Microsoft Academic Search

Helicobacter pylori, strain 26695, has a circular genome of 1,667,867 base pairs and 1,590 predicted coding sequences. Sequence analysis indicates that H. pylori has well-developed systems for motility, for scavenging iron, and for DNA restriction and modification. Many putative adhesins, lipoproteins and other outer membrane proteins were identified, underscoring the potential complexity of host-pathogen interaction. Based on the large number

Jean-F. Tomb; Owen White; Anthony R. Kerlavage; Rebecca A. Clayton; Granger G. Sutton; Robert D. Fleischmann; Karen A. Ketchum; Hans Peter Klenk; Steven Gill; Brian A. Dougherty; Karen Nelson; John Quackenbush; Lixin Zhou; Ewen F. Kirkness; Scott Peterson; Brendan Loftus; Delwood Richardson; Robert Dodson; Hanif G. Khalak; Anna Glodek; Keith McKenney; Lisa M. Fitzegerald; Norman Lee; Mark D. Adams; Erin K. Hickey; Douglas E. Berg; Jeanine D. Gocayne; Teresa R. Utterback; Jeremy D. Peterson; Jenny M. Kelley; Matthew D. Cotton; Janice M. Weidman; Claire Fujii; Cheryl Bowman; Larry Watthey; Erik Wallin; William S. Hayes; Mark Borodovsky; Peter D. Karp; Hamilton O. Smith; Claire M. Fraser; J. Craig Venter

1997-01-01

310

Complete Genome Sequence of Streptococcus agalactiae CNCTC 10/84, a Hypervirulent Sequence Type 26 Strain  

PubMed Central

Streptococcus agalactiae (group B Streptococcus [GBS]) is a human pathogen with a propensity to cause neonatal infections. We report the complete genome sequence of GBS strain CNCTC 10/84, a hypervirulent clinical isolate frequently used to study GBS pathogenesis. Comparative analysis of this sequence may shed light on novel pathogenic mechanisms. PMID:25540350

Hooven, Thomas A.; Randis, Tara M.; Daugherty, Sean C.; Narechania, Apurva; Planet, Paul J.; Tettelin, Hervé

2014-01-01

311

Tracking a Hospital Outbreak of Carbapenem-Resistant Klebsiella pneumoniae with Whole-Genome Sequencing  

PubMed Central

The Gram-negative bacterium Klebsiella pneumoniae is a major cause of nosocomial infections, primarily among immunocompromised patients. The emergence of strains resistant to carbapenems has left few treatment options, making infection containment critical. In 2011, the U.S. National Institutes of Health Clinical Center experienced an outbreak of carbapenem-resistant K. pneumoniae that affected 18 patients, 11 of whom died. Whole-genome sequencing was performed on K. pneumoniae isolates to gain insight into why the outbreak progressed despite early implementation of infection control procedures. Integrated genomic and epidemiological analysis traced the outbreak to three independent transmissions from a single patient who was discharged 3 weeks before the next case became clinically apparent. Additional genomic comparisons provided evidence for unexpected transmission routes, with subsequent mining of epidemiological data pointing to possible explanations for these transmissions. Our analysis demonstrates that integration of genomic and epidemiological data can yield actionable insights and facilitate the control of nosocomial transmission. PMID:22914622

Snitkin, Evan S.; Zelazny, Adrian M.; Thomas, Pamela J.; Stock, Frida; Henderson, David K.; Palmore, Tara N.; Segre, Julia A.

2012-01-01

312

Brucella microti: the genome sequence of an emerging pathogen  

PubMed Central

Background: Using a combination of pyrosequencing and conventional Sanger sequencing, the complete genome sequence of the recently described novel Brucella species, Brucella microti, was determined. B. microti is a member of the genus Brucella within the Alphaproteobacteria, which consists of medically important highly pathogenic facultative intracellular bacteria. In contrast to all other Brucella species, B. microti is a fast growing and biochemically very active microorganism with a phenotype more similar to that of Ochrobactrum, a facultative human pathogen. The atypical phenotype of B. microti prompted us to look for genomic differences compared to other Brucella species and to look for similarities with Ochrobactrum. Results: The genome is composed of two circular chromosomes of 2,117,050 and 1,220,319 base pairs. Unexpectedly, we found that the genome sequence of B. microti is almost identical to that of Brucella suis 1330 with an overall sequence identity of 99.84% in aligned regions. The most significant structural difference between the two genomes is a bacteriophage-related 11,742 base pairs insert only present in B. microti. However, this insert is unlikely to have any phenotypical consequence. Only four protein coding genes are shared between B. microti and Ochrobactrum anthropi but impaired in other sequenced Brucella. The most noticeable difference between B. microti and other Brucella species was found in the sequence of the 23S ribosomal RNA gene. This unusual variation could have pleiotropic effects and explain the fast growth of B. microti. Conclusion: Contrary to expectations from the phenotypic analysis, the genome sequence of B. microti is highly similar to that of known Brucella species, and is remotely related to the one of O. anthropi. How the few differences in gene content between B. microti and B. suis 1330 could result in vastly different phenotypes remains to be elucidated. This unexpected finding will complicate the task of identifying virulence determinants in the Brucella genus. The genome sequence of B. microti will serve as a model for differential expression analysis and complementation studies. Our results also raise some concerns about the importance given to phenotypical traits in the definition of bacterial species. PMID:19653890

Audic, Stéphane; Lescot, Magali; Claverie, Jean-Michel; Scholz, Holger C

2009-01-01

313

SEQUENCE AND COMPARATIVE ANALYSIS OF THE CHICKEN GENOME PROVIDE UNIQUE PERSPECTIVES ON VERTEBRATE EVOLUTION.  

Technology Transfer Automated Retrieval System (TEKTRAN)

We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence ...

314

A highly annotated whole-genome sequence of a Korean individual  

Microsoft Academic Search

Recent advances in sequencing technologies have initiated an era of personal genome sequences. To date, human genome sequences have been reported for individuals with ancestry in three distinct geographical regions: a Yoruba African, two individuals of northwest European origin, and a person from China. Here we provide a highly annotated, whole-genome sequence for a Korean individual, known as AK1. The

Jong-Il Kim; Young Seok Ju; Sheehyun Kim; Seonwook Lee; Jae-Hyuk Yi; Joann Mudge; Neil A. Miller; Dongwan Hong; Callum J. Bell; Hye-Sun Kim; In-Soon Chung; Woo-Chung Lee; Ji-Sun Lee; Seung-Hyun Seo; Ji-Young Yun; Hyun Nyun Woo; Heewook Lee; Dongwhan Suh; Seungbok Lee; Hyun-Jin Kim; Maryam Yavartanoo; Minhye Kwak; Ying Zheng; Mi Kyeong Lee; Hyunjun Park; Jeong Yeon Kim; Omer Gokcumen; Ryan E. Mills; Alexander Wait Zaranek; Joseph Thakuria; Xiaodi Wu; Ryan W. Kim; Jim J. Huntley; Shujun Luo; Gary P. Schroth; Thomas D. Wu; Hyeran Kim; Kap-Seok Yang; Woong-Yang Park; Hyungtae Kim; George M. Church; Charles Lee; Stephen F. Kingsmore; Jeong-Sun Seo

2009-01-01

315

Mitochondrial Genome Sequences Effectively Reveal the Phylogeny of Hylobates Gibbons  

PubMed Central

Background: Uniquely among hominoids, gibbons exist as multiple geographically contiguous taxa exhibiting distinctive behavioral, morphological, and karyotypic characteristics. However, our understanding of the evolutionary relationships of the various gibbons, especially among Hylobates species, is still limited because previous studies used limited taxon sampling or short mitochondrial DNA (mtDNA) sequences. Here we use mtDNA genome sequences to reconstruct gibbon phylogenetic relationships and reveal the pattern and timing of divergence events in gibbon evolutionary history. Methodology/Principal Findings: We sequenced the mitochondrial genomes of 51 individuals representing 11 species belonging to three genera (Hylobates, Nomascus and Symphalangus) using the high-throughput 454 sequencing system with the parallel tagged sequencing approach. Three phylogenetic analyses (maximum likelihood, Bayesian analysis and neighbor-joining) depicted the gibbon phylogenetic relationships congruently and with strong support values. Most notably, we recover a well-supported phylogeny of the Hylobates gibbons. The estimation of divergence times using Bayesian analysis with relaxed clock model suggests a much more rapid speciation process in Hylobates than in Nomascus. Conclusions/Significance: Use of more than 15 kb sequences of the mitochondrial genome provided more informative and robust data than previous studies of short mitochondrial segments (e.g., control region or cytochrome b) as shown by the reliable reconstruction of divergence patterns among Hylobates gibbons. Moreover, molecular dating of the mitogenomic divergence times implied that biogeographic change during the last five million years may be a factor promoting the speciation of Sundaland animals, including Hylobates species. PMID:21203450

Chan, Yi-Chiao; Roos, Christian; Inoue-Murayama, Miho; Inoue, Eiji; Shih, Chih-Chin; Pei, Kurtis Jai-Chyi; Vigilant, Linda

2010-01-01

316

Genome Sequence and Phenotypic Characterization of Caulobacter segnis.  

PubMed

Caulobacter segnis is a unique species of Caulobacter that was initially deemed Mycoplana segnis because it was isolated from soil and appeared to share a number of features with other Mycoplana. After a 16S rDNA analysis showed that it was closely related to Caulobacter crescentus, it was reclassified C. segnis. Because the C. segnis genome sequence available in GenBank contained 126 pseudogenes, we compared the original sequencing data to the GenBank sequence and determined that many of the pseudogenes were due to sequence errors in the GenBank sequence. Consequently, we used multiple approaches to correct and reannotate the C. segnis genome sequence. In total, we deleted 247 bp, added 14 bp, and changed 8 bp resulting in 233 fewer bases in our corrected sequence. The corrected sequence contains only 15 pseudogenes compared to 126 in the original annotation. Furthermore, we found that unlike Mycoplana, C. segnis divides by fission, producing swarmer cells that have a single, polar flagellum. PMID:25398322

Patel, Sagar; Fletcher, Brock; Scott, Derrick C; Ely, Bert

2015-03-01

317

Complete Genome Sequence of Actinobaculum schaalii Strain CCUG 27420  

PubMed Central

Complete genome sequencing of the emerging uropathogen Actinobaculum schaalii indicates that an important mechanism of its virulence is attachment pili, which allow the organism to adhere to the surface of animal cells, greatly enhancing the ability of this organism to colonize the urinary tract. PMID:25189588

Kristiansen, Rikke; Dueholm, Morten S.; Bank, Steffen; Nielsen, Per Halkjær; Karst, Søren M.; Cattoir, Vincent; Lienhard, Reto; Grisold, Andrea J.; Olsen, Anne Buchhave; Reinhard, Mark; Søby, Karen Marie; Christensen, Jens Jørgen; Prag, Jørgen

2014-01-01

318

Genome Sequence of Klebsiella pneumoniae Respiratory Isolate IA565  

PubMed Central

Klebsiella pneumoniae is a clinically significant opportunistic bacterial pathogen as well as a normal member of the human microbiota. K. pneumoniae strain IA565 was isolated from a tracheal aspirate at the University of Iowa Hospitals and Clinics. Here, we present the genome sequence of K. pneumoniae IA565. PMID:25212620

Johnson, Jeremiah G.; Spurbeck, Rachel R.; Sandhu, Sukhinder K.

2014-01-01

319

Genome Sequence of the Relapsing Fever Borreliosis Species Borrelia hispanica  

PubMed Central

Borrelia hispanica is the etiological pathogen of tick-borne relapsing fever, transmitted to humans by infected Ornithodoros erraticus ticks. Here we present the 1,783,846-bp draft genome sequence, with an average G+C content of 28%. It has 2,140 open reading frames, 3 ribosomal RNAs, and 32 transfer RNAs. PMID:24435869

Elbir, Haitham; Larsson, Pär; Upreti, Mukunda; Normark, Johan

2014-01-01

320

Complete Genome Sequence of a Mosaic Bacteriophage, Waukesha92  

PubMed Central

In this study, we determined the complete genome sequence of a mosaic bacteriophage, Waukesha92, which was isolated from soil using Bacillus thuringiensis as the host organism. This temperate Myoviridae bacteriophage has similarities to phages SpaA1 and BceA1 and the Bacillus thuringiensis plasmid pBMB165. PMID:25146131

Sauder, A. Brooke; Carter, Brandon; Langouet Astrie, Christophe

2014-01-01

321

Complete Genome Sequence of Phototrophic Betaproteobacterium Rubrivivax gelatinosus IL144  

PubMed Central

Rubrivivax gelatinosus is a facultative photoheterotrophic betaproteobacterium living in freshwater ponds, sewage ditches, activated sludge, and food processing wastewater. There have not been many studies on photosynthetic betaproteobacteria. Here we announce the complete genome sequence of the best-studied phototrophic betaproteobacterium, R. gelatinosus IL-144 (NBRC 100245). PMID:22689232

Kamimura, Akiko; Shimizu, Takayuki; Nakamura-Isaki, Sanae; Aono, Eiji; Sakamoto, Koji; Ichikawa, Natsuko; Nakazawa, Hidekazu; Sekine, Mitsuo; Yamazaki, Shuji; Fujita, Nobuyuki; Shimada, Keizo; Hanada, Satoshi; Nagashima, Kenji V. P.

2012-01-01

322

Complete genome sequence of the acetic acid bacterium Gluconobacter oxydans  

Microsoft Academic Search

Gluconobacter oxydans is unsurpassed by other organisms in its ability to incompletely oxidize a great variety of carbohydrates, alcohols and related compounds. Furthermore, the organism is used for several biotechnological processes, such as vitamin C production. To further our understanding of its overall metabolism, we sequenced the complete genome of G. oxydans 621H. The chromosome consists of 2,702,173 base pairs

Christina Prust; Marc Hoffmeister; Heiko Liesegang; Arnim Wiezer; Wolfgang Florian Fricke; Armin Ehrenreich; Gerhard Gottschalk; Uwe Deppenmeier

2005-01-01

323

Draft Genome Sequence of Alicyclobacillus acidoterrestris Strain ATCC 49025  

PubMed Central

Alicyclobacillus acidoterrestris is a spore-forming Gram-positive, thermo-acidophilic, nonpathogenic bacterium which contaminates commercial pasteurized fruit juices. The draft genome sequence for A. acidoterrestris strain ATCC 49025 is reported here, providing genetic data relevant to the successful adaptation and survival of this strain in its ecological niche. PMID:24009113

Pasvolsky, Ronit; Sela, Noa; Green, Stefan J.; Zakin, Varda

2013-01-01

324

Draft Genome Sequence of Alicyclobacillus acidoterrestris Strain ATCC 49025.  

PubMed

Alicyclobacillus acidoterrestris is a spore-forming Gram-positive, thermo-acidophilic, nonpathogenic bacterium which contaminates commercial pasteurized fruit juices. The draft genome sequence for A. acidoterrestris strain ATCC 49025 is reported here, providing genetic data relevant to the successful adaptation and survival of this strain in its ecological niche. PMID:24009113

Shemesh, Moshe; Pasvolsky, Ronit; Sela, Noa; Green, Stefan J; Zakin, Varda

2013-01-01

325

Draft Genome Sequence of Fish Pathogenic Vibrio vulnificus Biotype 2.  

PubMed

Vibrio vulnificus is a marine pathogen capable of causing severe soft tissue infections and septicemia in humans. V. vulnificus biotype 2 is the etiological agent of fish vibriosis. We describe here the first draft genome sequence of V. vulnificus biotype 2, strain ES-7601, isolated from an infected eel in Japan. PMID:25428972

Koton, Yael; Eghbaria, Saleh; Gordon, Michal; Chalifa-Caspi, Vered; Bisharat, Naiel

2014-01-01

326

Draft Genome Sequence of Rhodococcus rhodochrous Strain ATCC 21198  

PubMed Central

Rhodococcus rhodochrous is a Gram-positive red-pigmented bacterium commonly found in the soil. The draft genome sequence for R. rhodochrous strain ATCC 21198 is presented here to provide genetic data for a better understanding of its lipid-accumulating capabilities. PMID:24526639

Shields-Menard, Sara A.; Klingeman, Dawn M.; Indest, Karl; Hancock, Dawn; Wewalwela, Jayani J.; Donaldson, Janet R.

2014-01-01

327

PHYTOPHTHORA GENOME SEQUENCES UNCOVER EVOLUTIONARY ORIGINS AND MECHANISMS OF PATHOGENESIS  

Technology Transfer Automated Retrieval System (TEKTRAN)

Draft genome sequences of the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum have been determined to depths of 9x and 7.7x, respectively. Oomycetes such as these Phytophthora species share the kingdom Stramenopiles with photosynthetic algae such as diatoms...

328

Genome sequence of lineage III Listeria monocytogenes strain HCC23.  

PubMed

More than 98% of reported human listeriosis cases are caused by Listeria monocytogenes serotypes within lineages I and II. Serotypes within lineage III (4a and 4c) are commonly isolated from environmental and food specimens. We report the first complete genome sequence of a lineage III isolate, HCC23, which will be used for comparative analysis. PMID:21602330

Steele, Chelsea L; Donaldson, Janet R; Paul, Debarati; Banes, Michelle M; Arick, Tony; Bridges, Susan M; Lawrence, Mark L

2011-07-01

329

Genome Sequence of the Yeast Cyberlindnera fabianii (Hansenula fabianii)  

PubMed Central

The yeast Cyberlindnera fabianii is used in wastewater treatment, fermentation of alcoholic beverages, and has caused blood infections. To assist in the accurate identification of this species, and to determine the genetic basis for properties involved in fermentation and water treatment, we sequenced and annotated the genome of C. fabianii (YJS4271). PMID:25103752

Freel, Kelle C.; Sarilar, Véronique; Neuvéglise, Cécile; Devillers, Hugo; Friedrich, Anne

2014-01-01

330

Draft Genome Sequence of Buttiauxella agrestis, Isolated from Surface Water  

PubMed Central

MI agar is routinely used for quantifying Escherichia coli in drinking water. A suspect E. coli colony isolated from a water sample was identified as Buttiauxella agrestis. The whole genome sequence of B. agrestis was determined to understand the genetic basis for its phenotypic resemblance to E. coli on MI agar. PMID:25323724

Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

2014-01-01

331

Draft Genome Sequence of Fish Pathogenic Vibrio vulnificus Biotype 2  

PubMed Central

Vibrio vulnificus is a marine pathogen capable of causing severe soft tissue infections and septicemia in humans. V. vulnificus biotype 2 is the etiological agent of fish vibriosis. We describe here the first draft genome sequence of V. vulnificus biotype 2, strain ES-7601, isolated from an infected eel in Japan. PMID:25428972

Koton, Yael; Eghbaria, Saleh; Gordon, Michal; Chalifa-Caspi, Vered

2014-01-01

332

The complete genome sequence of polygonum ringspot virus.  

PubMed

The complete genome sequence of polygonum ringspot virus (PolRSV), genus Tospovirus, family Bunyaviridae, was determined. This is the first report of the complete genome sequence for a European tospovirus isolate. The large RNA of PolRSV was 8893 nucleotides (nt) in size and contained a single open reading frame of 8628 nucleotides in the viral-complementary sense, coding for a predicted RNA-dependent RNA polymerase of 330.9 kDa. Two untranslated regions of 230 and 32 nucleotides were present at the 5' and 3' termini, respectively, which showed conserved terminal sequences, as commonly observed for tospovirus genomic RNAs. The medium and small (S) RNAs were 4710 and 2485 nucleotides in size, respectively, and showed 99 % homology to the corresponding genomic segment of a previously partially characterized PolRSV isolate, Plg3. Protein sequences for GN/GC, N and NSs were identical in length in the two PolRSV isolates, while an amino acid insertion was observed for the NSm protein of the newly characterized isolate. The noncoding intergenic region of the S RNA was very short (183 nt) and was not predicted to form a hairpin structure, confirming that this unique characteristic within tospoviruses, previously observed for Plg3, is not isolate specific. PMID:25000901

Margaria, P; Miozzi, L; Ciuffo, M; Pappu, H; Turina, M

2014-11-01

333

Complete Genome Sequence of Bacillus megaterium Myophage Mater  

PubMed Central

Bacillus megaterium is a ubiquitous, soil inhabiting Gram-positive bacterium that is a common model organism and is used in industrial applications for protein production. The following reports the complete sequencing and annotation of the genome of B. megaterium myophage Mater and describes the major features identified. PMID:25593262

Lancaster, Jacob C.; Hodde, Mary K.; Hernandez, Adriana C.

2015-01-01

334

Genome Sequence of Vibrio rotiferianus Strain DAT722  

PubMed Central

Vibrio rotiferianus is a marine pathogen capable of causing disease in various aquatic organisms. We announce the genome sequence of V. rotiferianus DAT722, which has a large chromosomal integron containing 116 gene cassettes and is a model organism for studying the role of this system in vibrio evolution. PMID:21551292

Chowdhury, Piklu Roy; Boucher, Yan; Hassan, Karl A.; Paulsen, Ian T.; Stokes, H. W.; Labbate, Maurizio

2011-01-01

335

Draft Genome Sequence of Vibrio mimicus Strain CAIM 602T  

PubMed Central

Vibrio mimicus is a Gram-negative bacterium associated with gastrointestinal diseases in humans around the world. We report the complete genome sequence of the Vibrio mimicus strain CAIM 602T (CDC1721-77, LMG 7896T, ATCC 33653T). PMID:23516211

Guardiola-Avila, Iliana; Acedo-Felix, Evelia; Yepiz-Plascencia, Gloria; Sifuentes-Romero, Itzel

2013-01-01

336

Complete Genome Sequence of Acinetobacter baumannii ZW85-1  

PubMed Central

Acinetobacter baumannii is an aerobic, nonmotile Gram-negative bacterium that causes nosocomial infections worldwide. Here, we report the complete genome sequence of Acinetobacter baumannii strain ZW85-1 and its two plasmids. One of the plasmids carries genes for NDM-1, which can hydrolyze a wide range of antibiotics. PMID:24459253

Wang, Xin; Zhang, Zhewen; Hao, Qiong; Wu, Jiayan

2014-01-01

337

Complete Genome Sequence of Citrobacter freundii Myophage Moogle  

PubMed Central

Citrobacter freundii is an opportunistic pathogen that has been linked to nosocomial infections, such as brain abscesses and pneumonia. Further study on phages infecting C. freundii may provide therapeutics for these infections. Here, we announce the complete genome sequence of the FelixO1-like myophage Moogle and describe its features. PMID:25635026

Nguyen, Quynh T.; Luna, Adrian J.; Hernandez, Adriana C.

2015-01-01

338

Genome Sequence of the Paleopolyploid Soybean (Glycine max (L.) Merr.)  

Technology Transfer Automated Retrieval System (TEKTRAN)

We report the genome sequence for soybean (Glycine max var. Williams 82), one of the most important crop plants worldwide because of its ability to produce both protein and oil. Soybean is a recently domesticated legume that plays a vital role in crop rotation as it fixes atmospheric nitrogen via s...

339

Draft Genome Sequence of Rice Isolate Pseudomonas chlororaphis EA105  

PubMed Central

Pseudomonas chlororaphis EA105, a strain isolated from rice rhizosphere, has shown antagonistic activities against a rice fungal pathogen, and could be important in defense against rice blast. We report the draft genome sequence of EA105, which is an estimated size of 6.6 Mb. PMID:25540352

McCully, Lucy M.; Bitzer, Adam S.; Spence, Carla A.; Bais, Harsh P.

2014-01-01

340

Draft Genome Sequence of Enterococcus faecalis MB5259  

PubMed Central

In this study, we present a draft genome sequence of Enterococcus faecalis MB5259, a promising probiotic strain. The identified differences and common features between this strain and reference strains will assist in better understanding the mechanism of antibacterial action and in developing novel probiotics. PMID:24948775

Robyn, Joris; Rasschaert, Geertrui; Heyndrickx, Marc

2014-01-01

341

A Mitochondrial Genome Sequence of the Tibetan Antelope (Pantholops hodgsonii)  

Microsoft Academic Search

To investigate genetic mechanisms of high altitude adaptations of native mammals on the Tibetan Plateau, we compared mitochondrial sequences of the endangered Pantholops hodgsonii with its lowland distant relatives Ovis aries and Capra hircus, as well as other mammals. The complete mitochondrial genome of P. hodgsonii (16,498 bp) revealed a similar gene order as of other mammals. Because of

Shu-Qing Xu; Ying-Zhong Yang; Jun Zhou; Guo-En Jin; Yun-Tian Chen; Jun Wang; Huan-Ming Yang; Jian Wang; Jun Yu; Xiao-Guang Zheng; Ri-Li Ge

342

The Genome Sequence of the Malaria Mosquito Anopheles gambiae  

E-print Network

The Genome Sequence of the Malaria Mosquito Anopheles gambiae Robert A. Holt,1 * G. Mani insights into the physiological adaptations of a hematophagous insect. The mosquito is both an elegant, exquisitely adapted organism and a scourge of humanity. The principal mosquito-borne human illnesses

Salzberg, Steven

343

Complete genome sequence of chikungunya virus isolated in the Philippines.  

PubMed

Chikungunya virus is an alphavirus of the Togaviridae family, which causes a febrile illness with arthralgia in humans. We report here on the complete genome sequence of chikungunya virus strain CHIKV-13-112A isolated from a patient in the Philippines who was suspected to have dengue virus. Phylogenetic analysis revealed that the strain is of the Asian genotype. PMID:24970822

Kawashima, Kent D; Suarez, Lady-Anne C; Labayo, Hannah Karen M; Liles, Veni R; Salvoza, Noel C; Klinzing, David C; Daroy, Maria Luisa G; Matias, Ronald R; Natividad, Filipinas F

2014-01-01

344

Complete Genome Sequence of Chikungunya Virus Isolated in the Philippines  

PubMed Central

Chikungunya virus is an alphavirus of the Togaviridae family, which causes a febrile illness with arthralgia in humans. We report here on the complete genome sequence of chikungunya virus strain CHIKV-13-112A isolated from a patient in the Philippines who was suspected to have dengue virus. Phylogenetic analysis revealed that the strain is of the Asian genotype. PMID:24970822

Kawashima, Kent D.; Suarez, Lady-Anne C.; Labayo, Hannah Karen M.; Liles, Veni R.; Salvoza, Noel C.; Klinzing, David C.; Natividad, Filipinas F.

2014-01-01

345

Complete Genome Sequence of Carnobacterium sp. 17-4  

PubMed Central

Members of the carnobacteria have been extensively studied as probiotic cultures in aquacultures and protective cultures in seafood, dairy, and meat. We report on the finished genome sequence of Carnobacterium sp. 17-4, which has been isolated from permanently cold seawater. The genetic information reveals a new circular bacteriocin biosynthesis cluster. PMID:21551290

Voget, Sonja; Klippel, Barbara; Daniel, Rolf; Antranikian, Garabed

2011-01-01

346

Genome sequence, comparative analysis  

E-print Network

for diseases and traits, with important consequences for human and companion animal health. Man's best friend of the domestic dog Kerstin Lindblad-Toh1 , Claire M Wade1,2 , Tarjei S. Mikkelsen1,3 , Elinor K. Karlsson1 a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map

Kellis, Manolis

347

The Genome Sequence of the Malaria Mosquito Anopheles gambiae  

Microsoft Academic Search

Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303

Robert A. Holt; G. Mani Subramanian; Aaron Halpern; Granger G. Sutton; Rosane Charlab; Deborah R. Nusskern; Patrick Wincker; Andrew G. Clark; José M. C. Ribeiro; Ron Wides; Steven L. Salzberg; Brendan Loftus; Mark Yandell; William H. Majoros; Douglas B. Rusch; Zhongwu Lai; Cheryl L. Kraft; Josep F. Abril; Veronique Anthouard; Peter Arensburger; Peter W. Atkinson; Holly Baden; Veronique de Berardinis; Danita Baldwin; Vladimir Benes; Jim Biedler; Claudia Blass; Randall Bolanos; Didier Boscus; Mary Barnstead; Shuang Cai; Kabir Chatuverdi; George K. Christophides; Mathew A. Chrystal; Michele Clamp; Anibal Cravchik; Val Curwen; Ali Dana; Art Delcher; Ian Dew; Cheryl A. Evans; Michael Flanigan; Anne Grundschober-Freimoser; Lisa Friedli; Zhiping Gu; Ping Guan; Roderic Guigo; Maureen E. Hillenmeyer; Susanne L. Hladun; James R. Hogan; Young S. Hong; Jeffrey Hoover; Olivier Jaillon; Zhaoxi Ke; Chinnappa Kodira; Elena Kokoza; Anastasios Koutsos; Ivica Letunic; Alex Levitsky; Yong Liang; Jhy-Jhu Lin; Neil F. Lobo; John R. Lopez; Joel A. Malek; Tina C. McIntosh; Stephan Meister; Jason Miller; Clark Mobarry; Emmanuel Mongin; Sean D. Murphy; David A. O'Brochta; Cynthia Pfannkoch; Rong Qi; Megan A. Regier; Karin Remington; Hongguang Shao; Maria V. Sharakhova; Cynthia D. Sitter; Jyoti Shetty; Thomas J. Smith; Renee Strong; Jingtao Sun; Dana Thomasova; Lucas Q. Ton; Pantelis Topalis; Zhijian Tu; Maria F. Unger; Brian Walenz; Aihui Wang; Jian Wang; Mei Wang; Xuelan Wang; Kerry J. Woodford; Jennifer R. Wortman; Martin Wu; Evgeny M. Zdobnov; Hongyu Zhang; Qi Zhao; Shaying Zhao; Shiaoping C. Zhu; Igor Zhimulev; Mario Coluzzi; Alessandra della Torre; Charles W. Roth; Christos Louis; Francis Kalush; Richard J. Mural; Eugene W. Myers; Mark D. Adams; Hamilton O. Smith; Samuel Broder; Malcolm J. Gardner; Claire M. Fraser; Ewan Birney; Peer Bork; Paul T. Brey; J. Craig Venter; Jean Weissenbach; Fotis C. Kafatos; Frank H. Collins; Stephen L. Hoffman

2002-01-01

348

Draft Genome Sequence of Rhodococcus rhodochrous Strain ATCC 21198  

SciTech Connect

Rhodococcus rhodochrous is a Gram-positive red-pigmented bacterium commonly found in the soil. The draft genome sequence for R. rhodochrous strain ATCC 21198 is presented here to provide genetic data for a better understanding of its lipid-accumulating capabilities.

Shields-Menard, Sara A. [Mississippi State University (MSU)]; Brown, Steven D [ORNL]; Klingeman, Dawn Marie [ORNL]; Indest, Karl [University of Tennessee (UTK) and Oak Ridge National Laboratory (ORNL)]; Hancock, Dawn [U.S. Army Engineer Research and Development Center]; Wewalwela, Jayani [Mississippi State University (MSU)]; French, Todd [Mississippi State University (MSU)]; Donaldson, Janet [Mississippi State University]

2014-01-01

349

Draft Genome Sequence of Geobacillus thermopakistaniensis Strain MAS1.  

PubMed

Geobacillus thermopakistaniensis strain MAS1 was isolated from a hot spring located in the Northern Areas of Pakistan. The draft genome sequence was 3.5 Mb and identified a number of genes of potential industrial importance, including genes encoding glycoside hydrolases, pullulanase, amylopullulanase, glycosidase, and alcohol dehydrogenases. PMID:24903880

Siddiqui, Masood Ahmed; Rashid, Naeem; Ayyampalayam, Saravanaraj; Whitman, William B

2014-01-01

350

Genome Sequence of Klebsiella pneumoniae Urinary Tract Isolate Top52  

PubMed Central

Klebsiella pneumoniae is a significant cause of nosocomial infections, including ventilator-associated pneumonias and catheter-associated urinary tract infections. K. pneumoniae strain TOP52 #1721 (Top52) was isolated from a woman presenting with acute cystitis and subsequently characterized using various murine models of infection. Here we present the genome sequence of K. pneumoniae Top52. PMID:24994806

Johnson, Jeremiah G.; Spurbeck, Rachel R.; Sandhu, Sukhinder K.

2014-01-01

351

Complete Genome Sequence of the Haloalkaliphilic, Hydrogen Producing Halanaerobium hydrogenoformans  

SciTech Connect

Halanaerobium hydrogenoformans is an alkaliphilic bacterium capable of biohydrogen production at pH 11 and 7% (w/v) salt. We present the 2.6 Mb genome sequence to provide insights into its physiology and potential for bioenergy applications.

Brown, Steven D [ORNL; Begemann, Matthew B [University of Wisconsin, Madison; Mormile, Dr. Melanie R. [Missouri University of Science and Technology; Wall, Judy D. [University of Missouri; Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Samual [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Elias, Dwayne A [ORNL

2011-01-01

352

Genome sequence of the human malaria parasite Plasmodium falciparum  

E-print Network

Genome sequence of the human malaria parasite Plasmodium falciparum Malcolm J. Gardner, Neil Hall ... The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted

Arnold, Jonathan

353

Complete Genome Sequence of Leuconostoc citreum KM20  

PubMed Central

Leuconostoc citreum is one of the most prevalent lactic acid bacteria during the manufacturing process of kimchi, the best-known Korean traditional dish. We have determined the complete genome sequence of L. citreum KM20. It consists of a 1.80-Mb chromosome and four circular plasmids and reveals genes likely involved in kimchi fermentation and its probiotic effects. PMID:18281406

Kim, Jihyun F.; Jeong, Haeyoung; Lee, Jung-Sook; Choi, Sang-Haeng; Ha, Misook; Hur, Cheol-Goo; Kim, Ji-Sun; Lee, Soohyun; Park, Hong-Seog; Park, Yong-Ha; Oh, Tae Kwang

2008-01-01

354

Sequencing and Analyses of All Known Human Rhinovirus Genomes  

E-print Network

Sequencing and Analyses of All Known Human Rhinovirus Genomes Reveal Structure and Evolution. Ann C ... Claire M. Fraser-Liggett, Stephen B. Liggett. Infection by human rhinovirus (HRV) is a major cause ... -based epidemiologic studies and antiviral or vaccine development. Human rhinovirus (HRV), the disease agent ...

355

Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002  

PubMed Central

Pectobacterium wasabiae, originally causing soft rot disease in horseradish in Japan, was recently found to cause blackleg-like symptoms on potato in the United States, Canada, and Europe. A draft genome sequence of a Canadian potato isolate of P. wasabiae CFIA1002 will enhance the characterization of its pathogenicity and host specificity features. PMID:24831134

Yuan, Kat (Xiaoli); Adam, Zaky; Tambong, James; Lévesque, C. André; Chen, Wen; Lewis, Christopher T.; De Boer, Solke H.

2014-01-01

356

Complete genome sequence of Oceanithermus profundus type strain (506T)  

SciTech Connect

Oceanithermus profundus Miroshnichenko et al. 2003 is the type species of the genus Oceanithermus, which belongs to the family Thermaceae. The genus currently comprises two species whose members are thermophilic and are able to reduce sulfur compounds and nitrite. The organism is adapted to the salinity of sea water, is able to utilize a broad range of carbohydrates, some proteinaceous substrates, organic acids and alcohols. This is the first completed genome sequence of a member of the genus Oceanithermus and the fourth sequence from the family Thermaceae. The 2,439,291 bp long genome with its 2,391 protein-coding and 54 RNA genes consists of one chromosome and a 135,351 bp long plasmid, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Zhang, Xiaojing [Los Alamos National Laboratory (LANL)]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Tapia, Roxanne [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute]; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Hauser, Loren John [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Ruhl, Alina [U.S. Department of Energy, Joint Genome Institute]; Mwirichia, Romano [University of Munster, Germany]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Tindall, Brian [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute]; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Land, Miriam L [ORNL]

2011-01-01

357

The Homeodomain Resource: sequences, structures and genomic information.  

PubMed

The Homeodomain Resource is a comprehensive collection of sequence, structure and genomic information on the homeodomain protein family. Available through the Resource are both full-length and domain-only sequence data, as well as X-ray and NMR structural data for proteins and protein-DNA complexes. Also available is information on human genetic diseases and disorders in which proteins from the homeodomain family play an important role; genomic information includes relevant gene symbols, cytogenetic map locations, and specific mutation data. Search engines are provided to allow users to easily query the component databases and assemble specialized data sets. The Homeodomain Resource is available through the World Wide Web at http://genome.nhgri.nih.gov/homeodomain PMID:9847220

Banerjee-Basu, S; Ferlanti, E S; Ryan, J F; Baxevanis, A D

1999-01-01

358

Characterization of the complete genome sequence of pike fry rhabdovirus.  

PubMed

The complete genome sequence of pike fry rhabdovirus (PFRV), consisting of 11,097 nucleotides, was determined. The genome contains five genes, encoding the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G), and RNA-dependent RNA polymerase (L) protein in the order 3'-N-P-M-G-L-5'. 3' leader- and 5' trailer-sequences in the PFRV genome show inverse complementarity. The PFRV proteins share the highest homology to the proteins of spring viremia of carp virus (SVCV), ranging from 55.3 to 91.4%. Phylogenetic analysis of the five proteins showed that PFRV clusters with SVCV and is closely related to the mammalian vesiculoviruses, 903/87, STRV and SCRV. PMID:19603256

Chen, Hong-Lian; Liu, Hong; Liu, Zong-Xiao; He, Jun-Qiang; Gao, Long-Ying; Shi, Xiu-Jie; Jiang, Yu-Lin

2009-01-01

359

Simultaneous rapid sequencing of multiple RNA virus genomes.  

PubMed

Comparing sequences of archived viruses collected over many years to the present allows the study of viral evolution and contributes to the design of new vaccines. However, the difficulty, time and expense of generating full-length sequences individually from each archived sample have hampered these studies. Next generation sequencing technologies have been utilized for analysis of clinical and environmental samples to identify viral pathogens that may be present. This has led to the discovery of many new, uncharacterized viruses from a number of viral families. Use of these sequencing technologies would be advantageous in examining viral evolution. In this study, a sequencing procedure was used to sequence multiple archived samples simultaneously and rapidly using a single standard protocol. This procedure utilized primers composed of 20 bases of known sequence with 8 random bases at the 3'-end that also served as an identifying barcode that allowed the differentiation of each viral library following pooling and sequencing. This conferred sequence independence by random priming both first and second strand cDNA synthesis. Viral stocks were treated with a nuclease cocktail to reduce the presence of host nucleic acids. Viral RNA was extracted, followed by single tube random-primed double-stranded cDNA synthesis. The resultant cDNAs were amplified by primer-specific PCR, pooled, size fractionated and sequenced on the Ion Torrent PGM platform. The individual virus genomes were readily assembled by both de novo and template-assisted assembly methods. This procedure consistently resulted in near full-length, if not full-length, genomic sequences and was used to sequence multiple bovine pestivirus and coronavirus isolates simultaneously. PMID:24589514

Neill, John D; Bayles, Darrell O; Ridpath, Julia F

2014-06-01
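The barcode step described in the abstract above (pooled reads separated by the known 20-base portion of each primer, followed by 8 random bases) can be illustrated with a minimal Python sketch. The tag sequences, library names and the `demultiplex` function are hypothetical, and exact prefix matching with no tolerance for sequencing errors is an assumption made for brevity; the record does not describe the authors' actual software.

```python
from collections import defaultdict


def demultiplex(reads, tags, tag_len=20, random_len=8):
    """Assign each read to a library by its leading known tag.

    reads: iterable of read strings (5' to 3').
    tags: dict mapping a library name to its known tag (hypothetical values).
    The tag and the random bases that follow it are trimmed from assigned
    reads; reads with an unknown prefix go to "unassigned" untrimmed.
    """
    lookup = {tag: library for library, tag in tags.items()}
    bins = defaultdict(list)
    for read in reads:
        library = lookup.get(read[:tag_len])
        if library is None:
            bins["unassigned"].append(read)
        else:
            bins[library].append(read[tag_len + random_len:])
    return dict(bins)


# Hypothetical tags for two pooled viral libraries.
example_tags = {
    "pestivirus_1": "ACGTACGTACGTACGTACGT",
    "coronavirus_1": "TTGACCATGGCATCAGGTCA",
}
example_reads = [
    "ACGTACGTACGTACGTACGT" + "NNNNNNNN" + "GATTACA",
    "TTGACCATGGCATCAGGTCA" + "ACGTACGT" + "CCGGTT",
    "GGGGGGGGGGGGGGGGGGGG" + "ACGTACGT" + "AAAA",
]
example_bins = demultiplex(example_reads, example_tags)
```

A production pipeline would normally allow a small number of mismatches in the tag and handle reads shorter than the primer; both are omitted here.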

360

Deriving Group A Streptococcus Typing Information from Short-Read Whole-Genome Sequencing Data  

PubMed Central

Typing of group A Streptococcus (GAS) is crucial for infection control and epidemiology. While whole-genome sequencing (WGS) is revolutionizing the way that bacterial organisms are typed, it is necessary to provide backward compatibility with currently used typing schemas to facilitate comparisons and understanding of epidemiological trends. Here, we sequenced the genomes of 191 GAS isolates representing 42 different emm types and used bioinformatics tools to derive commonly used GAS typing information directly from the short-read WGS data. We show that emm typing and multilocus sequence typing can be achieved rapidly and efficiently using this approach, which also permits the determination of the presence or absence of genes associated with GAS tissue tropism. We also report on how the WGS data analysis was instrumental in identifying ambiguities present in the commonly used emm type database hosted by the U.S. Centers for Disease Control and Prevention. PMID:24648555

Athey, Taryn B. T.; Teatero, Sarah; Li, Aimin; Marchand-Austin, Alex; Beall, Bernard W.

2014-01-01

361

Deriving group A Streptococcus typing information from short-read whole-genome sequencing data.  

PubMed

Typing of group A Streptococcus (GAS) is crucial for infection control and epidemiology. While whole-genome sequencing (WGS) is revolutionizing the way that bacterial organisms are typed, it is necessary to provide backward compatibility with currently used typing schemas to facilitate comparisons and understanding of epidemiological trends. Here, we sequenced the genomes of 191 GAS isolates representing 42 different emm types and used bioinformatics tools to derive commonly used GAS typing information directly from the short-read WGS data. We show that emm typing and multilocus sequence typing can be achieved rapidly and efficiently using this approach, which also permits the determination of the presence or absence of genes associated with GAS tissue tropism. We also report on how the WGS data analysis was instrumental in identifying ambiguities present in the commonly used emm type database hosted by the U.S. Centers for Disease Control and Prevention. PMID:24648555

Athey, Taryn B T; Teatero, Sarah; Li, Aimin; Marchand-Austin, Alex; Beall, Bernard W; Fittipaldi, Nahuel

2014-06-01

362

The complete mitochondrial genome sequence of Schizothorax dolichonema (Cypriniformes: Cyprinidae).  

PubMed

The complete mitochondrial genome sequence of Schizothorax dolichonema has been determined. It contains 22 tRNA genes, 13 protein-coding genes, 2 rRNA genes and 2 non-coding regions (the origin of light-strand replication and the control region), with a total length of 16,583 bp. The gene order and composition are similar to those of most other vertebrates. Most of the genes are encoded on the heavy strand, except for eight tRNA genes and the ND6 gene. The mitogenome sequence of S. dolichonema would contribute to a better understanding of the biogeography and evolution of Schizothoracine fishes. PMID:24617487

Yue, Xingjian; Zhou, Chuanjiang; Shi, Jinrong; Zou, Yuanchao

2014-03-11

363

Short-sequence DNA repeats in prokaryotic genomes  

Microsoft Academic Search

Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as

van Belkum, A. F.; Scherer, Stewart; van Alphen, A. J. W.; Verbrugh, Henri

1998-01-01

364

Clustered repeat sequences in the genome of Epstein Barr virus.  

PubMed Central

The genome of Epstein-Barr virus is composed of unique DNA interspersed with repetitive sequences. This organization suggests that Epstein-Barr virus provides a useful model for studying the function(s) of repetitive sequences in eukaryotic chromosomes. The primary structure of two of the repeat sequences, the 3072 bp large internal repeat, or BamHI-W repeat, and a smaller 125 bp, G, C-rich NotI repeat, are presented here. Their structures and possible functions are discussed. PMID:6306567

Jones, M D; Griffin, B E

1983-01-01

365

Realistic artificial DNA sequences as negative controls for computational genomics  

PubMed Central

A common practice in computational genomic analysis is to use a set of ‘background’ sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such ‘background’ sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by ‘shuffling’ real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/. PMID:24803667

Caballero, Juan; Smit, Arian F. A.; Hood, Leroy; Glusman, Gustavo

2014-01-01
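For context on the record above, the conventional "shuffling" baseline that the abstract says can underestimate false-positive rates can be sketched in a few lines of Python. This is a plain mononucleotide shuffle, not the authors' artificial-sequence method; the function name `shuffle_sequence` and the `seed` parameter are illustrative assumptions.

```python
import random


def shuffle_sequence(seq, seed=None):
    """Return a random permutation of the bases in seq.

    Base composition is preserved exactly, but repeats, local complexity
    and any higher-order structure are destroyed, which is one reason
    shuffled sequences can be unrealistically "easy" negative controls.
    """
    rng = random.Random(seed)  # local generator, so global state is untouched
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


real = "ACGTACGTTTGACCAAGGTTACGT"
control = shuffle_sequence(real, seed=1)
```

Because only single-base composition survives the shuffle, a prediction tool evaluated on such controls is never challenged by interspersed repeats or low-complexity regions, which is the gap the artificial sequences described in the abstract are meant to fill.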

366

Human genetics and genomics a decade after the release of the draft sequence of the human genome  

PubMed Central

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

2011-01-01

367

Improvements to pairwise sequence comparison (PASC): a genome-based web tool for virus classification.  

PubMed

The number of viral genome sequences in the public databases is increasing dramatically, and these sequences are playing an important role in virus classification. Pairwise sequence comparison is a sequence-based virus classification method. A program using this method calculates the pairwise identities of virus sequences within a virus family and displays their distribution, and visual analysis helps to determine demarcations at different taxonomic levels such as strain, species, genus and subfamily. Subsequent comparison of new sequences against existing ones allows viruses from which the new sequences were derived to be classified. Although this method cannot be used as the only criterion for virus classification in some cases, it is a quantitative method and has many advantages over conventional virus classification methods. It has been applied to several virus families, and there is an increasing interest in using this method for other virus families/groups. The Pairwise Sequence Comparison (PASC) classification tool was created at the National Center for Biotechnology Information. The tool's database stores pairwise identities for complete genomes/segments of 56 virus families/groups. Data in the system are updated every day to reflect changes in virus taxonomy and additions of new virus sequences to the public database. The web interface of the tool ( http://www.ncbi.nlm.nih.gov/sutils/pasc/ ) makes it easy to navigate and perform analyses. Multiple new viral genome sequences can be tested simultaneously with this system to suggest the taxonomic position of virus isolates in a specific family. PASC eliminates potential discrepancies in the results caused by different algorithms and/or different data used by researchers. PMID:25119676

Bao, Yiming; Chetvernin, Vyacheslav; Tatusova, Tatiana

2014-12-01
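The core quantity in the record above, the pairwise identity between every two sequences in a family, can be sketched as follows. This Python example assumes the sequences are already aligned to equal length with `-` as the gap character; how PASC itself aligns genomes is not stated in the abstract, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite
    a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def pairwise_identities(aligned):
    """Map each unordered pair of sequence names to its percent identity."""
    return {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


# Toy aligned sequences standing in for genomes from one virus family.
family = {
    "isolate_a": "ACGTACGT",
    "isolate_b": "ACGTACGA",
    "isolate_c": "ACG-ACGT",
}
identities = pairwise_identities(family)
```

Plotting a histogram of the values in `identities` would give the kind of distribution the abstract describes, from which demarcation thresholds between taxonomic levels are judged visually.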

368

Ensemble analysis of adaptive compressed genome sequencing strategies  

PubMed Central

Background: Acquiring genomes at single-cell resolution has many applications, such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth first search strategy. We simulated the whole process, which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. Results: In this paper, we modify our previous breadth first search strategy and introduce the depth first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and also the expected total sequenced nucleotides for a given population profile. Our results suggest that the expected total sequenced nucleotides grows in proportion to the logarithm of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and the ratio of its size over the maximum genome size in the sample. The modified resource allocation method accommodates a parameter to control that probability. Availability: The squeezambler 2.0 C++ source code is available at http://sourceforge.net/projects/hyda/. The ensemble analysis MATLAB code is available at http://sourceforge.net/projects/distilled-sequencing/. PMID:25252999

2014-01-01

369

Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark (DTU)  

E-print Network

Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark (DTU) ... - or - Where Does Vibrio cholera come from?

Ussery, David W.

370

Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark (DTU)  

E-print Network

Comparative Microbial Genomics group, Center for Biological Sequence Analysis, The Technical University of Denmark (DTU). Minimal genomes in bacterial Genera. Dave Ussery. European Conference on Synthetic Biology: Design 2007

Ussery, David W.

371

Genome sequencing: missing a stage, John Sulston. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

Interviewee: John Sulston. DNAi Location: Genome > The project > players > Private Genome shotgun: missing a stage. John Sulston, a key figure in the public genome project, speaks about the difficulties posed by missing a step in the sequencing process.

2008-10-06

372

Co-barcoded sequence reads from long DNA fragments: a cost-effective solution for “perfect genome” sequencing  

PubMed Central

Next generation sequencing (NGS) technologies, primarily based on massively parallel sequencing, have touched and radically changed almost all aspects of research worldwide. These technologies have allowed for the rapid analysis, to date, of the genomes of more than 2,000 different species. In humans, NGS has arguably had the largest impact. Over 100,000 genomes of individual humans (based on various estimates) have been sequenced allowing for deep insights into what makes individuals and families unique and what causes disease in each of us. Despite all of this progress, the current state of the art in sequence technology is far from generating a “perfect genome” sequence and much remains to be understood in the biology of human and other organisms’ genomes. In the article that follows, we outline why the “perfect genome” in humans is important, what is lacking from current human whole genome sequences, and a potential strategy for achieving the “perfect genome” in a cost effective manner. PMID:25642240

Peters, Brock A.; Liu, Jia; Drmanac, Radoje

2015-01-01

373

Complete Genome Sequence and Comparative Genomics of Shigella flexneri Serotype 2a Strain 2457T  

Microsoft Academic Search

We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much

J. Wei; M. B. Goldberg; V. Burland; M. M. Venkatesan; W. Deng; G. Fournier; G. F. Mayhew; G. Plunkett; D. J. Rose; A. Darling; B. Mau; N. T. Perna; S. M. Payne; L. J. Runyen-Janecky; S. Zhou; D. C. Schwartz; F. R. Blattner

2003-01-01

374

Genomic Sequencing and Analysis of Sucra jujuba Nucleopolyhedrovirus  

PubMed Central

The complete nucleotide sequence of Sucra jujuba nucleopolyhedrovirus (SujuNPV) was determined by 454 pyrosequencing. The SujuNPV genome was 135,952 bp in length with an A+T content of 61.34%. It contained 131 putative open reading frames (ORFs) covering 87.9% of the genome. Among these ORFs, 37 were conserved in all baculovirus genomes that have been completely sequenced, 24 were conserved in lepidopteran baculoviruses, 65 were found in other baculoviruses, and 5 were unique to the SujuNPV genome. Seven homologous regions (hrs) were identified in the SujuNPV genome. SujuNPV contained several genes that were duplicated or copied multiple times: two copies of helicase, DNA binding protein gene (dbp), p26 and cg30, three copies of the inhibitor of the apoptosis gene (iap), and four copies of the baculovirus repeated ORF (bro). Phylogenetic analysis suggested that SujuNPV belongs to a subclade of group II alphabaculovirus, which differs from other baculoviruses in that all nine members of this subclade contain a second copy of dbp. PMID:25329074

Liu, Xiaoping; Yin, Feifei; Zhu, Zheng; Hou, Dianhai; Wang, Jun; Zhang, Lei; Wang, Manli; Wang, Hualin; Hu, Zhihong; Deng, Fei

2014-01-01

375

Complete genome sequence of Syntrophobacter fumaroxidans strain (MPOBT)  

PubMed Central

Syntrophobacter fumaroxidans strain MPOBT is the best-studied species of the genus Syntrophobacter. The species is of interest because of its anaerobic syntrophic lifestyle, its involvement in the conversion of propionate to acetate, H2 and CO2 during the overall degradation of organic matter, and its release of products that serve as substrates for other microorganisms. The strain is able to ferment fumarate in pure culture to CO2 and succinate, and is also able to grow as a sulfate reducer with propionate as an electron donor. This is the first complete genome sequence of a member of the genus Syntrophobacter and a member genus in the family Syntrophobacteraceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 4,990,251 bp long genome with its 4,098 protein-coding and 81 RNA genes is a part of the Microbial Genome Program (MGP) and the Genomes to Life (GTL) Program project. PMID:23450070

Plugge, Caroline M.; Henstra, Anne M.; Worm, Petra; Swarts, Daan C.; Paulitsch-Fuchs, Astrid H.; Scholten, Johannes C.M.; Lykidis, Athanasios; Lapidus, Alla L.; Goltsman, Eugene; Kim, Edwin; McDonald, Erin; Rohlin, Lars; Crable, Bryan R.; Gunsalus, Robert P.; Stams, Alfons J.M.; McInerney, Michael J.

2012-01-01

376

Complete genome sequence of Actinosynnema mirum type strain (101T)  

SciTech Connect

Actinosynnema mirum Hasegawa et al. 1978 is the type species of the genus, and is of phylogenetic interest because of its central phylogenetic location in the Actinosynnemataceae, a rapidly growing family within the actinobacterial suborder Pseudonocardineae. A. mirum is characterized by its motile spores borne on synnemata and as a producer of nocardicin antibiotics. It is capable of growing aerobically and under a moderate CO2 atmosphere. The strain is a Gram-positive, aerial and substrate mycelium producing bacterium, originally isolated from a grass blade collected from the Raritan River, New Jersey. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of a member of the family Actinosynnemataceae, and only the second sequence from the actinobacterial suborder Pseudonocardineae. The 8,248,144 bp long single replicon genome with its 7100 protein-coding and 77 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Land, Miriam; Lapidus, Alla; Mayilraj, Shanmugam; Chen, Feng; Copeland, Alex; Glavina Del Rio, Tijana; Nolan, Matt; Lucas, Susan; Tice, Hope; Cheng, Jan-Fang; Chertkov, Olga; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Rohde, Manfred; Goker, Markus; Pati, Amrita; Ivanova, Natalia; Mavrommatis, Konstantinos; Chen, Amy; Palaniappan, Krishna; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia; Brettin, Thomas; Detter, John C.; Han, Cliff; Chain, Patrick; Tindall, Brian; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

2009-05-20

377

Genomic sequence analysis of Helicoverpa armigera nucleopolyhedrovirus isolated from Australia.  

PubMed

The complete genomic sequence of Helicoverpa armigera nucleopolyhedrovirus from Australia, HearNPV-Au, was determined and analyzed. The HearNPV-Au genome was 130,992 bp in size with a G+C content of 39 mol% and contained 134 predicted open reading frames (ORFs) consisting of more than 150 nucleotides. HearNPV-Au shared 94 ORFs with AcMNPV, HearSNPV-G4 and SeMNPV, and was most closely related to HearSNPV-G4. The nucleotide sequence identity between HearNPV-Au and HearSNPV-G4 genome was 99%. The major differences were found in homologous regions (hrs) and baculovirus repeat ORFs (bro) genes. Five hrs and two bro genes were identified in the HearNPV-Au genome. All of the 134 ORFs identified in HearNPV-Au were also found in HearSNPV-G4, except the homologue of ORF59 (bro) in HearSNPV-G4. The sequence data strongly suggested that HearNPV-Au and HearSNPV-G4 belong to the same virus species. PMID:24757712

Zhang, Huan; Yang, Qing; Qin, Qi-Lian; Zhu, Wei; Zhang, Zhi-Fang; Li, Yi-Nü; Zhang, Ning; Zhang, Ji-Hong

2014-03-01

378

Genomic sequence analysis of Helicoverpa armigera nucleopolyhedrovirus isolated from Australia.  

PubMed

The complete genomic sequence of Helicoverpa armigera nucleopolyhedrovirus from Australia, HearNPV-Au, was determined and analyzed. The HearNPV-Au genome was 130,992 bp in size with a G+C content of 39 mol% and contained 134 predicted open reading frames (ORFs) consisting of more than 150 nucleotides. HearNPV-Au shared 94 ORFs with AcMNPV, HearSNPV-G4 and SeMNPV, and was most closely related to HearSNPV-G4. The nucleotide sequence identity between HearNPV-Au and HearSNPV-G4 genome was 99%. The major differences were found in homologous regions (hrs) and baculovirus repeat ORFs (bro) genes. Five hrs and two bro genes were identified in the HearNPV-Au genome. All of the 134 ORFs identified in HearNPV-Au were also found in HearSNPV-G4, except the homologue of ORF59 (bro) in HearSNPV-G4. The sequence data strongly suggested that HearNPV-Au and HearSNPV-G4 belong to the same virus species. PMID:24077655

Zhang, Huan; Yang, Qing; Qin, Qi-Lian; Zhu, Wei; Zhang, Zhi-Fang; Li, Yi-Nü; Zhang, Ning; Zhang, Ji-Hong

2014-03-01

379

Insights into hominid evolution from the gorilla genome sequence  

PubMed Central

Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago (Mya). In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution. PMID:22398555

Scally, Aylwyn; Dutheil, Julien Y.; Hillier, LaDeana W.; Jordan, Greg E.; Goodhead, Ian; Herrero, Javier; Hobolth, Asger; Lappalainen, Tuuli; Mailund, Thomas; Marques-Bonet, Tomas; McCarthy, Shane; Montgomery, Stephen H.; Schwalie, Petra C.; Tang, Y. Amy; Ward, Michelle C.; Xue, Yali; Yngvadottir, Bryndis; Alkan, Can; Andersen, Lars N.; Ayub, Qasim; Ball, Edward V.; Beal, Kathryn; Bradley, Brenda J.; Chen, Yuan; Clee, Chris M.; Fitzgerald, Stephen; Graves, Tina A.; Gu, Yong; Heath, Paul; Heger, Andreas; Karakoc, Emre; Kolb-Kokocinski, Anja; Laird, Gavin K.; Lunter, Gerton; Meader, Stephen; Mort, Matthew; Mullikin, James C.; Munch, Kasper; O’Connor, Timothy D.; Phillips, Andrew D.; Prado-Martinez, Javier; Rogers, Anthony S.; Sajjadian, Saba; Schmidt, Dominic; Shaw, Katy; Simpson, Jared T.; Stenson, Peter D.; Turner, Daniel J.; Vigilant, Linda; Vilella, Albert J.; Whitener, Weldon; Zhu, Baoli; Cooper, David N.; de Jong, Pieter; Dermitzakis, Emmanouil T.; Eichler, Evan E.; Flicek, Paul; Goldman, Nick; Mundy, Nicholas I.; Ning, Zemin; Odom, Duncan T.; Ponting, Chris P.; Quail, Michael A.; Ryder, Oliver A.; Searle, Stephen M.; Warren, Wesley C.; Wilson, Richard K.; Schierup, Mikkel H.; Rogers, Jane; Tyler-Smith, Chris; Durbin, Richard

2012-01-01

380

Draft Genome Sequence of the Versatile Alkane-Degrading Bacterium Aquabacterium sp. Strain NJ1  

PubMed Central

The draft genome sequence of a soil bacterium, Aquabacterium sp. strain NJ1, capable of utilizing both liquid and solid alkanes, was deciphered. This is the first report of an Aquabacterium genome sequence. PMID:25477416

Shiwa, Yuh; Yoshikawa, Hirofumi; Zylstra, Gerben J.

2014-01-01

381

Draft Genome Sequence of the Brazilian Cyanobium sp. Strain CACIAM 14  

PubMed Central

Given the scarcity of data pertaining to whole-genome sequences of cyanobacterial strains isolated in Brazil, we hereby present the draft genome sequence of the Cyanobium sp. strain CACIAM 14, isolated in southeastern Amazonia. PMID:25013140

Siqueira, Andrei Santos; dos Santos, Bruno Garcia Simões; da Silva, Fábio Daniel Florêncio; Lima, Clayton Pereira; Cardoso, Jedson Ferreira; Vianez Júnior, João Lídio da Silva Gonçalves; Dall'Agnol, Leonardo Teixeira; McCulloch, John Anthony; Nunes, Márcio Roberto Teixeira; Gonçalves, Evonnildo Costa

2014-01-01

382

Draft Genome Sequence of the Versatile Alkane-Degrading Bacterium Aquabacterium sp. Strain NJ1.  

PubMed

The draft genome sequence of a soil bacterium, Aquabacterium sp. strain NJ1, capable of utilizing both liquid and solid alkanes, was deciphered. This is the first report of an Aquabacterium genome sequence. PMID:25477416

Masuda, Hisako; Shiwa, Yuh; Yoshikawa, Hirofumi; Zylstra, Gerben J

2014-01-01

383

Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™  

PubMed Central

Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, supported that whole mtGenome sequence data with high accuracy can be obtained using the PGM platform.

2015-01-01

384

Uncovering Genomic Features and Maternal Origin of Korean Native Chicken by Whole Genome Sequencing  

PubMed Central

The Korean Native Chicken (KNC) is an important endemic biological resource in Korea. While numerous studies have been conducted exploring this breed, none have used next-generation sequencing to identify its specific genomic features. We sequenced five strains of KNC and identified 10.9 million SNVs and 1.3 million InDels. Through the analysis, we found that the highly variable region common to all 5 strains had genes like PCDH15, CISD1, PIK3C2A, and NUCB2 that might be related to the phenotypic traits of the chicken such as auditory sense, growth rate and egg traits. In addition, we assembled unaligned reads that could not be mapped to the reference genome. By assembling the unaligned reads, we were able to present genomic sequences characteristic of the KNC. Based on this, we also identified genes related to the olfactory receptors and antigen that are common to all 5 strains. Finally, through the reconstructed mitochondrial genome sequences, we performed phylogenomic analysis and elucidated the maternal origin of the artificially restored KNC. Our results revealed that the KNC has multiple maternal origins, which is in agreement with Korea's history of chicken breed imports. The results presented here provide a valuable basis for future research on genomic features of KNC and further understanding of KNC's origin. PMID:25501044

Oh, Jae-Don; Heo, Kang-Nyeong; Lee, Jun-Heon; Lee, Woon Kyu; Yoon, Sook Hee; Kim, Heebal; Cho, Seoae; Lee, Hak-Kyo

2014-01-01

385

Genomic sequencing and analysis of a Chinese hamster ovary cell line using Illumina sequencing technology  

Microsoft Academic Search

BACKGROUND: Chinese hamster ovary (CHO) cells are among the most widely used hosts for therapeutic protein production. Yet few genomic resources are available to aid in engineering high-producing cell lines. RESULTS: High-throughput Illumina sequencing was used to generate a 1x genomic coverage of an engineered CHO cell line expressing secreted alkaline phosphatase (SEAP). Reference-guided alignment and assembly produced 3.57 million

Stephanie Hammond; Jeffrey C Swanberg; Mihailo Kaplarevic; Kelvin H Lee

2011-01-01

386

Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution.  

PubMed

The wild species of the genus Oryza contain a largely untapped reservoir of agronomically important genes for rice improvement. Here we report the 261-Mb de novo assembled genome sequence of Oryza brachyantha. Low activity of long-terminal repeat retrotransposons and massive internal deletions of ancient long-terminal repeat elements lead to the compact genome of Oryza brachyantha. We model 32,038 protein-coding genes in the Oryza brachyantha genome, of which only 70% are located in collinear positions in comparison with the rice genome. Analysing breakpoints of non-collinear genes suggests that double-strand break repair through non-homologous end joining has an important role in gene movement and erosion of collinearity in the Oryza genomes. Transition of euchromatin to heterochromatin in the rice genome is accompanied by segmental and tandem duplications, further expanded by transposable element insertions. The high-quality reference genome sequence of Oryza brachyantha provides an important resource for functional and evolutionary studies in the genus Oryza. PMID:23481403

Chen, Jinfeng; Huang, Quanfei; Gao, Dongying; Wang, Junyi; Lang, Yongshan; Liu, Tieyan; Li, Bo; Bai, Zetao; Luis Goicoechea, Jose; Liang, Chengzhi; Chen, Chengbin; Zhang, Wenli; Sun, Shouhong; Liao, Yi; Zhang, Xuemei; Yang, Lu; Song, Chengli; Wang, Meijiao; Shi, Jinfeng; Liu, Geng; Liu, Junjie; Zhou, Heling; Zhou, Weili; Yu, Qiulin; An, Na; Chen, Yan; Cai, Qingle; Wang, Bo; Liu, Binghang; Min, Jiumeng; Huang, Ying; Wu, Honglong; Li, Zhenyu; Zhang, Yong; Yin, Ye; Song, Wenqin; Jiang, Jiming; Jackson, Scott A; Wing, Rod A; Wang, Jun; Chen, Mingsheng

2013-01-01

387

Phytophthora Genome Sequences Uncover Evolutionary Origins and Mechanisms of Pathogenesis  

SciTech Connect

Draft genome sequences have been determined for the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum. Oömycetes such as these Phytophthora species share the kingdom Stramenopila with photosynthetic algae such as diatoms, and the presence of many Phytophthora genes of probable phototroph origin supports a photosynthetic ancestry for the stramenopiles. Comparison of the two species' genomes reveals a rapid expansion and diversification of many protein families associated with plant infection such as hydrolases, ABC transporters, protein toxins, proteinase inhibitors, and, in particular, a superfamily of 700 proteins with similarity to known oömycete avirulence genes.

Tyler, Brett M.; Tripathy, Sucheta; Zhang, Xuemin; Dehal, Paramvir; Jiang, Rays H. Y.; Aerts, Andrea; Arredondo, Felipe D.; Baxter, Laura; Bensasson, Douda; Beynon, JIm L.; Chapman, Jarrod; Damasceno, Cynthia M. B.; Dorrance, Anne E.; Dou, Daolong; Dickerman, Allan W.; Dubchak, Inna L.; Garbelotto, Matteo; Gijzen, Mark; Gordon, Stuart G.; Govers, Francine; Grunwald, NIklaus J.; Huang, Wayne; Ivors, Kelly L.; Jones, Richard W.; Kamoun, Sophien; Krampis, Konstantinos; Lamour, Kurt H.; Lee, Mi-Kyung; McDonald, W. Hayes; Medina, Monica; Meijer, Harold J. G.; Nordberg, Erik K.; Maclean, Donald J.; Ospina-Giraldo, Manuel D.; Morris, Paul F.; Phuntumart, Vipaporn; Putnam, Nicholas J.; Rash, Sam; Rose, Jocelyn K. C.; Sakihama, Yasuko; Salamov, Asaf A.; Savidor, Alon; Scheuring, Chantel F.; Smith, Brian M.; Sobral, Bruno W. S.; Terry, Astrid; Torto-Alalibo, Trudy A.; Win, Joe; Xu, Zhanyou; Zhang, Hongbin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Boore, Jeffrey L.

2006-04-17

388

Arrangement of repetitive sequences in the genome of herpesvirus Sylvilagus.  

PubMed

Herpesvirus sylvilagus is a lymphotropic (type gamma) herpesvirus of cottontail rabbits (Sylvilagus floridanus). Analysis of virion DNA of herpesvirus sylvilagus has revealed that the genome consists of one stretch of about 120 kilobase pairs of internal, unique DNA flanked by a variable number of 553-base-pair tandem repeats. The G + C content of the repetitive DNA is extremely high (83%), as determined by sequencing. The organization of the herpesvirus sylvilagus genome is, therefore, similar to that of the primate lymphotropic viruses herpesvirus saimiri and herpesvirus ateles. PMID:2911114

Medveczky, M M; Geck, P; Clarke, C; Byrnes, J; Sullivan, J L; Medveczky, P G

1989-02-01

389

Development of peanut EST (expressed sequence tag)-based genomic resources and tools  

Technology Transfer Automated Retrieval System (TEKTRAN)

U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

390

Development of peanut expressed sequence tag-based genomic resources and tools  

Technology Transfer Automated Retrieval System (TEKTRAN)

U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

391

Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome  

Microsoft Academic Search

Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis

Byrappa Venkatesh; Ewen F Kirkness; Yong-Hwee Loh; Aaron L Halpern; Alison P Lee; Justin Johnson; Nidhi Dandona; Lakshmi D Viswanathan; Alice Tay; J. Craig Venter; Robert L Strausberg; Sydney Brenner

2007-01-01

392

SeqEntropy: Genome-Wide Assessment of Repeats for Short Read Sequencing  

E-print Network

analysis of human genome [1] and for rapid full genome sequencing and typing of various organisms. The 1000 Genomes Project, launched in 2008, bSeqEntropy: Genome-Wide Assessment of Repeats for Short Read Sequencing Hsueh-Ting Chu1,2 , William

Chen, Chaur-Chin

393

A Model of the Statistical Power of Comparative Genome Sequence Analysis  

E-print Network

by their evolutionary conservation [1,2,3]. It will be instrumental for achieving the goal of the Human Genome Project to comprehensively identify functional elements in the human genome [4]. How many comparative genome sequences do we not contribute significant information to human genome analysis? Since sequencing is expensive and capacity

Eddy, Sean

394

Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution  

Microsoft Academic Search

We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome, composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes, provides a new perspective on vertebrate genome evolution,

LaDeana W. Hillier; Webb Miller; Ewan Birney; Wesley Warren; Ross C. Hardison; Chris P. Ponting; Peer Bork; David W. Burt; Martien A. M. Groenen; Mary E. Delany; Jerry B. Dodgson; Genome assembly; Asif T. Chinwalla; Paul F. Cliften; Sandra W. Clifton; Kimberly D. Delehaunty; Catrina Fronick; Robert S. Fulton; Tina A. Graves; Colin Kremitzki; Dan Layman; Vincent Magrini; John D. McPherson; Tracie L. Miner; Patrick Minx; William E. Nash; Michael N. Nhan; Joanne O. Nelson; Lachlan G. Oddy; Craig S. Pohl; Jennifer Randall-Maher; Scott M. Smith; John W. Wallis; Shiaw-Pyng Yang; Michael N. Romanov; Catherine M. Rondelli; Bob Paton; Jacqueline Smith; David Morrice; Laura Daniels; Helen G. Tempest; Lindsay Robertson; Julio S. Masabanda; Darren K. Griffin; Alain Vignal; Valerie Fillon; Susanne Kerje; Leif Andersson; Richard P. M. Crooijmans; Jan Aerts; Jan J. van der Poel; Hans Ellegren; cDNA sequencing; Randolph B. Caldwell; Simon J. Hubbard; Darren V. Grafham; Andrzej M. Kierzek; Stuart R. McLaren; Ian M. Overton; Hiroshi Arakawa; Kevin J. Beattie; Yuri Bezzubov; Paul E. Boardman; James K. Bonfield; Michael D. R. Croning; Robert M. Davies; Matthew D. Francis; Sean J. Humphray; Carol E. Scott; Ruth G. Taylor; Cheryll Tickle; William R. A. Brown; Jane Rogers; Jean-Marie Buerstedde; Stuart A. Wilson; Ivan Ovcharenko; Laurie Gordon; Susan Lucas; Marcia M. Miller; Hidetoshi Inoko; Takashi Shiina; Jim Kaufman; Jan Salomonsen; Karsten Skjoedt; Gane Ka-Shu Wong; Jun Wang; Bin Liu; Jian Wang; Jun Yu; Huanming Yang; Mikhail Nefedov; Maxim Koriabine; Pieter J. deJong; Leo Goodstadt; Caleb Webber; Nicholas J. Dickens; Ivica Letunic; Mikita Suyama; David Torrents; Christian von Mering; Evgeny M. Zdobnov; Kateryna Makova; Laura Elnitski; Pallavi Eswara; David C. King; Shan Yang; Svitlana Tyekucheva; Anusha Radakrishnan; Robert S. Harris; Francesca Chiaromonte; James Taylor; Jianbin He; Monique Rijnkels; Sam Griffiths-Jones; Michael M. Hoffman; Jessica Severin; Stephen M. J. 
Searle; Andy S. Law; David Speed; Dave Waddington; Ze Cheng; Eray Tuzun; Zhirong Bao; Paul Flicek; David D. Shteynberg; Michael R. Brent; Jacqueline M. Bye; Elizabeth J. Huckle; Sourav Chatterji; Colin Dewey; Lior Pachter; Andrei Kouranov; Zissimos Mourelatos; Artemis G. Hatzigeorgiou; Andrew H. Paterson; Robert Ivarie; Mikael Brandstrom; Erik Axelsson; Niclas Backstrom; Matthew T. Webster; Olivier Pourquie; Alexandre Reymond; Catherine Ucla; Stylianos E. Antonarakis; Manyuan Long; J. J. Emerson; Esther Betrán; Isabelle Dupanloup; Henrik Kaessmann; Angie S. Hinrichs; Gill Bejerano; Terrence S. Furey; Rachel A. Harte; Brian Raney; Adam Siepel; W. James Kent; David Haussler; Eduardo Eyras; Robert Castelo; Josep F. Abril; Sergi Castellano; Francisco Camara; Genis Parra; Roderic Guigo; Guillaume Bourque; Glenn Tesler; Pavel A. Pevzner; Arian Smit; Lucinda A. Fulton; Elaine R. Mardis; Richard K. Wilson

2004-01-01

395

Sequencing and annotation of the Ophiostoma ulmi genome  

PubMed Central

Background The ascomycete fungus Ophiostoma ulmi was responsible for the initial pandemic of the massively destructive Dutch elm disease in Europe and North America in early 1910. Dutch elm disease has ravaged the elm tree population globally and is a major threat to the remaining elm population. O. ulmi is also associated with valuable biomaterials applications. It was recently discovered that proteins from O. ulmi can be used for efficient transformation of amylose in the production of bioplastics. Results We have sequenced the 31.5 Mb genome of O. ulmi using Illumina next-generation sequencing. Applying both de novo and comparative genome annotation methods, we predict a total of 8639 gene models. The quality of the predicted genes was validated using a variety of data sources consisting of EST data, mRNA-seq data and orthologs from related fungal species. Sequence-based computational methods were used to identify candidate virulence-related genes. Metabolic pathways were reconstructed and highlight specific enzymes that may play a role in virulence. Conclusions This genome sequence will be a useful resource for further research aimed at understanding the molecular mechanisms of pathogenicity by O. ulmi. It will also facilitate the identification of enzymes necessary for industrial biotransformation applications. PMID:23496816

2013-01-01

396

Physical map-assisted whole-genome shotgun sequence assemblies  

PubMed Central

We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the layout phase of WGS assemblies. This process is facilitated by FASSI, a stand-alone application that calculates BAC end and BAC overlap length constraints from clone fingerprint map contigs created by the FPC package. FASSI is designed to work with the assembly tool PCAP, but its output can be formatted to work with other WGS assembly algorithms able to use length constraints for individual clones. The FASSI method is simple to implement, potentially cost-effective, and has resulted in the increase of scaffold contiguity for both the Drosophila melanogaster and Cryptococcus gattii genomes when compared to a control assembly without map-derived constraints. A 6.5-fold coverage draft DNA sequence of the Pan troglodytes (chimpanzee) genome was assembled using map-derived constraints and resulted in a 26.1% increase in scaffold contiguity. PMID:16741162

Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C.; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.; Wilson, Richard K.; Jones, Steven J.M.

2006-01-01

397

Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes  

PubMed Central

Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The most widespread NRVSs were related to the coat protein (CP) genes of the family Partitiviridae, which have bisegmented dsRNA genomes, and included plant- and fungus-infecting members. The CP of a novel fungal virus (Rosellinia necatrix partitivirus 2, RnPV2) had the greatest sequence similarity to Arabidopsis thaliana ILR2, which is thought to regulate the activities of the phytohormone auxin, indole-3-acetic acid (IAA). Furthermore, partitivirus CP-like sequences much more closely related to plant partitiviruses than to RnPV2 were identified in a wide range of plant species. In addition, the nucleocapsid protein genes of cytorhabdoviruses and varicosaviruses were found in species of over 9 plant families, including Brassicaceae and Solanaceae. A replicase-like sequence of a betaflexivirus was identified in the cucumber genome. The pattern of occurrence of NRVSs and the phylogenetic analyses of NRVSs and related viruses indicate that multiple independent integrations into many plant lineages may have occurred. For example, one of the NRVSs was retained in Ar. thaliana but not in Ar. lyrata or other related Camelina species, whereas another NRVS displayed the reverse pattern. Our study has shown that single- and double-stranded RNA viral sequences are widespread in plant genomes, and shows the potential of genome-integrated NRVSs to help resolve unclear phylogenetic relationships of plant species. PMID:21779172

Tani, Akio; Saisho, Daisuke; Sakamoto, Wataru; Kanematsu, Satoko; Suzuki, Nobuhiro

2011-01-01

398

Widespread endogenization of genome sequences of non-retroviral RNA viruses into plant genomes.  

PubMed

Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The most widespread NRVSs were related to the coat protein (CP) genes of the family Partitiviridae, which have bisegmented dsRNA genomes, and included plant- and fungus-infecting members. The CP of a novel fungal virus (Rosellinia necatrix partitivirus 2, RnPV2) had the greatest sequence similarity to Arabidopsis thaliana ILR2, which is thought to regulate the activities of the phytohormone auxin, indole-3-acetic acid (IAA). Furthermore, partitivirus CP-like sequences much more closely related to plant partitiviruses than to RnPV2 were identified in a wide range of plant species. In addition, the nucleocapsid protein genes of cytorhabdoviruses and varicosaviruses were found in species of over 9 plant families, including Brassicaceae and Solanaceae. A replicase-like sequence of a betaflexivirus was identified in the cucumber genome. The pattern of occurrence of NRVSs and the phylogenetic analyses of NRVSs and related viruses indicate that multiple independent integrations into many plant lineages may have occurred. For example, one of the NRVSs was retained in Ar. thaliana but not in Ar. lyrata or other related Camelina species, whereas another NRVS displayed the reverse pattern. Our study has shown that single- and double-stranded RNA viral sequences are widespread in plant genomes, and shows the potential of genome-integrated NRVSs to help resolve unclear phylogenetic relationships of plant species. PMID:21779172

Chiba, Sotaro; Kondo, Hideki; Tani, Akio; Saisho, Daisuke; Sakamoto, Wataru; Kanematsu, Satoko; Suzuki, Nobuhiro

2011-07-01

399

Staphylococcus aureus subsp. anaerobius strain ST1464 genome sequence  

PubMed Central

Staphylococcus aureus subsp. anaerobius is responsible for Morel's disease in animals and a cause of abscess in humans. It is characterized by microaerophilic growth, unlike the other strains of S. aureus. The 2,604,446-bp genome (32.7% GC content) of S. anaerobius ST1464 comprises one chromosome and no plasmids. The chromosome contains 2,660 open reading frames (ORFs), 49 tRNAs and three complete rRNAs, forming one complete operon. The size of ORFs ranges from 100 to 4,600 bp except for two ORFs of 6,417 and 7,173 bp encoding segregation ATPase and non-ribosomal peptide synthase, respectively. The chromosome harbors the Staphylococcus phage 2638A genome and an incomplete Staphylococcus phage PT1028 genome, but no detectable CRISPRs. The antibiotic resistance gene for tetracycline was found although Staphylococcus aureus subsp. anaerobius is susceptible to tetracycline in vitro. Intact oxygen detoxification genes encode superoxide dismutase and cytochrome quinol oxidase whereas the catalase gene is impaired by a stop codon. Based on the genome, in silico multilocus sequence typing indicates that S. aureus subsp. anaerobius emerged as a clone separated from all other S. aureus strains, illustrating host-adaptation linked to missing functions. Availability of the S. aureus subsp. anaerobius genome could prompt the development of post-genomic tools for its rapid discrimination from S. aureus. PMID:24501641

Elbir, Haitham; Robert, Catherine; Nguyen, Ti Thien; Gimenez, Grégory; El Sanousi, Sulieman M.; Flock, Jan-Ingmar; Raoult, Didier

2013-01-01

400

A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining  

Technology Transfer Automated Retrieval System (TEKTRAN)

The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...

401

Genome Sequence and Comparative Genome Analysis of Lactobacillus casei: Insights into Their Niche-Associated Evolution  

PubMed Central

Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

Cai, Hui; Thompson, Rebecca; Budinich, Mateo F.; Broadbent, Jeff R.

2009-01-01

402

Genome sequence and comparative genome analysis of Lactobacillus casei: insights into their niche-associated evolution.  

PubMed

Lactobacillus casei is remarkably adaptable to diverse habitats and widely used in the food industry. To reveal the genomic features that contribute to its broad ecological adaptability and examine the evolution of the species, the genome sequence of L. casei ATCC 334 is analyzed and compared with other sequenced lactobacilli. This analysis reveals that ATCC 334 contains a high number of coding sequences involved in carbohydrate utilization and transcriptional regulation, reflecting its requirement for dealing with diverse environmental conditions. A comparison of the genome sequences of ATCC 334 to L. casei BL23 reveals 12 and 19 genomic islands, respectively. For a broader assessment of the genetic variability within L. casei, gene content of 21 L. casei strains isolated from various habitats (cheeses, n = 7; plant materials, n = 8; and human sources, n = 6) was examined by comparative genome hybridization with an ATCC 334-based microarray. This analysis resulted in identification of 25 hypervariable regions. One of these regions contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation and was thus proposed as a lifestyle adaptation island. Differences in L. casei genome inventory reveal both gene gain and gene decay. Gene gain, via acquisition of genomic islands, likely confers a fitness benefit in specific habitats. Gene decay, that is, loss of unnecessary ancestral traits, is observed in the cheese isolates and likely results in enhanced fitness in the dairy niche. This study gives the first picture of the stable versus variable regions in L. casei and provides valuable insights into evolution, lifestyle adaptation, and metabolic diversity of L. casei. PMID:20333194

Cai, Hui; Thompson, Rebecca; Budinich, Mateo F; Broadbent, Jeff R; Steele, James L

2009-01-01

403

Conserved non-genic sequences — an unexpected feature of mammalian genomes  

Microsoft Academic Search

Mammalian genomes contain highly conserved sequences that are not functionally transcribed. These sequences are single copy and comprise approximately 1–2% of the human genome. Evolutionary analysis strongly supports their functional conservation, although their potentially diverse, functional attributes remain unknown. It is likely that genomic variation in conserved non-genic sequences is associated with phenotypic variability and human disorders. So how might

Alexandre Reymond; Emmanouil T. Dermitzakis; Stylianos E. Antonarakis

2005-01-01

404

Genome sequence of the Brown Norway rat yields insights into mammalian evolution  

Microsoft Academic Search

The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered,

Richard A. Gibbs; George M. Weinstock; Michael L. Metzker; Donna M. Muzny; Erica J. Sodergren; Steven Scherer; Graham Scott; David Steffen; Kim C. Worley; Paula E. Burch; Geoffrey Okwuonu; Sandra Hines; Lora Lewis; Christine DeRamo; Oliver Delgado; Shannon Dugan-Rocha; George Miner; Margaret Morgan; Alicia Hawes; Rachel Gill; Robert A. Holt; Mark D. Adams; Peter G. Amanatides; Holly Baden-Tillson; Mary Barnstead; Soo Chin; Cheryl A. Evans; Steve Ferriera; Carl Fosler; Anna Glodek; Zhiping Gu; Don Jennings; Cheryl L. Kraft; Trixie Nguyen; Cynthia M. Pfannkoch; Cynthia Sitter; Granger G. Sutton; J. Craig Venter; Trevor Woodage; Douglas Smith; Hong-Mei Lee; Erik Gustafson; Patrick Cahill; Arnold Kana; Lynn Doucette-Stamm; Keith Weinstock; Kim Fechtel; Robert B. Weiss; Diane M. Dunn; Eric D. Green; Robert W. Blakesley; Gerard G. Bouffard; Pieter J. de Jong; Kazutoyo Osoegawa; Baoli Zhu; Marco Marra; Jacqueline Schein; Ian Bosdet; Chris Fjell; Steven Jones; Martin Krzywinski; Carrie Mathewson; Asim Siddiqui; Natasja Wye; John McPherson; Shaying Zhao; Claire M. Fraser; Jyoti Shetty; Sofiya Shatsman; Keita Geer; Yixin Chen; Sofyia Abramzon; William C. Nierman; Richard A. Gibbs; Paul H. Havlak; Rui Chen; K. James Durbin; Amy Egan; Yanru Ren; Xing-Zhi Song; Bingshan Li; Yue Liu; Xiang Qin; Simon Cawley; A. J. Cooney; Lisa M. D'Souza; Kirt Martin; Jia Qian Wu; Manuel L. Gonzalez-Garay; Andrew R. Jackson; Kenneth J. Kalafus; Michael P. McLeod; Aleksandar Milosavljevic; Davinder Virk; Andrei Volkov; David A. Wheeler; Zhengdong Zhang; Jeffrey A. Bailey; Evan E. Eichler; Ewan Birney; Emmanuel Mongin; Abel Ureta-Vidal; Cara Woodwark; Evgeny Zdobnov; Peer Bork; Mikita Suyama; David Torrents; Marina Alexandersson; Barbara J. Trask; Janet M. Young; Hui Huang; Huajun Wang; Heming Xing; Sue Daniels; Darryl Gietzen; Jeanette Schmidt; Kristian Stevens; Ursula Vitt; Jim Wingrove; Francisco Camara; M. Mar Albà; Josep F. Abril; Roderic Guigo; Arian Smit; Inna Dubchak; Edward M. 
Rubin; Olivier Couronne; Alexander Poliakov; Norbert Hübner; Detlev Ganten; Claudia Goesele; Oliver Hummel; Thomas Kreitler; Young-Ae Lee; Jan Monti; Herbert Schulz; Heike Zimdahl; Heinz Himmelbauer; Hans Lehrach; Howard J. Jacob; Susan Bromberg; Jo Gullings-Handley; Michael I. Jensen-Seaman; Anne E. Kwitek; Jozef Lazar; Dean Pasko; Peter J. Tonellato; Simon Twigger; Chris P. Ponting; Jose M. Duarte; Stephen Rice; Leo Goodstadt; Scott A. Beatson; Richard D. Emes; Eitan E. Winter; Caleb Webber; Petra Brandt; Gerald Nyakatura; Margaret Adetobi; Laura Elnitski; Pallavi Eswara; Ross C. Hardison; Minmei Hou; Diana Kolbe; Kateryna Makova; Webb Miller; Anton Nekrutenko; Cathy Riemer; Scott Schwartz; James Taylor; Shan Yang; Yi Zhang; Klaus Lindpaintner; T. Dan Andrews; Mario Caccamo; Michele Clamp; Laura Clarke; Valerie Curwen; Richard Durbin; Eduardo Eyras; Stephen M. Searle; Gregory M. Cooper; Serafim Batzoglou; Arend Sidow; Eric A. Stone; Bret A. Payseur; Guillaume Bourque; Carlos López-Otín; Xose S. Puente; Kushal Chakrabarti; Sourav Chatterji; Lior Pachter; Nicolas Bray; Von Bing Yap; Anat Caspi; Glenn Tesler; Pavel A. Pevzner; David Haussler; Krishna M. Roskin; Robert Baertsch; Hiram Clawson; Terrence S. Furey; Angie S. Hinrichs; Donna Karolchik; William J. Kent; Kate R. Rosenbloom; Heather Trumbower; Matt Weirauch; David N. Cooper; Peter D. Stenson; Bin Ma; Michael Brent; Manimozhiyan Arumugam; David Shteynberg; Richard R. Copley; Martin S. Taylor; Harold Riethman; Uma Mudunuri; Jane Peterson; Mark Guyer; Adam Felsenfeld; Susan Old; Stephen Mockrin; Francis Collins

2004-01-01

405

Edinburgh Research Explorer Draft Genome Sequence of a Streptococcus agalactiae Strain  

E-print Network

Edinburgh Research Explorer Draft Genome Sequence of a Streptococcus agalactiae Strain Isolated, O'Driscoll, A, Templeton, K, Ghazal, P & Sleator, RD 2014, 'Draft Genome Sequence of a Streptococcus date: 11. Dec. 2014 Draft Genome Sequence of a Streptococcus agalactiae Strain Isolated from

Millar, Andrew J.

406

Draft Genome Sequences of Streptococcus bovis Strains ATCC 33317 and JB1  

PubMed Central

We report the draft genome sequences of Streptococcus bovis strain ATCC 33317 (CVM42251) isolated from cow dung and strain JB1 (CVM42252) isolated from a cow rumen in 1977. The strains were sequenced using the Genome Sequencer FLX 454 system. The genome sizes are approximately 2 Mb and 2.2 Mb, respectively. PMID:25301652

Benahmed, Faiza H.; Gopinath, Gopal R.; Harbottle, Heather; Cotta, Michael A.; Luo, Yan; Henderson, Carol; Teri, Plona; Soppet, Daniel; Rasmussen, Mark; Davidson, Maureen

2014-01-01

407

Complete Genome Sequence of Pelosinus sp. Strain UFO1 Assembled Using Single-Molecule Real-Time DNA Sequencing Technology  

PubMed Central

Pelosinus species can reduce metals such as Fe(III), U(VI), and Cr(VI) and have been isolated from diverse geographical regions. Five draft genome sequences have been published. We report the complete genome sequence for Pelosinus sp. strain UFO1 using only PacBio DNA sequence data and without manual finishing. PMID:25189589

Brown, Steven D.; Utturkar, Sagar M.; Magnuson, Timothy S.; Ray, Allison E.; Poole, Farris L.; Lancaster, W. Andrew; Thorgersen, Michael P.; Adams, Michael W. W.

2014-01-01

408

Repetitive sequences, genomic instability and Barrett's esophageal adenocarcinoma  

PubMed Central

Barrett's esophageal adenocarcinoma (BAC) is a cancer associated with heartburn. If gastroesophageal reflux is not treated, exposure to acid over the years leads to a premalignant condition known as Barrett's esophagus (BE), which then progresses through low-grade and high-grade dysplasias to Barrett's adenocarcinoma. Genomic instability, which seems to arise early at the BE stage, leads to accrual of mutational changes that underlie the succession of histological and physiological changes associated with this disease. Genomic instability is therefore an important target for prevention and treatment of cancer, and it is important to elucidate the mechanisms associated with this problem. We have shown that elevated/deregulated homologous recombination mediates genomic instability in cancer. Recently we also demonstrated that the mutational rates of individual chromosomes in BAC cells correlate with their ALU frequency. The aims of this article are to briefly discuss different types of repetitive sequences and highlight their importance in the physiology of normal and cancer cells, especially BAC. PMID:22479688

2011-01-01

409

Complete genome sequence of “Enterobacter lignolyticus” SCF1  

PubMed Central

In an effort to discover anaerobic bacteria capable of lignin degradation, we isolated “Enterobacter lignolyticus” SCF1 on minimal media with alkali lignin as the sole source of carbon. This organism was isolated anaerobically from tropical forest soils collected from the Short Cloud Forest site in the El Yunque National Forest in Puerto Rico, USA, part of the Luquillo Long-Term Ecological Research Station. At this site, the soils experience strong fluctuations in redox potential and are net methane producers. Because of its ability to grow on lignin anaerobically, we sequenced the genome. The genome of “E. lignolyticus” SCF1 is 4.81 Mbp with no detected plasmids, and includes a relatively small arsenal of lignocellulolytic carbohydrate active enzymes. Lignin degradation was observed in culture, and the genome revealed two putative laccases, a putative peroxidase, and a complete 4-hydroxyphenylacetate degradation pathway encoded in a single gene cluster. PMID:22180812

D’Haeseleer, Patrik; Chivian, Dylan; Fortney, Julian L.; Khudyakov, Jane; Simmons, Blake; Woo, Hannah; Arkin, Adam P.; Davenport, Karen Walston; Goodwin, Lynne; Chen, Amy; Ivanova, Natalia; Kyrpides, Nikos C.; Mavromatis, Konstantinos; Woyke, Tanja; Hazen, Terry C.

2011-01-01

410

Complete genome sequence of an attenuated Sparfloxacin-resistant Streptococcus agalactiae strain 138spar  

Technology Transfer Automated Retrieval System (TEKTRAN)

The complete genome of a sparfloxacin-resistant Streptococcus agalactiae vaccine strain 138spar is 1,838,126 bp in size. The genome has 1892 coding sequences and 82 RNAs. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipeline. The publishing of this genome will allo...

411

Hellbender Genome Sequences Shed Light on Genomic Expansion at the Base of Crown Salamanders  

PubMed Central

Among animals, genome sizes range from 20 Mb to 130 Gb, with 380-fold variation across vertebrates. Most of the largest vertebrate genomes are found in salamanders, an amphibian clade of 660 species. Thus, salamanders are an important system for studying causes and consequences of genomic gigantism. Previously, we showed that plethodontid salamander genomes accumulate higher levels of long terminal repeat (LTR) retrotransposons than do other vertebrates, although the evolutionary origins of such sequences remained unexplored. We also showed that some salamanders in the family Plethodontidae have relatively slow rates of DNA loss through small insertions and deletions. Here, we present new data from Cryptobranchus alleganiensis, the hellbender. Cryptobranchus and Plethodontidae span the basal phylogenetic split within salamanders; thus, analyses incorporating these taxa can shed light on the genome of the ancestral crown salamander lineage, which underwent expansion. We show that high levels of LTR retrotransposons likely characterize all crown salamanders, suggesting that disproportionate expansion of this transposable element (TE) class contributed to genomic expansion. Phylogenetic and age distribution analyses of salamander LTR retrotransposons indicate that salamanders’ high TE levels reflect persistence and diversification of ancestral TEs rather than horizontal transfer events. Finally, we show that relatively slow DNA loss rates through small indels likely characterize all crown salamanders, suggesting that a decreased DNA loss rate contributed to genomic expansion at the clade’s base. Our identification of shared genomic features across phylogenetically distant salamanders is a first step toward identifying the evolutionary processes underlying accumulation and persistence of high levels of repetitive sequence in salamander genomes. PMID:25115007

Sun, Cheng; Mueller, Rachel Lockridge

2014-01-01

412

Streamlined Genome Sequence Compression using Distributed Source Coding  

PubMed Central

We aim to develop a streamlined genome sequence compression algorithm to support alternative miniaturized sequencing devices, which have limited communication, storage, and computation power. Existing techniques that require a heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Based on the variation between source and reference, our protocol adaptively picks either syndrome coding or hash coding to compress subsequences of changing code length. Our experimental results showed promising performance of the proposed method when compared with the state-of-the-art algorithm (GRS). PMID:25520552

Wang, Shuang; Jiang, Xiaoqian; Chen, Feng; Cui, Lijuan; Cheng, Samuel

2014-01-01

413

Permanent draft genome sequence of Geobacillus thermocatenulatus strain GS-1.  

PubMed

Geobacillus thermocatenulatus strain GS-1 is a thermophilic, alkane-degrading bacillus with a growth optimum at 60°C. It was isolated from the formation water of a high-temperature deep oil reservoir in Qinghai oilfield, China. Here, we report the draft genome sequence with an estimated assembly size of 3.5 Mb. A total of 3371 protein-coding sequences were detected in the genome, including those for monooxygenase, alcohol dehydrogenase, aldehyde dehydrogenase, fatty acid-CoA ligase, acyl-CoA dehydrogenase, enoyl-CoA hydrogenase, hydroxyacyl-CoA dehydrogenase and thiolase, which are involved in the alkane degradation pathway. Our results may provide insights into the genetic basis of the adaptation of this strain to high-temperature oilfield ecosystems. PMID:25280889

Zheng, Beiwen; Zhang, Fan; Chai, Lujun; Yu, Gaoming; Shu, Fuchang; Wang, Zhengliang; Su, Sanbao; Xiang, Tingsheng; Zhang, Zhongzhi; Hou, DuJie; She, Yuehui

2014-10-01

414

Construction of an integrated database to support genomic sequence analysis  

SciTech Connect

The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

Gilbert, W.; Overbeek, R.

1994-11-01

415

Permanent draft genome sequence of Comamonas testosteroni KF-1  

PubMed Central

Comamonas testosteroni KF-1 is a model organism for the elucidation of the novel biochemical degradation pathways for xenobiotic 4-sulfophenylcarboxylates (SPC) formed during biodegradation of synthetic 4-sulfophenylalkane surfactants (linear alkylbenzenesulfonates, LAS) by bacterial communities. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 6,026,527 bp long chromosome (one sequencing gap) exhibits an average G+C content of 61.79% and is predicted to encode 5,492 protein-coding genes and 114 RNA genes. PMID:23991256

Weiss, Michael; Kesberg, Anna I.; LaButti, Kurt M.; Pitluck, Sam; Bruce, David; Hauser, Loren; Copeland, Alex; Woyke, Tanja; Lowry, Stephen; Lucas, Susan; Land, Miriam; Goodwin, Lynne; Kjelleberg, Staffan; Cook, Alasdair M.; Buhmann, Matthias; Thomas, Torsten; Schleheck, David

2013-01-01

416

Complete genome sequence of a porcine epidemic diarrhea virus variant.  

PubMed

In 2011, outbreaks of viral diarrhea were observed on most swine-breeding farms in most of the provinces of China. The disease is characterized by vomiting, severe diarrhea, and a high mortality rate (82.3%) in newborn piglets. The clinical appearance was similar to that of porcine epidemic diarrhea virus (PEDV) infection. PEDVs were detected in samples (feces or small intestines) from most farms. In order to investigate whether there is a PEDV variant circulating in China, we sequenced and analyzed the complete genome of the recently identified field strain, CH/FJND-3/2011. The sequence data indicate that this PEDV variant prevails in China. PMID:22354946

Chen, Jianfei; Liu, Xiaozhen; Shi, Da; Shi, Hongyan; Zhang, Xin; Feng, Li

2012-03-01

417

Comparison of three methods of parasitoid polydnavirus genomic DNA isolation to facilitate polydnavirus genomic sequencing.  

PubMed

A major long-term goal of polydnavirus (PDV) genome research is to identify novel virally encoded molecules that may serve as biopesticides to target insect pests that threaten agriculture and human health. As PDV viral replication in cell culture in vitro has not yet been achieved, several thousands of wasps must be dissected to yield enough viral DNA from the adult ovaries to carry out PDV genomic sequencing. This study compares three methods of PDV genomic DNA isolation for the PDV of Cotesia flavipes, which parasitizes the sugarcane borer, Diatraea saccharalis, preparatory to sequencing the C. flavipes bracovirus genome. Two of these protocols incorporate phenol-chloroform DNA extraction steps in the procedure and the third protocol uses a modified Qiagen DNA kit method to extract viral DNA. The latter method proved significantly less time-consuming and more cost-effective. Efforts are currently underway to bioengineer insect pathogenic viruses with PDV genes, so that their gene products will enhance baculovirus virulence for agricultural insect pests, either via suppression of the immune system of the host or by PDV-mediated induction of its developmental arrest. Sequencing a growing number of complete PDV genomes will enhance those efforts, which will be facilitated by the study reported here. PMID:18348210

Rodríguez-Pérez, Mario A; Beckage, Nancy E

2008-04-01

418

Complete genome sequence of Desulfomicrobium baculatum type strain (XT)  

SciTech Connect

Desulfomicrobium baculatum is the type species of the genus Desulfomicrobium, which is the type genus of the family Desulfomicrobiaceae. It is of phylogenetic interest because of the isolated location of the family Desulfomicrobiaceae within the order Desulfovibrionales. D. baculatum strain XT is a Gram-negative, motile, sulfate-reducing bacterium isolated from water-saturated manganese carbonate ore. It is strictly anaerobic and does not require NaCl for growth, although NaCl concentrations up to 6% (w/v) are tolerated. The metabolism is respiratory or fermentative. In the presence of sulfate, pyruvate and lactate are incompletely oxidized to acetate and CO2. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the deltaproteobacterial family Desulfomicrobiaceae, and this 3,942,657 bp long single replicon genome with its 3494 protein-coding and 72 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Copeland, Alex; Spring, Stefan; Goker, Markus; Schneider, Susanne; Lapidus, Alla; Glavina Del Rio, Tijana; Tice, Hope; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C; Meincke, Linda; Sims, David; Brettin, Thomas; Detter, John C; Han, Cliff; Chain, Patrick; Bristow, James; Eisen, Jonathan; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C; Lucas, Susan

2009-05-20

419

Structure and sequence of the saimiriine herpesvirus 1 genome.  

PubMed

We report here the complete genome sequence of the squirrel monkey α-herpesvirus saimiriine herpesvirus 1 (HVS1). Unlike the simplexviruses of other primate species, only the unique short region of the HVS1 genome is bounded by inverted repeats. While all Old World simian simplexviruses characterized to date lack the herpes simplex virus RL1 (γ34.5) gene, HVS1 has an RL1 gene. HVS1 lacks several genes that are present in other primate simplexviruses (US8.5, US10-12, UL43/43.5 and UL49A). Although the overall genome structure appears more like that of varicelloviruses, the encoded HVS1 proteins are most closely related to homologous proteins of the primate simplexviruses. Phylogenetic analyses confirm that HVS1 is a simplexvirus. Limited comparison of two HVS1 strains revealed a very low degree of sequence variation more typical of varicelloviruses. HVS1 is thus unique among the primate α-herpesviruses in that its genome has properties of both simplexviruses and varicelloviruses. PMID:21130483

Tyler, Shaun; Severini, Alberto; Black, Darla; Walker, Matthew; Eberle, R

2011-02-01

420

Structure and sequence of the saimiriine herpesvirus 1 genome  

PubMed Central

We report here the complete genome sequence of the squirrel monkey α-herpesvirus saimiriine herpesvirus 1 (HVS1). Unlike the simplexviruses of other primate species, only the unique short region of the HVS1 genome is bounded by inverted repeats. While all Old World simian simplexviruses characterized to date lack the herpes simplex virus RL1 (γ34.5) gene, HVS1 has an RL1 gene. HVS1 lacks several genes that are present in other primate simplexviruses (US8.5, US10–12, UL43/43.5 and UL49A). Although the overall genome structure appears more like that of varicelloviruses, the encoded HVS1 proteins are most closely related to homologous proteins of the primate simplexviruses. Phylogenetic analyses confirm that HVS1 is a simplexvirus. Limited comparison of two HVS1 strains revealed a very low degree of sequence variation more typical of varicelloviruses. HVS1 is thus unique among the primate α-herpesviruses in that its genome has properties of both simplexviruses and varicelloviruses. PMID:21130483

Tyler, Shaun; Severini, Alberto; Black, Darla; Walker, Matthew; Eberle, R.

2010-01-01

421

Complete Genome Sequence of Serratia marcescens WW4  

PubMed Central

Serratia marcescens WW4 is a biofilm-forming bacterium isolated from paper machine aggregates. Under conditions of phosphate limitation, this bacterium exhibits intergeneric inhibition of Pseudomonas aeruginosa. Here, the complete genome sequence of S. marcescens WW4, which consists of one circular chromosome (5,241,455 bp) and one plasmid (pSmWW4; 3,248 bp), was determined. PMID:23558532

Chung, Wan-Chia; Chen, Ling-Ling; Lo, Wen-Sui; Kuo, Pei-An; Tu, Jenn

2013-01-01

422

Complete Genome Sequence of Mycobacterium xenopi Type Strain RIVM700367  

PubMed Central

Mycobacterium xenopi is a slow-growing, thermophilic, water-related Mycobacterium species. Like other nontuberculous mycobacteria, M. xenopi more commonly infects humans with altered immune function, such as chronic obstructive pulmonary disease patients. It is considered clinically relevant in a significant proportion of the patients from whom it is isolated. We report here the whole genome sequence of M. xenopi type strain RIVM700367. PMID:22628510

Rashid, Mamoon; Adroub, Sabir A.; Elabdalaoui, Hafida; Ali, Shahjahan; van Soolingen, Dick; Bitter, Wilbert

2012-01-01

423

Complete Genome Sequence of Vibrio alginolyticus ATCC 17749T  

PubMed Central

Vibrio alginolyticus is a Gram-negative halophilic bacterium and has been recognized as an opportunistic pathogen in both humans and marine animals. It is the causative agent of food-borne diseases, such as gastroenteritis, and it invades through wounds in predisposed individuals. In this study, we present the completed genome of V. alginolyticus ATCC 17749T through high-throughput sequencing. PMID:25635021

Liu, Xiao-Fei; Cao, Yuan; Zhang, He-Lin; Chen, Ying-Jian

2015-01-01

424

Genome Sequence of Proteus mirabilis Clinical Isolate C05028.  

PubMed

Genomic DNA of Proteus mirabilis C05028 was sequenced on an Illumina HiSeq platform and assembled into 39 scaffolds with a total length of 3.8 Mb. Next, open reading frames (ORFs) were identified and annotated using the KEGG, COG, and NR databases. Finally, we found virulence factors present only in P. mirabilis C05028. PMID:24675851

Shi, Xiaolu; Zhu, Yuanfang; Li, Yinghui; Jiang, Min; Lin, Yiman; Qiu, Yaqun; Chen, Qiongcheng; Yuan, Yanting; Ni, Peixiang; Hu, Qinghua; Huang, Shenghe

2014-01-01

425

Global Genomic Diversity of Human Papillomavirus 6 Based on 724 Isolates and 190 Complete Genome Sequences  

PubMed Central

ABSTRACT Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. IMPORTANCE This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV6 variant lineages and five sublineages were identified and showed some degree of association with geographical location, anatomical site of infection/disease, and/or gender. We additionally identified several HPV6 lineage- and sublineage-specific SNPs to facilitate the identification of HPV6 variants and determined a representative region within the L2 gene that is suitable for HPV6 whole-genome-based phylogenetic analysis. This study complements and significantly expands the current knowledge of HPV6 genetic diversity and forms a comprehensive basis for future epidemiological, evolutionary, functional, pathogenicity, vaccination, and molecular assay development studies. PMID:24741079

Jelen, Mateja M.; Chen, Zigui; Kocjan, Boštjan J.; Burt, Felicity J.; Chan, Paul K. S.; Chouhy, Diego; Combrinck, Catharina E.; Coutlée, François; Estrade, Christine; Ferenczy, Alex; Fiander, Alison; Franco, Eduardo L.; Garland, Suzanne M.; Giri, Adriana A.; González, Joaquín Víctor; Gröning, Arndt; Heidrich, Kerstin; Hibbitts, Sam; Hošnjak, Lea; Luk, Tommy N. M.; Marinic, Karina; Matsukura, Toshihiko; Neumann, Anna; Oštrbenk, Anja; Picconi, Maria Alejandra; Richardson, Harriet; Sagadin, Martin; Sahli, Roland; Seedat, Riaz Y.; Seme, Katja; Severini, Alberto; Sinchi, Jessica L.; Smahelova, Jana; Tabrizi, Sepehr N.; Tachezy, Ruth; Tohme, Sarah; Uloza, Virgilijus; Vitkauskiene, Astra; Wong, Yong Wee; Židovec Lepej, Snježana; Burk, Robert D.

2014-01-01

426

Sequencing of a New Target Genome: the Pediculus humanus humanus (Phthiraptera: Pediculidae) Genome Project  

Microsoft Academic Search

The human body louse, Pediculus humanus humanus (L.), and the human head louse, Pediculus humanus capitis, belong to the hemimetabolous order Phthiraptera. The body louse is the primary vector that transmits the bacterial agents of louse-borne relapsing fever, trench fever, and epidemic typhus. The genomes of the bacterial causative agents of several of these aforementioned diseases have been sequenced. Thus,

B. R. Pittendrigh; J. M. Clark; J. S. Johnston; S. H. Lee; J. Romero-Severson; G. A. Dasch

2006-01-01

427

Unraveling genomic variation from next generation sequencing data  

PubMed Central

Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by these technological advances are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to ease knowledge extraction are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field. PMID:23885890

2013-01-01

428

Complete DNA Sequence of the Rat Cytomegalovirus Genome  

PubMed Central

We have determined the complete genome sequence of the Maastricht strain of rat cytomegalovirus (RCMV). The RCMV genome has a length of 229,896 bp and is arranged as a single unique sequence flanked by 504-bp terminal direct repeats. RCMV was found to have counterparts of all but one of the open reading frames (ORFs) that are conserved between murine CMV (MCMV) and human CMV (HCMV). Like HCMV, RCMV lacks homologs of the genes belonging to the MCMV m02 glycoprotein gene family. However, RCMV contains 15 ORFs with homology to members of the MCMV m145 glycoprotein gene family. Four ORFs are predicted to encode homologs of host proteins; R33 and R78 both putatively encode G protein-coupled receptors, whereas r144 and r131 encode homologs of major histocompatibility class I heavy chains and CC chemokines, respectively. An intriguing feature of the RCMV genome is the presence of an ORF, r127, with similarity to the rep gene of parvoviruses as well as ORF U94 of human herpesvirus 6A (HHV-6A) and HHV-6B. Counterparts of these ORFs have not been found in the other sequenced herpesviruses. PMID:10906222

Vink, Cornelis; Beuken, Erik; Bruggeman, Cathrien A.

2000-01-01

429