These are representative sample records from Science.gov related to your search topic.
For comprehensive and current results, perform a real-time search at Science.gov.
1

Genome Sequencing Centers  

Cancer.gov

The Cancer Genome Atlas (TCGA) Genome Sequencing Centers (GSCs) perform large-scale DNA sequencing using the latest sequencing technologies. Supported by the National Human Genome Research Institute (NHGRI) large-scale sequencing program, the GSCs generate the enormous volume of data required by TCGA, while continually improving existing technologies and methods to expand the frontier of what can be achieved in cancer genome sequencing.

2

The Genome Sequencing Center at NCGR  

SciTech Connect

Faye Schilkey from the National Center for Genome Resources discusses NCGR's research, sequencing and analysis experience on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

Schilkey, Faye [National Center for Genome Resources]

2010-06-02

3

A large genome center's improvements to the Illumina sequencing system  

Microsoft Academic Search

The Wellcome Trust Sanger Institute is one of the world's largest genome centers, and a substantial amount of our sequencing is performed with 'next-generation' massively parallel sequencing technologies: in June 2008 the quantity of purity-filtered sequence data generated by our Genome Analyzer (Illumina) platforms reached 1 terabase, and our average weekly Illumina production output is currently 64 gigabases. Here we

Michael A Quail; Iwanka Kozarewa; Frances Smith; Aylwyn Scally; Philip J Stephens; Richard Durbin; Harold Swerdlow; Daniel J Turner

2008-01-01

4

Genome Science: A Video Tour of the Washington University Genome Sequencing Center for High School and Undergraduate Students  

ERIC Educational Resources Information Center

Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…

Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.

2005-01-01

5

Operational streamlining in a high-throughput genome sequencing center  

E-print Network

Advances in medicine rely on accurate data that is rapidly provided. It is therefore critical for the Genome Sequencing platform of the Broad Institute of MIT and Harvard to continually strive to reduce cost, improve ...

Person, Kerry P. (Kerry Patrick)

2006-01-01

6

Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing  

E-print Network

Nevada Genomics Center These are general instructions on how to use dnaTools to submit sequencing samples. We here at the Nevada Genomics Center feel that dnaTools is user friendly and fairly intuitive-784-1657) or email us (Genomics@unr.nevada.edu) and we will assist you. How to use dnaTools Table of Contents

Hemmers, Oliver

7

Integration of PacBio RS into Massive Parallel Sequencing and Data Analysis Pipelining at the UC Davis Genome Center  

PubMed Central

Whole genome sequencing and genomic biology have been widely adopted in many fields of biology as next-generation sequencing (NGS) technology has rapidly improved in quality, read length, and throughput, making whole genome sequencing and association studies possible in a very cost-effective manner. Continued improvement and development of sample preparation protocols and data analysis tools have been significant in helping to extend genome sequencing technology to genomes that were previously difficult to sequence. The recent arrival of the Pacific Biosciences RS (PacBio) has furthered this opportunity by providing options for single-molecule, long-read sequencing in real time and kinetic analysis (methylation). PacBio has been employed successfully for sequencing low-complexity genomic regions such as extremely high GC content, long repeats, rearrangements, gene fusions, etc. In this poster we present the optimization of PacBio sample preparation, fine-tuned to meet the unique challenges of sequencing through “difficult-to-sequence” templates. We discuss the integration of PacBio into a wet lab equipped with other NGS platforms and the data pipelining workflow, including cloud computing and robotic sample preparation, at the Genome Center. The UC Davis Genome Center currently operates NGS technology platforms including HiSeq, MiSeq, and PacBio, and has genotyping capacity using Illumina Infinium and GoldenGate technology. The UC Davis Genome Center and Bioinformatics Program provide up-to-date genome technology and informatics support tailored to specific biological goals, meeting the needs of more than 80 faculty members within the Genome Center and more than 200 campus and off-campus researchers.

Vanessa, Rashbrook; O'Geen, Henriette; Nguyen, Oanh; Ashtari, Siranoosh; Fan, Xiaohong; Kim, Ryan

2013-01-01

8

Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)  

SciTech Connect

John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

Crow, John [National Center for Genome Resources]

2012-06-01

9

Introducing National Center for Genome Resources (NCGR) Informatics (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)  

ScienceCinema

John Crow from the National Center for Genome Resources discusses his organization's informatics at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

Crow, John [National Center for Genome Resources]

2013-01-25

10

Whole Genome Sequencing  

MedlinePLUS

... Most of the information you get from a genomic test tells you about your risk for disease, ... A health forecast: Understanding disease risk from whole genomic sequencing Weather forecasting tries to predict what the ...

11

Sequencing technologies and genome sequencing  

Microsoft Academic Search

The high-throughput - next generation sequencing (HT-NGS) technologies are currently the hottest topic in the field of human and animals genomics researches, which can produce over 100 times more data compared to the most sophisticated capillary sequencers based on the Sanger method. With the ongoing developments of high throughput sequencing machines and advancement of modern bioinformatics tools at unprecedented pace,

Chandra Shekhar Pareek; Rafal Smoczynski; Andrzej Tretyn

12

Genome Characterization Centers  

Cancer.gov

Genomics is a fast-moving field with novel technologies and platforms that help characterize the genome being made available to the research community on a continual basis. The Cancer Genome Atlas (TCGA) Genome Characterization Centers (GCCs) are responsible for characterizing all of the genomic changes found in the tumors studied as part of the TCGA program.

13

Genome Data Analysis Centers  

Cancer.gov

The use of novel technologies, the need to integrate different data types and the immense quantity of data generated by The Cancer Genome Atlas (TCGA) Research Network has led to an expansion of the TCGA Research Network to include new centers devoted to data analysis. The Genome Data Analysis Centers (GDACs) work hand-in-hand with the Genome Characterization Centers (GCCs) to develop state-of-the-art tools that assist researchers with processing and integrating data analyses across the entire genome.

14

Human Genome Center  

NSDL National Science Digital Library

Human Genome Center At Lawrence Berkeley Lab (LBL), Berkeley, California: offering information about projects in Biology, Informatics and Instrumentation, photos of LBL robotic instruments, software, and online access to one LBL genomic database.

15

Porcine Genomic Sequencing Initiative  

Microsoft Academic Search

A. Specific biological rationales for the utility of the porcine sequence information Rationale and Objectives. Completion of the human genome sequence provides the starting point for understanding the genetic complexity of humans and how genetic variation contributes to diverse phenotypes and disease. It is clear that model organisms have played an invaluable role in the synthesis of this understanding. It

Gary Rohrer; Jonathan E. Beever; Max F. Rothschild; Lawrence Schook; Richard Gibbs; George Weinstock; W. Gregory

16

Operations capability improvement of a molecular biology laboratory in a high throughput genome sequencing center  

E-print Network

The Broad Institute is a research collaboration of MIT, Harvard University and affiliated hospitals, and the Whitehead Institute for Biomedical Research. Its scientific mission is to "(1) create tools for genomic medicine ...

Vokoun, Matthew R. (Matthew Richard)

2005-01-01

17

Prenatal Whole Genome Sequencing  

PubMed Central

With whole genome sequencing set to become the preferred method of prenatal screening, we need to pay more attention to the massive amount of information it will deliver to parents—and the fact that we don't yet understand what most of it means. PMID:22777977

Donley, Greer; Hull, Sara Chandros; Berkman, Benjamin E.

2014-01-01

18

Genome Sequence of \\  

Microsoft Academic Search

Members of the noncultured clade of Frankia enter into root nodule symbioses with actinorhizal species from the orders Cucurbitales and Rosales. We report the genome sequence of a member of this clade originally from Pakistan but obtained from root nodules of the American plant Datisca glomerata without isolation in culture.

Thomas Persson; David R Benson; Philippe Normand; Brian Vanden Heuvel; Petar Pujic; Olga Chertkov; Hazuki Teshima; David Bruce; J. Chris Detter; Roxanne Tapia; Cliff Han; James Han; Tanja Woyke; Sam Pitluck; Len Pennacchio; Matt Nolan; N Ivanova; Amrita Pati; Miriam L Land; Katharina Pawlowski; Alison M Berry

2011-01-01

19

Wheat and Barley Genome Sequencing  

Microsoft Academic Search

A high quality reference genome sequence is a prerequisite resource for accessing any gene, driving genomics-based approaches to systems biology, and for efficient exploitation of natural and induced genetic diversity of an organism. Wheat and barley possess genomes of a size that was long presumed to be not amenable for whole genome sequencing. So far, only limited genomic sequencing of

Kellye Eversole; Andreas Graner; Nils Stein

20

Towards Sequencing Cotton (Gossypium) Genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton genomes represent a...

21

Genome Sequence Databases (Overview): Sequencing and Assembly  

SciTech Connect

From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile, the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different lengths (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in the analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Although automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents a genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

Lapidus, Alla L.

2009-01-01
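
As a rough illustration of the error rates quoted in the abstract above, the following sketch estimates how many consensus errors to expect at the draft stage (about 1 per 2,000 bp) and at the finished stage (about 1 per 10,000 bp). The 5 Mb genome size and the function name are assumptions chosen for illustration; they do not come from the record.

```python
def expected_errors(genome_size_bp: int, bp_per_error: int) -> float:
    """Expected number of consensus errors at one error per `bp_per_error` bases."""
    return genome_size_bp / bp_per_error


# Hypothetical 5 Mb microbial genome (assumed for illustration only).
GENOME_SIZE_BP = 5_000_000

# Error rates as quoted in the abstract.
draft_errors = expected_errors(GENOME_SIZE_BP, 2_000)      # draft: ~1 error / 2,000 bp
finished_errors = expected_errors(GENOME_SIZE_BP, 10_000)  # finished: ~1 error / 10,000 bp

print(f"draft: {draft_errors:.0f} errors, finished: {finished_errors:.0f} errors")
```

Under these assumptions, finishing reduces the expected error count fivefold, which is simply the ratio of the two quoted rates.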

22

PERSPECTIVE Personal genome sequencing: current  

E-print Network

of this information for discovery and medicine is enormous. Fourteen genome sequences have been reported to date PERSPECTIVE Personal genome sequencing: current approaches and challenges Michael Snyder,1,5 Jiang Du,2 and Mark Gerstein2,3,4 1 Department of Genetics, Stanford University School of Medicine

Gerstein, Mark

23

Pig genome sequence - analysis and publication strategy  

Microsoft Academic Search

BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies of the BAC clone derived genome sequence have been annotated using the Pre-Ensembl and Ensembl automated pipelines and made accessible through

Alan L Archibald; Lars Bolund; Carol Churcher; Merete Fredholm; Martien AM Groenen; Barbara Harlizius; Kyung-Tai Lee; Denis Milan; Jane Rogers; Max F Rothschild; Hirohide Uenishi; Jun Wang; Lawrence B Schook

2010-01-01

24

The Center for integrative genomics  

E-print Network

The Center for integrative genomics Report 2005-2006 Presentation Director's message 4 Scientific advisory committee 6 Organigram of the CIG 7 research The structure and function of genomes and their evolution Alexandre Reymond - Genome structure and expression 10 Henrik Kaessmann - Evolutionary genomics 12

Kaessmann, Henrik

25

The complete sequence of a heterochromatic island from a higher eukaryote. The Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, and PE Biosystems Arabidopsis Sequencing Consortium.  

PubMed

Heterochromatin, constitutively condensed chromosomal material, is widespread among eukaryotes but incompletely characterized at the nucleotide level. We have sequenced and analyzed 2.1 megabases (Mb) of Arabidopsis thaliana chromosome 4 that includes 0.5-0.7 Mb of isolated heterochromatin that resembles the chromosomal knobs described by Barbara McClintock in maize. This isolated region has a low density of expressed genes, low levels of recombination and a low incidence of genetrap insertion. Satellite repeats were absent, but tandem arrays of long repeats and many transposons were found. Methylation of these sequences was dependent on chromatin remodeling. Clustered repeats were associated with condensed chromosomal domains elsewhere. The complete sequence of a heterochromatic island provides an opportunity to study sequence determinants of chromosome condensation. PMID:10676819

2000-02-01

26

Bacterial genome sequence bagged  

SciTech Connect

This is a summary of the research that produced the first complete genome of a free-living organism, the bacterium Haemophilus influenzae. Also included are practical information and the future possibilities of this type of research. The work was done partly under the auspices of the Human Genome Program.

Nowak, R.

1995-07-28

27

Genomic sequencing in clinical trials  

PubMed Central

Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to find its way into clinical trials both nationally and worldwide. We highlight the currently available types of genomic sequencing platforms, outline the advantages and disadvantages of each, and compare first- and next-generation techniques with respect to capabilities, quality, and cost. We describe the current geographical distributions and types of disease conditions in which these technologies are used, and how next-generation sequencing is strategically being incorporated into new and existing studies. Lastly, recent major breakthroughs and the ongoing challenges of using genomic sequencing in clinical research are discussed. PMID:22206293

2011-01-01

28

Using the Potato Genome Sequence! Robin Buell!  

E-print Network

Using the Potato Genome Sequence! Robin Buell! Michigan State University! Department of Plant Biology! August 15, 2010! buell@msu.edu! 1 Whole Genome Shotgun Sequencing 2 New genomics & post-genomic biology genomes genera 2002 2010 3 So, you say you can sequence-Now what

Douches, David S.

29

Integrating sequence, evolution and functional genomics in regulatory genomics  

PubMed Central

With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome. PMID:19226437

Vingron, Martin; Brazma, Alvis; Coulson, Richard; van Helden, Jacques; Manke, Thomas; Palin, Kimmo; Sand, Olivier; Ukkonen, Esko

2009-01-01

30

Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome of a plant,  

E-print Network

COMMENTARY Comparative Sequencing of Plant Genomes: Choices to Make The first sequenced genome Information Entrez Genome Projects website reports that sequencing of several more plant genomes is in prog in plant genomics research. Many of the obvious candidates for genome sequencing, model species

Purugganan, Michael D.

31

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF), Martinsried, near Munich, Germany, continues its longstanding tradition to develop and maintain high quality curated genome databases. In addition, efforts have been intensified to cover the wealth of complete genome sequences in a systematic, comprehensive form. Bioinformatics, supporting national as well as European sequencing and functional analysis projects, has resulted in several

Hans-werner Mewes; Dmitrij Frishman; Christian Gruber; Birgitta Geier; Dirk Haase; Andreas Kaps; Kai Lemcke; Gertrud Mannhaupt; Friedhelm Pfeiffer; Christine M. Schüller; S. Stocker; B. Weil

2000-01-01

32

Poultry Genome Sequences: Progress and Outstanding Challenges  

Technology Transfer Automated Retrieval System (TEKTRAN)

The first build of the chicken genome sequence appeared in March 2004 – the first genome sequence of any animal agriculture species. That sequence was done primarily by whole genome shotgun Sanger sequencing, along with the use of an extensive BAC contig-based physical map to assemble the sequence ...

33

The Center for integrative genomics  

E-print Network

The Center for integrative genomics Faculty of biology and medicine a new adventure The CIG building | 2 The Center for Integrative Genomics (CIG) is a new interdisciplinary research and training forms the latest department of the newly established UNIL Faculty of Biology and Medicine, and has

Fankhauser, Christian

34

Whole genome sequencing in pharmacogenomics  

PubMed Central

Pharmacogenomics aims to shed light on the role of genes and genomic variants in clinical treatment response. Although several drug–gene relationships have been characterized to date, many challenges remain for the application of pharmacogenomics in the clinic: clinical guidelines for pharmacogenomic testing are still in their infancy, whereas the emerging high-throughput genotyping technologies produce a tsunami of new findings. Herein, the potential of whole genome sequencing for pharmacogenomics research and clinical application is highlighted.

Katsila, Theodora

2015-01-01

35

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 2 of 2

Eichler, Evan [University of Washington]

2009-05-28

36

Sequencing Complex Genomic Regions  

SciTech Connect

Evan Eichler, Howard Hughes Medical Investigator at the University of Washington, gives the May 28, 2009 keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM. Part 1 of 2

Eichler, Evan [University of Washington]

2009-05-28

37

Pairwise Comparison Between Genomic Sequences and  

E-print Network

in comparative genomics and comparative optical-map study, respectively. A complete genome sequence Pairwise Comparison Between Genomic Sequences and Optical-maps by Bing Sun A dissertation submitted experimental technologies, massive amount of biological data including genomic sequences and optical-maps

Mohri, Mehryar

38

Sequencing the Sinorhizobium meliloti genome.  

PubMed

The Sinorhizobium meliloti genome consists of three replicons. This bacterium forms an intricate symbiotic relationship with the roots of certain legumes and is considered as an agriculturally important nitrogen-fixer. A consortium of 6 European laboratories was organized to sequence its single chromosome (3.7 Mb), whereas the other two elements (pSyma 1.4 Mb and pSymb 1.7 Mb) will be sequenced by other groups. PMID:11092731

Galibert, F; Barloy-Hubler, F; Capela, D; Gouzy, J

2000-01-01

39

SPECIAL FEATURES Genomic Sequence Databases  

E-print Network

SPECIAL FEATURES COMMENTARY Genomic Sequence Databases MICHAEL S. WATERMAN Departments at the molecular level are now being felt in other areas such as cell biology and medicine. The quantity a smaller number of people from chemistry, physics, medicine, the mathematical sciences, and other fields

Waterman, Michael S.

40

Genome Sequence of Salmonella Phage χ  

PubMed Central

Salmonella bacteriophage χ is a member of the Siphoviridae family that gains entry into its host cells by adsorbing to their flagella. We report the complete 59,578-bp sequence of the genome of phage χ, which together with its relatives, exemplifies a largely unexplored type of tailed bacteriophage. PMID:25720684

Ko, Ching-Chung; Jacobs-Sera, Deborah; Hatfull, Graham F.; Erhardt, Marc; Hughes, Kelly T.; Casjens, Sherwood R.

2015-01-01

41

Fuzzy Genome Sequence Assembly for Single and Environmental Genomes  

E-print Network

and to the first genome sequence assembly, Bacteriophage φX174 [38]. In 1990 the Human Genome Project in 2003, two years before its projected date. 2 Sara Nasser, et al In 1993 The Institute for Genome advancements in technology that lead the to complete sequencing of the Human Genome and the H. influenzae

Nicolescu, Monica

42

Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission  

PubMed Central

Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures. PMID:21304715

Giongo, Adriana; Tyler, Heather L.; Zipperer, Ursula N.; Triplett, Eric W.

2010-01-01

43

Almost finished: the complete genome sequence of Mycosphaerella graminicola  

Technology Transfer Automated Retrieval System (TEKTRAN)

Mycosphaerella graminicola causes septoria tritici blotch of wheat. An 8.9x shotgun sequence of bread wheat strain IPO323 was generated through the Community Sequencing Program of the U.S. Department of Energy’s Joint Genome Institute (JGI), and was finished at the Stanford Human Genome Center. The ...

44

Pash: Efficient Genome-Scale Sequence Anchoring by Positional Hashing  

E-print Network

Pash: Efficient Genome-Scale Sequence Anchoring by Positional Hashing Ken J. Kalafus,1,2 Andrew R and Molecular Biophysics, 2 Bioinformatics Research Laboratory, 3 Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, USA Pash is a computer

Batzoglou, Serafim

45

The Sequence of the Human Genome  

Microsoft Academic Search

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome

J. Craig Venter; Mark D. Adams; Eugene W. Myers; Peter W. Li; Richard J. Mural; Granger G. Sutton; Hamilton O. Smith; Mark Yandell; Cheryl A. Evans; Robert A. Holt; Jeannine D. Gocayne; Peter Amanatides; Richard M. Ballew; Daniel H. Huson; Jennifer R. Wortman; Qing Zhang; Chinnappa D. Kodira; Xiangqun H. Zheng; Lin Chen; Marian Skupski; Gangadharan Subramanian; Paul D. Thomas; Jinghui Zhang; George L. Gabor Miklos; Catherine Nelson; Samuel Broder; Andrew G. Clark; Joe Nadeau; Victor A. McKusick; Norton Zinder; Arnold J. Levine; Mel Simon; Carolyn Slayman; Michael Hunkapiller; Randall Bolanos; Arthur Delcher; Ian Dew; Daniel Fasulo; Michael Flanigan; Liliana Florea; Aaron Halpern; Sridhar Hannenhalli; Saul Kravitz; Samuel Levy; Clark Mobarry; Knut Reinert; Karin Remington; Jane Abu-Threideh; Ellen Beasley; Kendra Biddick; Vivien Bonazzi; Rhonda Brandon; Michele Cargill; Ishwar Chandramouliswaran; Rosane Charlab; Kabir Chaturvedi; Zuoming Deng; Valentina Di Francesco; Patrick Dunn; Karen Eilbeck; Carlos Evangelista; Andrei E. Gabrielian; Weiniu Gan; Wangmao Ge; Fangcheng Gong; Zhiping Gu; Ping Guan; Thomas J. Heiman; Maureen E. Higgins; Rui-Ru Ji; Zhaoxi Ke; Karen A. Ketchum; Zhongwu Lai; Yiding Lei; Zhenya Li; Jiayin Li; Yong Liang; Xiaoying Lin; Fu Lu; Gennady V. Merkulov; Natalia Milshina; Helen M. Moore; Ashwinikumar K Naik; Vaibhav A. Narayan; Beena Neelam; Deborah Nusskern; Douglas B. Rusch; Steven Salzberg; Wei Shao; Bixiong Shue; Jingtao Sun; Zhen Yuan Wang; Aihui Wang; Xin Wang; Jian Wang; Ming-Hui Wei; Ron Wides; Chunlin Xiao; Chunhua Yan; Alison Yao; Jane Ye; Ming Zhan; Weiqing Zhang; Hongyu Zhang; Qi Zhao; Liansheng Zheng; Fei Zhong; Wenyan Zhong; Shiaoping C. 
Zhu; Shaying Zhao; Dennis Gilbert; Suzanna Baumhueter; Gene Spier; Christine Carter; Anibal Cravchik; Trevor Woodage; Feroze Ali; Huijin An; Aderonke Awe; Danita Baldwin; Holly Baden; Mary Barnstead; Ian Barrow; Karen Beeson; Dana Busam; Amy Carver; Ming Lai Cheng; Liz Curry; Steve Danaher; Lionel Davenport; Raymond Desilets; Susanne Dietz; Kristina Dodson; Lisa Doup; Steven Ferriera; Neha Garg; Andres Gluecksmann; Brit Hart; Jason Haynes; Charles Haynes; Cheryl Heiner; Suzanne Hladun; Damon Hostin; Jarrett Houck; Timothy Howland; Chinyere Ibegwam; Jeffery Johnson; Francis Kalush; Lesley Kline; Shashi Koduru; Amy Love; Felecia Mann; David May; Steven McCawley; Tina McIntosh; Ivy McMullen; Mee Moy; Linda Moy; Brian Murphy; Keith Nelson; Cynthia Pfannkoch; Eric Pratts; Vinita Puri; Hina Qureshi; Matthew Reardon; Robert Rodriguez; Yu-Hui Rogers; Deanna Romblad; Bob Ruhfel; Richard Scott; Cynthia Sitter; Michelle Smallwood; Erin Stewart; Renee Strong; Ellen Suh; Reginald Thomas; Ni Ni Tint; Sukyee Tse; Claire Vech; Gary Wang; Jeremy Wetter; Sherita Williams; Monica Williams; Sandra Windsor; Emily Winn-Deen; Keriellen Wolfe; Jayshree Zaveri; Karena Zaveri; Josep F. Abril; Roderic Guigo; Michael J. Campbell; Kimmen V. 
Sjolander; Brian Karlak; Anish Kejariwal; Huaiyu Mi; Betty Lazareva; Thomas Hatton; Apurva Narechania; Karen Diemer; Anushya Muruganujan; Nan Guo; Shinji Sato; Vineet Bafna; Sorin Istrail; Ross Lippert; Russell Schwartz; Brian Walenz; Shibu Yooseph; David Allen; Anand Basu; James Baxendale; Louis Blick; Marcelo Caminha; John Carnes-Stine; Parris Caulk; Yen-Hui Chiang; Carl Dahlke; Anne Deslattes Mays; Maria Dombroski; Michael Donnelly; Dale Ely; Shiva Esparham; Carl Fosler; Harold Gire; Stephen Glanowski; Kenneth Glasser; Anna Glodek; Mark Gorokhov; Ken Graham; Barry Gropman; Michael Harris; Jeremy Heil; Scott Henderson; Jeffrey Hoover; Donald Jennings; John Kasha; Leonid Kagan; Cheryl Kraft; Alexander Levitsky; Mark Lewis; Xiangjun Liu; John Lopez; Daniel Ma; William Majoros; Joe McDaniel; Sean Murphy; Matthew Newman; Trung Nguyen; Ngoc Nguyen; Marc Nodell; Sue Pan; Jim Peck; Marshall Peterson; William Rowe; Robert Sanders; John Scott; Michael Simpson; Thomas Smith; Arlan Sprague; Timothy Stockwell; Russell Turner; Eli Venter; Mei Wang; Meiyuan Wen; David Wu; Mitchell Wu; Ashley Xia; Ali Zandieh; Xiaohong Zhu

2001-01-01
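
The figures in the abstract above can be related through the usual definition of sequencing depth (total sequenced bases divided by genome length). The sketch below is a minimal illustration using only the numbers quoted in the abstract; the function and constant names are assumptions introduced here, and the choice of the 2.91-billion bp consensus as the denominator is also an assumption, since the truncated abstract does not state which genome size underlies its 5.11-fold figure.

```python
# Values quoted in the abstract.
TOTAL_SEQUENCED_BP = 14.8e9   # total DNA sequence generated
NUM_READS = 27_271_853        # high-quality sequence reads
CONSENSUS_BP = 2.91e9         # euchromatic consensus length (assumed denominator)


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Average number of times each genome base is covered by reads."""
    return total_bp / genome_bp


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average read length implied by the total sequence and read count."""
    return total_bp / num_reads


coverage = fold_coverage(TOTAL_SEQUENCED_BP, CONSENSUS_BP)
read_length = mean_read_length(TOTAL_SEQUENCED_BP, NUM_READS)

print(f"coverage: {coverage:.2f}-fold, mean read length: {read_length:.0f} bp")
```

With this denominator the result is roughly 5.1-fold, close to the quoted 5.11-fold; the small difference depends on the genome size used, which the record does not specify.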

46

Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project  

E-print Network

Update on the Maize Genome Sequencing Project The Maize Genome Sequencing Project Vicki L. Chandler Genome Sequencing Project. The momentum for this endeavor has been building within the maize (Zea mays and human genomes (Gregory et al., 2002). Our current picture of the maize genome is largely derived from

Brendel, Volker

47

Sequence and analysis of the Arabidopsis genome  

Microsoft Academic Search

The comprehensive analysis of the genome sequence of the plant Arabidopsis thaliana has been completed recently. The genome sequence and associated analyses provide the foundations for rapid progress in many fields of plant research, such as the exploitation of genetic variation in Arabidopsis ecotypes, the assessment of the transcriptome and proteome, and the association of genome changes at the sequence

Michael Bevan; Klaus Mayer; Owen White; Jonathan A Eisen; Daphne Preuss; Thomas Bureau; Steven L Salzberg; Hans-Werner Mewes

2001-01-01

48

Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project  

Microsoft Academic Search

Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity

Mark D. Adams; Jenny M. Kelley; Jeannine D. Gocayne; Mark Dubnick; Mihael H. Polymeropoulos; Hong Xiao; Carl R. Merril; Andrew Wu; Bjorn Olde; Ruben F. Moreno; Anthony R. Kerlavage; W. Richard McCombie; J. Craig Venter

1991-01-01

49

Human Whole-Genome Shotgun Sequencing  

Microsoft Academic Search

Large-scale sequencing of the human genome is now under way (Boguski et al. 1996; Marshall and Pennisi 1996). Although at the beginning of the Genome Project, many doubted the scientific value of sequencing the entire human genome, these doubts have evaporated almost entirely (Gibbs 1995; Olson 1995). Primary reasons for generating the human genomic sequence are listed in Table 1. The

James L. Weber; Eugene W. Myers

1997-01-01

50

Genome sequencing and functional genomics approaches in tomato  

Microsoft Academic Search

Tomato genome sequencing has been taking place through an international, 10-year initiative entitled the “International Solanaceae Genome Project” (SOL). The strategy proposed by the SOL consortium is to sequence the approximately 220 Mb of euchromatin that contains the majority of genes, rather than the entire tomato genome. Tomato and other Solanaceae plants have unique developmental aspects, such as the formation of

Daisuke Shibata

2005-01-01

51

MIPS: a database for genomes and protein sequences  

Microsoft Academic Search

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein

Hans-werner Mewes; Dmitrij Frishman; Ulrich Güldener; Gertrud Mannhaupt; Klaus F. X. Mayer; Martin Mokrejs; Burkhard Morgenstern; Martin Münsterkötter; Stephen Rudd; B. Weil

2002-01-01

52

Sequencing Intractable DNA to Close Microbial Genomes  

SciTech Connect

Advancement in high-throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC-rich secondary structures, making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

Hurt, Jr., Richard Ashley [ORNL]; Brown, Steven D [ORNL]; Podar, Mircea [ORNL]; Palumbo, Anthony Vito [ORNL]; Elias, Dwayne A [ORNL]

2012-01-01

53

The Human Genome Project: Sequencing the Future  

E-print Network

The Human Genome Project: Sequencing the Future In 1986, the U.S. Department of Energy (DOE and unilateral step by announcing its Human Genome Initiative--forerunner of the Human Genome Project critical areas, including those important to DOE missions. The Human Genome Project and DOE's complementary

54

Value of a newly sequenced bacterial genome  

PubMed Central

Next-generation sequencing (NGS) technologies have made high-throughput sequencing available to medium- and small-size laboratories, culminating in a tidal wave of genomic information. The quantity of sequenced bacterial genomes has not only brought excitement to the field of genomics but also heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting in an exponential increase in draft (partial data) genome deposits in public databases. If no further interests are expressed for a particular bacterial genome, it is more likely that the sequencing of its genome will be limited to a draft stage, and the painstaking tasks of completing the sequencing of its genome and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the “scientific value” of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses the factors that could be leading to the increase in the number of draft deposits and the consequent loss of relevant biological information. PMID:24921006

Barbosa, Eudes GV; Aburjaile, Flavia F; Ramos, Rommel TJ; Carneiro, Adriana R; Le Loir, Yves; Baumbach, Jan; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2014-01-01

55

Expressed sequence tags: alternative or complement to whole genome sequences?  

Microsoft Academic Search

Over three million sequences from approximately 200 plant species have been deposited in the publicly available plant expressed sequence tag (EST) sequence databases. Many of the ESTs have been sequenced as an alternative to complete genome sequencing or as a substrate for cDNA array-based expression analyses. This creates a formidable resource from both biodiversity and gene-discovery standpoints. Bioinformatics-based sequence analysis

Stephen Rudd

2003-01-01

56

Center for Eukaryotic Structural Genomics  

NSDL National Science Digital Library

A collaboration between the Department of Biochemistry at the University of Wisconsin-Madison, the Medical College of Wisconsin, Molecular Kinetics, Inc., and Hebrew University, the Center for Eukaryotic Structural Genomics (CESG) intends to "develop critical technologies for determining three-dimensional structures of proteins rapidly and economically." The site gives an overview of CESG, including the goals and mission of the center, biographies of people involved, and the methodology and results of the program. The results section is the most substantial part of the site, giving information on how target proteins were selected, protocols and technology used, publications based on CESG research, and more.

57

Atypical regions in large genomic DNA sequences  

SciTech Connect

Large genomic DNA sequences contain regions with distinctive patterns of sequence organization. The authors describe a method using logarithms of probabilities based on seventh-order Markov chains to rapidly identify genomic sequences that do not resemble models of genome organization built from compilations of octanucleotide usage. Data bases have been constructed from Escherichia coli and Saccharomyces cerevisiae DNA sequences of >1000 nt and human sequences of >10,000 nt. Atypical genes and clusters of genes have been located in bacteriophage, yeast, and primate DNA sequences. The authors consider criteria for statistical significance of the results, offer possible explanations for the observed variation in genome organization, and give additional applications of these methods in DNA sequence analysis.

Scherer, S. [Lawrence Berkeley Lab., CA (United States); Univ. of Minnesota, Minneapolis, MN (United States)]; McPeek, M.S.; Speed, T.P. [Univ. of California, Berkeley, CA (United States)]

1994-07-19
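
The scoring idea in the record above (log-probabilities under a Markov chain model of sequence composition) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the add-one smoothing, the sliding-window averaging, and all function names are assumptions. The abstract uses seventh-order chains built from octanucleotide usage; a low order is used here only so that a toy training set is sufficient.

```python
import math
from collections import Counter

def train_markov(seqs, order):
    """Count (order+1)-mers and their leading `order`-mer contexts."""
    kmer_counts, ctx_counts = Counter(), Counter()
    for s in seqs:
        for i in range(len(s) - order):
            ctx_counts[s[i:i + order]] += 1
            kmer_counts[s[i:i + order + 1]] += 1
    return kmer_counts, ctx_counts

def log_prob(seq, model, order):
    """Sum of log P(base | previous `order` bases).

    Add-one smoothing over the 4-letter DNA alphabet is an assumption
    made here so that unseen k-mers do not give log(0).
    """
    kmer_counts, ctx_counts = model
    total = 0.0
    for i in range(len(seq) - order):
        ctx = seq[i:i + order]
        kmer = seq[i:i + order + 1]
        total += math.log((kmer_counts[kmer] + 1) / (ctx_counts[ctx] + 4))
    return total

def window_scores(seq, model, order, window, step):
    """Average per-transition log-probability for each sliding window.

    Windows with unusually low scores do not resemble the training model.
    """
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        scores.append((start, log_prob(w, model, order) / (window - order)))
    return scores

# Toy usage: a model trained on an ACGT repeat flags a poly-A stretch.
model = train_markov(["ACGT" * 50], order=2)
query = "ACGT" * 10 + "A" * 40 + "ACGT" * 10
scores = window_scores(query, model, order=2, window=20, step=20)
lowest_start = min(scores, key=lambda t: t[1])[0]
```

With real data the model would be trained on a large compilation of sequences from the organism of interest, and a significance threshold would be needed before calling a window atypical.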

58

Complete Genome Sequence of Mycobacterium massiliense  

PubMed Central

Mycobacterium massiliense is a rapidly growing bacterium associated with opportunistic infections. The genome of a representative isolate (strain GO 06) recovered from wound samples from patients who underwent arthroscopic or laparoscopic surgery was sequenced. To the best of our knowledge, this is the first announcement of the complete genome sequence of an M. massiliense strain. PMID:22965084

Raiol, Tainá; Ribeiro, Guilherme Menegói; Maranhão, Andréa Queiroz; Bocca, Anamélia Lorenzetti; Silva-Pereira, Ildinete; Junqueira-Kipnis, Ana Paula; Brigido, Marcelo de Macedo

2012-01-01

59

BSMAP: whole genome bisulfite sequence MAPping program  

Microsoft Academic Search

BACKGROUND: Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the

Yuanxin Xi; Wei Li

2009-01-01
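
The mapping difficulty described in the record above comes from the asymmetry that bisulfite treatment introduces: an unmethylated C in the reference appears as T in the read. The sketch below illustrates that asymmetry with a naive exhaustive scan on the forward strand only. It is not BSMAP's algorithm; the function names and the toy reference are hypothetical.

```python
def bisulfite_convert(seq, methylated):
    """Convert every unmethylated C to T; positions in `methylated` stay C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def bs_match(ref_window, read):
    """True if the read matches, allowing a read T opposite a reference C."""
    return len(ref_window) == len(read) and all(
        r == q or (r == "C" and q == "T") for r, q in zip(ref_window, read)
    )

def map_read(ref, read):
    """Naive scan: every start position where the read matches the reference."""
    n = len(read)
    return [i for i in range(len(ref) - n + 1) if bs_match(ref[i:i + n], read)]

def call_methylation(ref, read, pos):
    """For each reference C under the read: True if still read as C (methylated)."""
    return {pos + j: read[j] == "C" for j in range(len(read)) if ref[pos + j] == "C"}

# Toy usage: only the cytosine at position 1 is methylated.
ref = "ACGTCCGATCGA"
read = bisulfite_convert(ref, methylated={1})[:8]
hits = map_read(ref, read)
calls = call_methylation(ref, read, hits[0])
```

A real mapper would also have to handle the reverse strand, sequencing errors, and reads that match several locations once C-to-T conversion is allowed.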

60

BAC as tools for genome sequencing  

Microsoft Academic Search

Genome sequencing represents the state-of-the-art technology for large-scale gene discovery, cloning and decoding. Bacteria-based large-insert clones, including bacterial artificial chromosome (BAC), bacteriophage P1-derived artificial chromosome (PAC) and large-insert conventional plasmid-based clone (PBC), are desirable resources and have offered numerous potentials for accelerated sequencing of large, complex genomes. They are not only capable of cloning large DNA fragments of complex genomes

Hong-Bin Zhang; Chengcang Wu

2001-01-01

61

Human Genome Sequencing in Health and Disease  

PubMed Central

Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

2013-01-01

62

Draft Genome Sequences of the Onion Center Rot Pathogen Pantoea ananatis PA4 and Maize Brown Stalk Rot Pathogen P. ananatis BD442  

PubMed Central

Pantoea ananatis is an emerging phytopathogen that infects a broad spectrum of plant hosts. Here, we present the genomes of two South African isolates, P. ananatis PA4, which causes center rot of onion, and BD442, isolated from brown stalk rot of maize. PMID:25103759

Weller-Stuart, Tania; Chan, Wai Yin; Venter, Stephanus N.; Smits, Theo H. M.; Duffy, Brion; Goszczynska, Teresa; Cowan, Don A.; de Maayer, Pieter

2014-01-01

63

Draft Genome Sequences of the Onion Center Rot Pathogen Pantoea ananatis PA4 and Maize Brown Stalk Rot Pathogen P. ananatis BD442.  

PubMed

Pantoea ananatis is an emerging phytopathogen that infects a broad spectrum of plant hosts. Here, we present the genomes of two South African isolates, P. ananatis PA4, which causes center rot of onion, and BD442, isolated from brown stalk rot of maize. PMID:25103759

Weller-Stuart, Tania; Chan, Wai Yin; Coutinho, Teresa A; Venter, Stephanus N; Smits, Theo H M; Duffy, Brion; Goszczynska, Teresa; Cowan, Don A; de Maayer, Pieter

2014-01-01

64

Genotyping-by-Sequencing for Populus Population Genomics: An Assessment of Genome Sampling Patterns  

E-print Network

Genotyping-by-Sequencing for Populus Population Genomics: An Assessment of Genome Sampling Patterns Abstract Continuing advances in nucleotide sequencing technology are inspiring a suite of genomic, recent advances in sequencing chemistry, sequencing platforms, data storage, and computational processing

65

Genome Sequencing and Analysis Conference IV  

SciTech Connect

J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

Not Available

1993-12-31

66

Genomic sequencing of Pleistocene cave bears  

SciTech Connect

Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

2005-04-01

67

The genome sequence of Drosophila melanogaster.  

SciTech Connect

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

NONE

2000-03-24

68

Genome sequence of the palaeopolyploid soybean  

E-print Network

). The soybean genome is the largest whole-genome shotgun-sequenced plant genome so far and compares favourably to all other high-quality draft whole-genome shotgun-sequenced plant genomes (Supplementary Table 4 ARTICLES Genome sequence of the palaeopolyploid soybean Jeremy Schmutz1,2, Steven B. Cannon3

Bhattacharyya, Madan Kumar

69

Plantagora: Modeling Whole Genome Sequencing and Assembly of Plant Genomes  

PubMed Central

Background Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. Methodology/Principal Findings For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. Conclusions/Significance Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly further. PMID:22174807

Barthelson, Roger; McFarlin, Adam J.; Rounsley, Steven D.; Young, Sarah

2011-01-01
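
The read-simulation step described in the record above can be illustrated with a short sketch. It is not the Plantagora platform: it assumes error-free reads, a fixed fragment length standing in for the paired-end spacing, and hypothetical function names, whereas the study mimicked 454 and Illumina reads with varying spacing.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def simulate_paired_reads(genome, n_pairs, read_len, insert_size, seed=0):
    """Sample error-free read pairs from fragments of length `insert_size`.

    Each pair is (fragment start, forward read, reverse read); the reverse
    read is the reverse complement of the fragment's last `read_len` bases.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        start = rng.randrange(len(genome) - insert_size + 1)
        fragment = genome[start:start + insert_size]
        forward = fragment[:read_len]
        reverse = reverse_complement(fragment[-read_len:])
        pairs.append((start, forward, reverse))
    return pairs

# Toy usage on a random 1 kb "genome".
genome_rng = random.Random(1)
genome = "".join(genome_rng.choice("ACGT") for _ in range(1000))
pairs = simulate_paired_reads(genome, n_pairs=50, read_len=30, insert_size=200)
```

Keeping the true start position with each pair is what makes fidelity-style evaluation possible later, since an assembly can be checked against the known origin of every read.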

70

Cancer genome-sequencing study design.  

PubMed

Discoveries from cancer genome sequencing have the potential to translate into advances in cancer prevention, diagnostics, prognostics, treatment and basic biology. Given the diversity of downstream applications, cancer genome-sequencing studies need to be designed to best fulfil specific aims. Knowledge of second-generation cancer genome-sequencing study design also facilitates assessment of the validity and importance of the rapidly growing number of published studies. In this Review, we focus on the practical application of second-generation sequencing technology (also known as next-generation sequencing) to cancer genomics and discuss how aspects of study design and methodological considerations - such as the size and composition of the discovery cohort - can be tailored to serve specific research aims. PMID:23594910

Mwenifumbo, Jill C; Marra, Marco A

2013-05-01

71

Next-generation sequencing: applications beyond genomes  

Microsoft Academic Search

The development of DNA sequencing more than 30 years ago has profoundly impacted biological research. In the last couple of years, remarkable technological innovations have emerged that allow the direct and cost-effective sequencing of complex samples at unprecedented scale and speed. These next-generation technologies make it feasible to sequence not only static genomes, but also entire transcriptomes expressed under different

Samuel Marguerat; Jürg Bähler

2008-01-01

72

Quality assessment of the human genome sequence  

Microsoft Academic Search

As the final sequencing of the human genome has now been completed, we present the results of the largest examination of the quality of the finished DNA sequence. The completed study covers the major contributing sequencing centres and is based on a rigorous combination of laboratory experiments and computational analysis.

Jeremy Schmutz; Jeremy Wheeler; Jane Grimwood; Mark Dickson; Joan Yang; Chenier Caoile; Eva Bajorek; Stacey Black; Yee Man Chan; Mirian Denys; Julio Escobar; Dave Flowers; Dea Fotopulos; Carmen Garcia; Maria Gomez; Eidelyn Gonzales; Lauren Haydu; Frederick Lopez; Lucia Ramirez; James Retterer; Alex Rodriguez; Stephanie Rogers; Angelica Salazar; Ming Tsai; Richard M. Myers

2004-01-01

73

Solvable Sequence Evolution Models and Genomic Correlations  

NASA Astrophysics Data System (ADS)

We study a minimal model for genome evolution whose elementary processes are single site mutation, duplication and deletion of sequence regions, and insertion of random segments. These processes are found to generate long-range correlations in the composition of letters as long as the sequence length is growing; i.e., the combined rates of duplications and insertions are higher than the deletion rate. For constant sequence length, on the other hand, all initial correlations decay exponentially. These results are obtained analytically and by simulations. They are compared with the long-range correlations observed in genomic DNA, and the implications for genome evolution are discussed.

Messer, Philipp W.; Arndt, Peter F.; Lässig, Michael

2005-04-01
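
The four elementary processes named in the record above can be simulated with a few lines of code. The sketch below is only an illustration under stated assumptions: events are drawn with relative rates, segment lengths are uniform between 1 and a hypothetical `max_seg`, duplications are tandem, and a "mutation" may redraw the same base. None of these details come from the paper.

```python
import random

ALPHABET = "ACGT"

def evolve(seq, steps, rates, max_seg=5, seed=0):
    """Apply `steps` random events to `seq` and return the new sequence.

    `rates` maps an event name ('mutation', 'duplication', 'deletion',
    'insertion') to its relative rate.
    """
    rng = random.Random(seed)
    seq = list(seq)
    events = list(rates)
    weights = [rates[e] for e in events]
    for _ in range(steps):
        if not seq:  # nothing left to act on
            break
        event = rng.choices(events, weights=weights)[0]
        pos = rng.randrange(len(seq))
        seg = rng.randint(1, max_seg)
        if event == "mutation":
            seq[pos] = rng.choice(ALPHABET)  # may redraw the same base
        elif event == "duplication":
            seq[pos:pos] = seq[pos:pos + seg]  # tandem copy of a region
        elif event == "deletion":
            del seq[pos:pos + seg]
        elif event == "insertion":
            seq[pos:pos] = [rng.choice(ALPHABET) for _ in range(seg)]
    return "".join(seq)

# Growing regime: duplication plus insertion outpaces deletion.
start = "ACGT" * 25
grown = evolve(start, 500, {"mutation": 1, "duplication": 1,
                            "deletion": 0.5, "insertion": 1})
```

Comparing a growing regime like this one with a constant-length regime (combined duplication and insertion rates equal to the deletion rate) is the contrast the abstract draws; measuring the resulting base-composition correlations is left out of the sketch.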

74

Genome sequence of Coxiella burnetii strain Namibia  

PubMed Central

We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

2014-01-01

75

Genome sequence of Coxiella burnetii strain Namibia.  

PubMed

We present the whole genome sequence and annotation of the Coxiella burnetii strain Namibia. This strain was isolated from an aborting goat in 1991 in Windhoek, Namibia. The plasmid type QpRS was confirmed in our work. Further genomic typing placed the strain into a unique genomic group. The genome sequence is 2,101,438 bp long and contains 1,979 protein-coding and 51 RNA genes, including one rRNA operon. To overcome the poor yield from cell culture systems, an additional DNA enrichment with whole genome amplification (WGA) methods was applied. We describe a bioinformatics pipeline for improved genome assembly including several filters with a special focus on WGA characteristics. PMID:25593636

Walter, Mathias C; Öhrman, Caroline; Myrtennäs, Kerstin; Sjödin, Andreas; Byström, Mona; Larsson, Pär; Macellaro, Anna; Forsman, Mats; Frangoulidis, Dimitrios

2014-01-01

76

First Complete Sequence of the Human Genome  

NSDL National Science Digital Library

On April 6, Celera Genomics announced that it had completed the sequencing phase of one person's genome. It will now begin the process of assembling the sequenced fragments into their proper order with the aid of powerful computers. Work on this project began in September 1999 using a method called "whole genome shotgun sequencing," a quicker method than that used by the international Human Genome Project, which has completed about two-thirds of its own, more thorough, sequence of the human genome. Although talks between Celera and the Human Genome Project over the sharing of data broke down earlier this year, they have since resumed and the company has stated that it will cooperate. This is just the first step towards understanding the human genome, since it only reveals the order of the nucleotides, not what the genes do, but it is certainly an important milestone, with broad implications for biology and medicine. Users can begin with the company's press release and then read reports from the BBC, the New York Times (free registration required), CNN, National Public Radio's All Things Considered, and the Times of India. Additional related resources are available from the Human Genome Project site and Doubletwist.com.

de Nie, Michael Willem.

77

Complete genome sequence of Streptomyces fulvissimus.  

PubMed

The complete genome sequence of Streptomyces fulvissimus (DSM 40593), consisting of a linear chromosome with a size of 7.9 Mbp, is reported. Preliminary data indicate that the chromosome of S. fulvissimus contains 32 putative gene clusters involved in the biosynthesis of secondary metabolites, two of them showing very high similarity to the valinomycin and nonactin biosynthetic clusters. The availability of the genome sequence of S. fulvissimus will contribute to the evaluation of the full biosynthetic potential of streptomycetes. PMID:23965270

Myronovskyi, M; Tokovenko, B; Manderscheid, N; Petzke, L; Luzhetskyy, A

2013-10-10

78

Genome sequence and analysis of Lactobacillus helveticus  

PubMed Central

The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of Lactobacillus helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genome of five L. helveticus strains was sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones. PMID:23335916

Cremonesi, Paola; Chessa, Stefania; Castiglioni, Bianca

2013-01-01

79

Genomic sequencing of Pleistocene cave bears.  

PubMed

Despite the greater information content of genomic DNA, ancient DNA studies have largely been limited to the amplification of mitochondrial sequences. Here we describe metagenomic libraries constructed with unamplified DNA extracted from skeletal remains of two 40,000-year-old extinct cave bears. Analysis of approximately 1 megabase of sequence from each library showed that despite significant microbial contamination, 5.8 and 1.1% of clones contained cave bear inserts, yielding 26,861 base pairs of cave bear genome sequence. Comparison of cave bear and modern bear sequences revealed the evolutionary relationship of these lineages. The metagenomic approach used here establishes the feasibility of ancient DNA genome sequencing programs. PMID:15933159

Noonan, James P; Hofreiter, Michael; Smith, Doug; Priest, James R; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J Chris; Pääbo, Svante; Rubin, Edward M

2005-07-22

80

Sequencing and comparing whole mitochondrial genomes of animals  

SciTech Connect

Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

2005-04-22

81

From sequence mapping to genome assemblies.  

PubMed

The development of "next-generation" high-throughput sequencing technologies has made it possible for many labs to undertake sequencing-based research projects that were unthinkable just a few years ago. Although the scientific applications are diverse (e.g., new genome projects, gene expression analysis, genome-wide functional screens, or epigenetics), the sequence data are usually processed in one of two ways: sequence reads are either mapped to an existing reference sequence, or they are built into a new sequence ("de novo assembly"). In this chapter, we first discuss some limitations of the mapping process and how these may be overcome through local sequence assembly. We then introduce the concept of de novo assembly and describe essential assembly improvement procedures such as scaffolding, contig ordering, gap closure, error evaluation, gene annotation transfer and ab initio gene annotation. The results are high-quality draft assemblies that will facilitate informative downstream analyses. PMID:25388106

Otto, Thomas D

2015-01-01

82

Genome Sequencing, Assembly and Gene Prediction in Fungi  

Microsoft Academic Search

Genome sequencing and the science of genomics is now being applied to the study of fungi. Although resources have been slow in coming, a number of fungi are now being sequenced and an increasingly diverse array of these organisms are being considered as candidates for whole genome sequencing. Currently there are only two complete fungal genome sequences available, those of

Brendan Loftus

2003-01-01

83

Thirteen Camellia chloroplast genome sequences determined by high-throughput sequencing: genome structure and phylogenetic relationships  

PubMed Central

Background Camellia is an economically and phylogenetically important genus in the family Theaceae. Owing to numerous hybridization and polyploidization events, it is taxonomically and phylogenetically ranked as one of the most challenging taxa in plants. Sequence comparisons of chloroplast (cp) genomes are of great interest for providing robust evidence for taxonomic studies, species identification and understanding the mechanisms that underlie the evolution of the Camellia species. Results The eight complete cp genomes and five draft cp genome sequences of Camellia species were determined using Illumina sequencing technology via a combined strategy of de novo and reference-guided assembly. The Camellia cp genomes exhibited a typical circular structure that was rather conserved in genomic structure and the synteny of gene order. Differences in repeat sequences, simple sequence repeats, indels and substitutions were further examined among five complete cp genomes, representing a wide phylogenetic diversity in the genus. A total of fifteen molecular markers were identified with more than 1.5% sequence divergence that may be useful for further phylogenetic analysis and species identification of Camellia. Our results showed that, rather than functional constraints, it is the regional constraints that strongly affect sequence evolution of the cp genomes. In a substantial improvement over prior studies, evolutionary relationships of the section Thea were determined on the basis of phylogenomic analyses of cp genome sequences. Conclusions Despite a high degree of conservation between the Camellia cp genomes, sequence variation among species could still be detected, representing a wide phylogenetic diversity in the genus. Furthermore, phylogenomic analysis was conducted using 18 complete cp genomes and 5 draft cp genome sequences of Camellia species. Our results support Chang’s taxonomical treatment that C. pubicosta may be classified into sect. Thea, and indicate that the taxonomical value of the number of ovaries should be reconsidered when classifying the Camellia species. The availability of these cp genomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the phylogeny of the genus Camellia. PMID:25001059

2014-01-01
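
The 1.5% divergence cutoff mentioned in the record above can be illustrated with a minimal sketch. It assumes the regions have already been aligned to equal length and that gap columns are simply skipped; the abstract does not say how divergence was actually computed, and the region names and sequences below are hypothetical toy data.

```python
def percent_divergence(seq_a, seq_b):
    """Percentage of differing sites between two aligned sequences, skipping gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: alignment gaps are not counted as substitutions
        compared += 1
        if a != b:
            mismatches += 1
    return 100.0 * mismatches / compared if compared else 0.0


THRESHOLD = 1.5  # percent, the cutoff quoted in the abstract

# Hypothetical aligned regions from two species (toy data, not from the study)
regions = {
    "region_1": ("ACGTACGTAC" * 10, "ACGTACGTAC" * 9 + "ACGTACGTTT"),
    "region_2": ("ACGTACGTAC" * 10, "ACGTACGTAC" * 10),
}

# Keep only the regions whose divergence exceeds the cutoff
candidates = [
    name for name, (a, b) in regions.items()
    if percent_divergence(a, b) > THRESHOLD
]
print(candidates)
```

Skipping gap columns keeps indels from inflating the substitution rate; a real analysis would likely treat indels separately, as the abstract itself does.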

84

International Rice Genome Sequencing Project: the effort to completely sequence the rice genome  

Microsoft Academic Search

The International Rice Genome Sequencing Project (IRGSP) involves researchers from ten countries who are working to completely and accurately sequence the rice genome within a short period. Sequencing uses a map-based clone-by-clone shotgun strategy; shared bacterial artificial chromosome/P1-derived artificial chromosome libraries have been constructed from Oryza sativa ssp. japonica variety ‘Nipponbare’. End-sequencing, fingerprinting and marker-aided PCR screening are being

Takuji Sasaki; Benjamin Burr

2000-01-01

85

Comparison of 61 sequenced Escherichia coli genomes.  

PubMed

Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publically available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the various strains appear jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or 'accessory' genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of Enterobacteriaceae. PMID:20623278

Lukjancenko, Oksana; Wassenaar, Trudy M; Ussery, David W

2010-11-01
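
The pan- and core-genome figures in the record above can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene-family identifiers (the clustering step is not shown); the strain and family names are hypothetical and are not data from the study.

```python
from functools import reduce


def pan_and_core(genomes):
    """Return (pan, core) gene-family sets for a dict mapping genome name -> set of families."""
    family_sets = list(genomes.values())
    pan = set().union(*family_sets)                # families seen in at least one genome
    core = reduce(set.intersection, family_sets)   # families present in every genome
    return pan, core


# Hypothetical toy input
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
pan, core = pan_and_core(toy)
core_fraction = len(core) / len(pan)

# Figures quoted in the abstract: 993 core families out of 15,741 pan-genome families
reported_core_fraction = 993 / 15741
print(f"toy core fraction: {core_fraction:.2f}")
print(f"reported core fraction: {reported_core_fraction:.1%}")
```

With the reported counts the core fraction comes to roughly 6%, which is consistent with the statement that accessory genes make up more than 90% of the pan-genome.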

86

Comparison of 61 Sequenced Escherichia coli Genomes  

PubMed Central

Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publically available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of this set of sequenced strains. A hierarchical clustering of variable genes allowed clear separation of the strains into clusters, including known pathotypes; clinically relevant serotypes can also be resolved in this way. In contrast, when in silico MLST was performed, many of the various strains appear jumbled and less well resolved. The predicted pan-genome comprises 15,741 gene families, and only 993 (6%) of the families are represented in every genome, comprising the core genome. The variable or ‘accessory’ genes thus make up more than 90% of the pan-genome and about 80% of a typical genome; some of these variable genes tend to be co-localized on genomic islands. The diversity within the species E. coli, and the overlap in gene content between this and related species, suggests a continuum rather than sharp species borders in this group of Enterobacteriaceae. PMID:20623278

Lukjancenko, Oksana; Wassenaar, Trudy M.

2010-01-01

87

Whole Genome Sequence of a Turkish Individual  

PubMed Central

Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We have performed 35x coverage using paired-end sequencing, where over 95% of sequencing reads are mapped to the reference genome covering more than 99% of the bases. The assembly of unmapped reads rendered 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ~1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homo/hetero zygosity (1,415,123:2,122,671 or 1:1.5) and transition/transversion ratios (2,383,204:1,154,590 or 2.06:1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1:1.09 insertion/deletion ratio), ranging from -52 bp to 34 bp in length and causing about 180 codon insertion/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale. PMID:24416366

Dogan, Haluk; Can, Handan; Otu, Hasan H.

2014-01-01
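
The two SNP ratios quoted in the record above can be re-derived from the reported counts. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Counts as reported in the abstract
homozygous, heterozygous = 1_415_123, 2_122_671
transitions, transversions = 2_383_204, 1_154_590
total_snps = 3_537_794

# Both partitions should account for every SNP call
assert homozygous + heterozygous == total_snps
assert transitions + transversions == total_snps

het_per_hom = heterozygous / homozygous   # heterozygous calls per homozygous call
ti_tv = transitions / transversions       # transition/transversion ratio

print(f"hom:het = 1:{het_per_hom:.2f}")
print(f"Ti/Tv   = {ti_tv:.2f}:1")
```

Both partitions sum to the reported total of 3,537,794 calls, and the computed ratios agree with the rounded values given in the abstract (1:1.5 and 2.06:1).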

88

About The Center for Cancer Genomics (CCG)  

Cancer.gov

Recognizing the power of genomics, the National Cancer Institute (NCI) established the Center for Cancer Genomics (CCG) to develop and apply genome science to better diagnose and treat cancer patients. NCI is supporting research to identify the genetic drivers of cancer, to advance adoption of precise tumor diagnosis and treatment, and to prepare patients and their doctors for the changes in medical care influenced by genomics. Throughout these efforts, NCI protects patients’ privacy without hindering treatment or research.

89

Genomic Sequencing of Single Microbial Cells from Environmental Samples  

SciTech Connect

Recently developed techniques allow genomic DNA sequencing from single microbial cells [Lasken RS: Single-cell genomic sequencing using multiple displacement amplification, Curr Opin Microbiol 2007, 10:510-516]. Here, we focus on research strategies for putting these methods into practice in the laboratory setting. An immediate consequence of single-cell sequencing is that it provides an alternative to culturing organisms as a prerequisite for genomic sequencing. The microgram amounts of DNA required as template are amplified from a single bacterium by a method called multiple displacement amplification (MDA) avoiding the need to grow cells. The ability to sequence DNA from individual cells will likely have an immense impact on microbiology considering the vast numbers of novel organisms, which have been inaccessible unless culture-independent methods could be used. However, special approaches have been necessary to work with amplified DNA. MDA may not recover the entire genome from the single copy present in most bacteria. Also, some sequence rearrangements can occur during the DNA amplification reaction. Over the past two years many research groups have begun to use MDA, and some practical approaches to single-cell sequencing have been developed. We review the consensus that is emerging on optimum methods, reliability of amplified template, and the proper interpretation of 'composite' genomes which result from the necessity of combining data from several single-cell MDA reactions in order to complete the assembly. Preferred laboratory methods are considered on the basis of experience at several large sequencing centers where >70% of genomes are now often recovered from single cells. Methods are reviewed for preparation of bacterial fractions from environmental samples, single-cell isolation, DNA amplification by MDA, and DNA sequencing.

Ishoey, Thomas; Woyke, Tanja; Stepanauskas, Ramunas; Novotny, Mark; Lasken, Roger S.

2008-02-01

90

Genome sequence and comparative analysis of the model rodent malaria  

E-print Network

Genome sequence and comparative analysis of the model rodent malaria parasite Plasmodium yoelii Medical Centre, PO Box 9600, 2300 RC Leiden, The Netherlands § Naval Medical Research Center, Malaria ... Species of malaria parasite that infect rodents have long been used as models for malaria disease research

Salzberg, Steven

91

Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

92

SWINE GENOME SEQUENCING CONSORTIUM (SGSC): A STRATEGIC ROADMAP FOR SEQUENCING THE PIG GENOME  

Technology Transfer Automated Retrieval System (TEKTRAN)

The Swine Genome Sequencing Consortium (SGSC) was formed in September 2003 by academic, government and industry representatives to provide international coordination for sequencing the pig genome. The SGSC's mission is to advance biomedical research for animal production and health by the developmen...

93

Genome Sequence of the Palaeopolyploid soybean  

SciTech Connect

Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70% more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78% of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75% of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

2009-08-03

94

Microsatellite DNA in genomic survey sequences  

E-print Network

ORIGINAL PAPER Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine Craig S) 2011 Abstract Genomic DNA sequence databases are a potential and growing resource for simple sequence densities in genome survey sequences (GSSs) to those in non-redundant EST and cDNA sequences (Uni

95

Using comparative genomics to reorder the human genome sequence into a virtual sheep genome  

Microsoft Academic Search

BACKGROUND: Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes? RESULTS: A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the

Brian P Dalrymple; Ewen F Kirkness; Mikhail Nefedov; Sean McWilliam; Abhirami Ratnakumar; Wes Barris; Shaying Zhao; Jyoti Shetty; Jillian F Maddox; Margaret O'Grady; Frank Nicholas; Allan M Crawford; Tim Smith; Pieter J de Jong; John McEwan; V Hutton Oddy; Noelle E Cockett

2007-01-01

96

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence  

E-print Network

plants have large and complex genomes with an abundance of repeated sequences. plants have large and complex genomes with a great abundance of repeated sequences. Sequence composition, organization, and evolution of the core Triticeae genome. Plant

2011-01-01

97

Locus Reference Genomic: reference sequences for the reporting of clinically relevant sequence variants  

PubMed Central

Locus Reference Genomic (LRG; http://www.lrg-sequence.org/) records contain internationally recognized stable reference sequences designed specifically for reporting clinically relevant sequence variants. Each LRG is contained within a single file consisting of a stable ‘fixed’ section and a regularly updated ‘updatable’ section. The fixed section contains stable genomic DNA sequence for a genomic region, essential transcripts and proteins for variant reporting and an exon numbering system. The updatable section contains mapping information, annotation of all transcripts and overlapping genes in the region and legacy exon and amino acid numbering systems. LRGs provide a stable framework that is vital for reporting variants, according to Human Genome Variation Society (HGVS) conventions, in genomic DNA, transcript or protein coordinates. To enable translation of information between LRG and genomic coordinates, LRGs include mapping to the human genome assembly. LRGs are compiled and maintained by the National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI). LRG reference sequences are selected in collaboration with the diagnostic and research communities, locus-specific database curators and mutation consortia. Currently >700 LRGs have been created, of which >400 are publicly available. The aim is to create an LRG for every locus with clinical implications. PMID:24285302

MacArthur, Jacqueline A. L.; Morales, Joannella; Tully, Ray E.; Astashyn, Alex; Gil, Laurent; Bruford, Elspeth A.; Larsson, Pontus; Flicek, Paul; Dalgleish, Raymond; Maglott, Donna R.; Cunningham, Fiona

2014-01-01

98

Mining for single nucleotide polymorphisms in pig genome sequence data  

PubMed Central

Background Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace-repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project" were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from SangerInstitute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represent an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel. Conclusion This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequences repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation. PMID:19126189

Kerstens, Hindrik HD; Kollers, Sonja; Kommadath, Arun; del Rosario, Marisol; Dibbits, Bert; Kinders, Sylvia M; Crooijmans, Richard P; Groenen, Martien AM

2009-01-01

99

Next-generation sequencing applied to rare diseases genomics.  

PubMed

Genomics has revolutionized the study of rare diseases. In this review, we overview the latest technological development, rare disease discoveries, implementation obstacles and bioethical challenges. First, we discuss the technology of genome and exome sequencing, including the different next-generation platforms and exome enrichment technologies. Second, we survey the pioneering centers and discoveries for rare diseases, including few of the research institutions that have contributed to the field, as well as an overview survey of different types of rare diseases that have had new discoveries due to next-generation sequencing. Third, we discuss the obstacles and challenges that allow for clinical implementation, including returning of results, informed consent and privacy. Last, we discuss possible outlook as clinical genomics receives wider adoption, as third-generation sequencing is coming onto the horizon, and some needs in informatics and software to further advance the field. PMID:24702023

Danielsson, Krissi; Mun, Liew Jun; Lordemann, Amanda; Mao, Jimmy; Lin, Cheng-Ho Jimmy

2014-05-01

100

Accelerating Genome Sequencing 100X with FPGAs  

SciTech Connect

The performance of two Cray XD1 systems with Virtex-II Pro 50 and Virtex-4 LX160 FPGAs was evaluated using the FASTA computational biology program for human genome (DNA and protein) sequence comparisons. FPGA speedups of 50X (Virtex-II Pro 50) and 100X (Virtex-4 LX160) over a 2.2 GHz Opteron were obtained. FPGA coding issues for human genome data are described.

Storaasli, Olaf O [ORNL; Strenski, Dave [Cray, Inc.

2007-01-01

101

Complete genome sequence of Borrelia crocidurae.  

PubMed

We announce the draft genome sequence of Borrelia crocidurae (strain Achema). The 1,557,560-bp genome (27% GC content) comprises one 919,477-bp linear chromosome and 638,083-bp plasmids that together carry 1,472 open reading frames, 32 tRNAs, and three complete rRNAs, with almost complete colinearity between B. crocidurae and Borrelia duttonii chromosomes. PMID:22740657

Elbir, Haitham; Gimenez, Grégory; Robert, Catherine; Bergström, Sven; Cutler, Sally; Raoult, Didier; Drancourt, Michel

2012-07-01

102

Automated correction of genome sequence errors  

Microsoft Academic Search

ABSTRACT By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of over one million corrections, we found that AutoEditor made just one error per 8828 corrections. By substantially increasing the accuracy of base calling, AutoEditor can dramatically accelerate the process of finishing

Pawel Gajer; Michael Schatz; Steven L. Salzberg

2004-01-01

103

Genome sequence and assembly of Bos indicus.  

PubMed

Cattle are divided into 2 groups referred to as taurine and indicine, both of which have been under strong artificial selection due to their importance for human nutrition. A side effect of this domestication includes a loss of genetic diversity within each specialized breed. Recently, the first taurine genome was sequenced and assembled, allowing for a better understanding of this ruminant species. However, genetic information from indicine breeds has been limited. Here, we present the first genome sequence of an indicine breed (Nellore) generated with 52X coverage by SOLiD sequencing platform. As expected, both genomes share high similarity at the nucleotide level for all autosomes and the X chromosome. Regarding the Y chromosome, the homology was considerably lower, most likely due to uncompleted assembly of the taurine Y chromosome. We were also able to cover 97% of the annotated taurine protein-coding genes. PMID:22315242

Canavez, Flavio C; Luche, Douglas D; Stothard, Paul; Leite, Katia R M; Sousa-Canavez, Juliana M; Plastow, Graham; Meidanis, João; Souza, Maria Angélica; Feijao, Pedro; Moore, Steve S; Camara-Lopes, Luiz H

2012-01-01

104

Standardized Metadata for Human Pathogen/Vector Genomic Sequences  

PubMed Central

High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant. PMID:24936976

Dugan, Vivien G.; Emrich, Scott J.; Giraldo-Calderón, Gloria I.; Harb, Omar S.; Newman, Ruchi M.; Pickett, Brett E.; Schriml, Lynn M.; Stockwell, Timothy B.; Stoeckert, Christian J.; Sullivan, Dan E.; Singh, Indresh; Ward, Doyle V.; Yao, Alison; Zheng, Jie; Barrett, Tanya; Birren, Bruce; Brinkac, Lauren; Bruno, Vincent M.; Caler, Elizabet; Chapman, Sinéad; Collins, Frank H.; Cuomo, Christina A.; Di Francesco, Valentina; Durkin, Scott; Eppinger, Mark; Feldgarden, Michael; Fraser, Claire; Fricke, W. Florian; Giovanni, Maria; Henn, Matthew R.; Hine, Erin; Hotopp, Julie Dunning; Karsch-Mizrachi, Ilene; Kissinger, Jessica C.; Lee, Eun Mi; Mathur, Punam; Mongodin, Emmanuel F.; Murphy, Cheryl I.; Myers, Garry; Neafsey, Daniel E.; Nelson, Karen E.; Nierman, William C.; Puzak, Julia; Rasko, David; Roos, David S.; Sadzewicz, Lisa; Silva, Joana C.; Sobral, Bruno; Squires, R. Burke; Stevens, Rick L.; Tallon, Luke; Tettelin, Herve; Wentworth, David; White, Owen; Will, Rebecca; Wortman, Jennifer; Zhang, Yun; Scheuermann, Richard H.

2014-01-01

105

An International Plan to Sequence the Onion Genome  

Technology Transfer Automated Retrieval System (TEKTRAN)

The cost of DNA sequencing continues to decline and, in the near future, it will become reasonable to undertake sequencing of the enormous nuclear genome of onion. We undertook sequencing of expressed and genomic regions of the onion genome to learn about the structure of the onion genome, as well a...

106

The Trichomonas vaginalis Genome Sequencing Project  

NSDL National Science Digital Library

The Institute for Genomic Research (TIGR) in 2003 released the first draft assembly of the Trichomonas vaginalis genome, available through this website to the academic and not-for-profit research community for noncommercial use only. TIGR will release more data at regular intervals during the sequencing project, which should help researchers better understand this widespread parasite and its role in HIV infection, neonatal disorders, predisposition to cervical cancer, and of course, vaginitis. The website also includes background information on T. vaginalis, as well as a link to TIGR's sequencing project for Entamoeba histolytica -- a closely related organism.

107

Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

108

Genome size and the accumulation of simple sequence repeats: implications of new data from genome sequencing projects  

Microsoft Academic Search

The relationship between the level of repetitiveness in genomic sequences and genome size has been re-investigated making use of the rapidly growing database of complete eubacterial and archaeal genome sequences combined with the fragmentary but now large amount of data from eukaryotic genomes. Relative simplicity factors (RSFs), which measure the repetitiveness of sequences, were calculated and significantly simple motifs (SSMs),

John M. Hancock

2002-01-01

109

Sequencing Your Genome: What Does It Mean?  

PubMed Central

The human genome contains approximately 3.2 billion nucleotides and about 23,500 genes. Each gene has protein-coding regions that are referred to as exons. The human genome contains about 180,000 exons, which are collectively called an exome. An exome comprises about 1% of the human genome and hence is about 30 million nucleotides in size. Today’s technologies afford the opportunity to sequence all nucleotides in the human exome and even in the human genome. Given that more than three-quarters of the known disease-causing variants are located in the exome, and considering the cost and technical challenges in analyzing the whole genome sequence data, the focus of present research is primarily on whole exome sequencing (WES). While WES at the medical sequencing level is still expensive, it is becoming more affordable. Cost will not likely be a major barrier in the near future, and the data analysis is becoming less tedious. The most difficult challenge at the heart of medical sequencing is interpreting the findings. Each exome contains about 13,500 single nucleotide variants (SNVs) that affect the amino acid sequence, and a large number are expected to be functional variants. The daunting task is to distinguish the variants that are pathogenic from those that have minimal or no discernible clinical effects. While various algorithms exist, none are sufficiently robust. Thus, in-depth knowledge in genetics and medicine is essential for the proper interpretation of the WES findings. This review will discuss the potential applications of the WES data in the practice of cardiovascular medicine. PMID:24932355

2014-01-01
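
The size figures in the record above can be checked with a back-of-envelope sketch. The inputs are the approximate numbers quoted in the abstract; the derived averages (exon length, exons per gene) are only implied by those numbers and are not stated in the source.

```python
# Approximate figures quoted in the abstract
genome_nt = 3_200_000_000   # nucleotides in the human genome
genes = 23_500              # protein-coding genes
exons = 180_000             # exons making up the exome
exome_percent = 1           # exome as a percentage of the genome

exome_nt = genome_nt * exome_percent // 100   # nucleotides in the exome
mean_exon_nt = exome_nt / exons               # implied average exon length
exons_per_gene = exons / genes                # implied average exons per gene

print(f"exome size: {exome_nt:,} nt")
print(f"implied mean exon length: {mean_exon_nt:.0f} nt")
print(f"implied exons per gene: {exons_per_gene:.1f}")
```

One percent of 3.2 billion is 32 million nucleotides, in line with the abstract's "about 30 million".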

110

Human Genome Project Sequencing, 3D animation with basic narration Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

DNAi Location: Genome>Project>putting it together>Mapping the genome As represented by this huge stack of paper, the human genome contains more than three billion nucleotides or DNA letters. The first stage of the public Human Genome Project focused on identifying marker sequences or unique tags (shown here in yellow) at regular intervals throughout this "book of life." Once enough sequences were tagged, various blocks of the genome were allocated to different academic centers for sequencing.

2008-10-06

111

Mapping and sequencing the human genome  

SciTech Connect

Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

1988-01-01

112

Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria  

PubMed Central

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the “gold standard” of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442

Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W.; Aarestrup, Frank M.; Lund, Ole

2012-01-01
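The final step described in the record above, determining the sequence type from the combination of identified alleles, can be sketched as a simple table lookup. This is only a minimal illustration and not the authors' Web service: the locus list and the profile table below are hypothetical, made-up values, and the BLAST-based allele ranking that precedes this step is not shown.

```python
from typing import Dict, Optional, Tuple

# Hypothetical scheme for illustration only: three loci and a tiny,
# made-up profile table mapping allele combinations to sequence types.
LOCI: Tuple[str, ...] = ("adk", "fumC", "gyrB")
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (1, 2, 1): 2,
    (3, 2, 4): 7,
}


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the sequence type for an allele profile, or None if unknown."""
    try:
        # Order the allele numbers by the scheme's fixed locus order.
        key = tuple(alleles[locus] for locus in LOCI)
    except KeyError:
        return None  # at least one locus of the scheme was not typed
    # An allele combination absent from the table has no assigned type.
    return PROFILES.get(key)
```

A call such as `sequence_type({"adk": 1, "fumC": 2, "gyrB": 1})` looks up the ordered allele tuple in the table. Returning `None` for an unknown combination keeps "no match" distinct from any valid sequence type.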

113

Multilocus sequence typing of total-genome-sequenced bacteria.  

PubMed

Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS) continue to decline, it becomes increasingly available to scientists and routine diagnostic laboratories. Currently, the cost is below that of traditional MLST. The new challenges will be how to extract the relevant information from the large amount of data so as to allow for comparison over time and between laboratories. Ideally, this information should also allow for comparison to historical data. We developed a Web-based method for MLST of 66 bacterial species based on WGS data. As input, the method uses short sequence reads from four sequencing platforms or preassembled genomes. Updates from the MLST databases are downloaded monthly, and the best-matching MLST alleles of the specified MLST scheme are found using a BLAST-based ranking method. The sequence type is then determined by the combination of alleles identified. The method was tested on preassembled genomes from 336 isolates covering 56 MLST schemes, on short sequence reads from 387 isolates covering 10 schemes, and on a small test set of short sequence reads from 29 isolates for which the sequence type had been determined by traditional methods. The method presented here enables investigators to determine the sequence types of their isolates on the basis of WGS data. This method is publicly available at www.cbs.dtu.dk/services/MLST. PMID:22238442

Larsen, Mette V; Cosentino, Salvatore; Rasmussen, Simon; Friis, Carsten; Hasman, Henrik; Marvig, Rasmus Lykke; Jelsbak, Lars; Sicheritz-Pontén, Thomas; Ussery, David W; Aarestrup, Frank M; Lund, Ole

2012-04-01

114

Assigning genomic sequences to CATH  

Microsoft Academic Search

We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and

Frances M. G. Pearl; David Lee; James E. Bray; Ian Sillitoe; Annabel E. Todd; Andrew P. Harrison; Janet M. Thornton; Christine A. Orengo

2000-01-01

115

Whole genome sequences of four Brucella strains.  

PubMed

Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

2011-07-01

116

Whole Genome Sequences of Four Brucella Strains  

PubMed Central

Brucella melitensis and Brucella suis are intracellular pathogens of livestock and humans. Here we report four genome sequences, those of the virulent strain B. melitensis M28-12 and vaccine strains B. melitensis M5 and M111 and B. suis S2, which show different virulences and pathogenicities, which will help to design a more effective brucellosis vaccine. PMID:21602346

Ding, Jiabo; Pan, Yuanlong; Jiang, Hai; Cheng, Junsheng; Liu, Taotao; Qin, Nan; Yang, Yi; Cui, Buyun; Chen, Chen; Liu, Cuihua; Mao, Kairong; Zhu, Baoli

2011-01-01

117

Genome Sequence of Corynebacterium ulcerans Strain 210932  

PubMed Central

In this work, we present the complete genome sequence of Corynebacterium ulcerans strain 210932, isolated from a human. The species is an emergent pathogen that infects a variety of wild and domesticated animals and humans. It is associated with a growing number of cases of a diphtheria-like disease around the world. PMID:25428977

Viana, Marcus Vinicius Canário; de Jesus Benevides, Leandro; Batista Mariano, Diego Cesar; de Souza Rocha, Flávia; Bagano Vilas Boas, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Alves Dorella, Fernanda; Gomes Leal, Carlos Augusto; Fiorini de Carvalho, Alex; Silva, Artur; de Castro Soares, Siomar; Pereira Figueiredo, Henrique Cesar; Guimarães, Luis Carlos

2014-01-01

118

Draft Genome Sequence of Virgibacillus halodenitrificans 1806  

PubMed Central

Virgibacillus halodenitrificans 1806 is an endospore-forming halophilic bacterium isolated from salterns in Korea. Here, we report the draft genome sequence of V. halodenitrificans 1806, which may reveal the molecular basis of osmoadaptation and insights into carbon and anaerobic metabolism in moderate halophiles. PMID:23105070

Lee, Sang-Jae; Lee, Yong-Jik; Jeong, Haeyoung; Lee, Sang Jun; Lee, Han-Seung; Pan, Jae-Gu

2012-01-01

119

A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower  

E-print Network

genomes are crop plants, their complete genome sequence will ... chloroplast genome sequence for any plant within the larger ... sequence of Glycine max and comparative analyses with other legume genomes. Plant

Timme, Ruth E.

2009-01-01

120

Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken  

E-print Network

-scale DNA sequencing has transformed biological research. Scientists can sequence whole genomes of microbes ... ABSTRACT Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken into fragments and sequenced

Campbell, A. Malcolm

121

A Computer Program for Aligning a cDNA Sequence with a Genomic DNA Sequence  

Microsoft Academic Search

We address the problem of efficiently aligning a transcribed and spliced DNA sequence with a genomic sequence containing that gene, allowing for introns in the genomic sequence and a relatively small number of sequencing errors. A freely available computer program, described herein, solves the problem for a 100-kb genomic sequence in a few seconds on a workstation. With large amounts

Liliana Florea; George Hartzell; Gerald M. Rubin; Webb Miller

1998-01-01

122

Defining Genome Project Standards in a New Era of Sequencing  

SciTech Connect

Patrick Chain of the DOE Joint Genome Institute gives a talk on behalf of the International Genome Sequencing Standards Consortium on the need for intermediate genome classifications between "draft" and "finished"

Chain, Patrick [DOE-JGI

2009-05-27

123

Whole-genome sequencing in bacteriology: state of the art  

PubMed Central

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, as well as highlighting recent studies that highlight new applications for bacterial genomics. PMID:24143115

Dark, Michael J

2013-01-01

124

The first Irish genome and ways of improving sequence accuracy.  

PubMed

Whole-genome sequencing of an Irish person reveals hundreds of thousands of novel genomic variants. Imputation using previously known information improves the accuracy of low-read-depth sequencing. PMID:20815917

Ju, Young Seok; Yoo, Yun Joo; Kim, Jong-Il; Seo, Jeong-Sun

2010-01-01

125

The Genome Sequence of Drosophila melanogaster  

NSDL National Science Digital Library

On Thursday, March 23, 2000, a historic milestone was marked as researchers announced that they had completed mapping the genome of the fruit fly, Drosophila melanogaster. The achievement, which was announced in a special issue of the journal Science, culminates close to 100 years of research. Drosophila melanogaster is the most complex animal thus far to have its genetic sequence deciphered. The findings have important implications for human medical research and for completing a map of the human genome. Mapping the fruit fly genome has been a broad collaborative effort between academia and industry in several countries. While a foundation was laid by US (Berkeley), European, and Canadian Drosophila Genome Projects, Celera Genomics finished the job over the last year by employing supercomputers and state-of-the-art gene-sequencing machines. The techniques learned and used in this last phase of mapping may now be applied to more rapidly decode genes of other organisms, including humans. This week's In The News takes a closer look at this important landmark.

Ramanujan, Krishna.

126

Agaricus bisporus genome sequence: a commentary.  

PubMed

The genomes of two isolates of Agaricus bisporus have been sequenced recently. This soil-inhabiting fungus has a wide geographical distribution in nature and it is also cultivated in an industrialized indoor process ($4.7bn annual worldwide value) to produce edible mushrooms. Previously this lignocellulosic fungus has resisted precise econutritional classification, i.e. into white- or brown-rot decomposers. The generation of the genome sequence and transcriptomic analyses has revealed a new classification, 'humicolous', for species adapted to grow in humic-rich, partially decomposed leaf material. The Agaricus bisporus genomes contain a collection of polysaccharide and lignin-degrading genes and more interestingly an expanded number of genes (relative to other lignocellulosic fungi) that enhance degradation of lignin derivatives, i.e. heme-thiolate peroxidases and β-etherases. A motif that is hypothesized to be a promoter element in the humicolous adaptation suite is present in a large number of genes specifically up-regulated when the mycelium is grown on humic-rich substrate. The genome sequence of A. bisporus offers a platform to explore fungal biology in carbon-rich soil environments and terrestrial cycling of carbon, nitrogen, phosphorus and potassium. PMID:23558250

Kerrigan, Richard W; Challen, Michael P; Burton, Kerry S

2013-06-01

127

Comparative Analysis of Genome Sequences with VISTA  

DOE Data Explorer

VISTA is a comprehensive suite of programs and databases developed by and hosted at the Genomics Division of Lawrence Berkeley National Laboratory. They provide information and tools designed to facilitate comparative analysis of genomic sequences. Users have two ways to interact with the suite of applications at the VISTA portal. They can submit their own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species. A key menu option is the Enhancer Browser and Database at http://enhancer.lbl.gov/. The VISTA Enhancer Browser is a central resource for experimentally validated human noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation with other vertebrates. The results of this enhancer screen are provided through this publicly available website. The browser also features relevant results by external contributors and a large collection of additional genome-wide conserved noncoding elements which are candidate enhancer sequences. The LBL developers invite external groups to submit computational predictions of developmental enhancers. As of 10/19/2009 the database contains information on 1109 in vivo tested elements - 508 elements with enhancer activity.

Dubchak, Inna

128

Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences  

Microsoft Academic Search

BACKGROUND: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX,

Alexander F. Auch; Stefan R. Henz; Barbara R. Holland; Markus Göker

2006-01-01

129

Finishing a whole-genome shotgun: Release 3 of the Drosophila melanogaster euchromatic genome sequence  

Microsoft Academic Search

BACKGROUND: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to

Susan E Celniker; David A Wheeler; Brent Kronmiller; Joseph W Carlson; Aaron Halpern; Sandeep Patel; Mark Adams; Mark Champe; Shannon P Dugan; Erwin Frise; Ann Hodgson; Reed A George; Roger A Hoskins; Todd Laverty; Donna M Muzny; Catherine R Nelson; Joanne M Pacleb; Soo Park; Barret D Pfeiffer; Stephen Richards; Erica J Sodergren; Robert Svirskas; Paul E Tabor; Kenneth Wan; Mark Stapleton; Granger G Sutton; Craig Venter; George Weinstock; Steven E Scherer; Eugene W Myers; Richard A Gibbs; Gerald M Rubin

2002-01-01

130

Initial sequencing and comparative analysis of the mouse genome  

Microsoft Academic Search

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing

Robert H. Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F. Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E. Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R. Brent; Daniel G. Brown; Stephen D. Brown; Carol Bult; John Burton; Jonathan Butler; Robert D. Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T. Chinwalla; Deanna M. Church; Michele Clamp; Christopher Clee; Francis S. Collins; Lisa L. Cook; Richard R. Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D. Delehaunty; Justin Deri; Emmanouil T. Dermitzakis; Colin Dewey; Nicholas J. Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M. Dunn; Sean R. Eddy; Laura Elnitski; Richard D. Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A. Fewell; Paul Flicek; Karen Foley; Wayne N. Frankel; Lucinda A. Fulton; Robert S. Fulton; Terrence S. Furey; Diane Gage; Richard A. Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A. Graves; Eric D. Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C. Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W. Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B. Jaffe; L. Steven Johnson; Matthew Jones; Thomas A. Jones; Ann Joy; Michael Kamal; Elinor K. Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W. James Kent; Andrew Kirby; Diana L. Kolbe; Ian Korf; Raju S. Kucherlapati; Edward J. Kulbokas; David Kulp; Tom Landers; J. P. Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R. Maglott; Elaine R. Mardis; Lucy Matthews; Evan Mauceli; John H. Mayer; Megan McCarthy; W. Richard McCombie; Stuart McLaren; Kirsten McLay; John D. McPherson; Jim Meldrim; Beverley Meredith; Jill P. Mesirov; Webb Miller; Tracie L. Miner; Emmanuel Mongin; Kate T. Montgomery; Michael Morgan; Richard Mott; James C. Mullikin; Donna M. Muzny; William E. Nash; Joanne O. Nelson; Michael N. Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J. O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H. Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S. Pohl; Alex Poliakov; Tracy C. Ponce; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A. Roe; Krishna M. Roskin; Edward M. Rubin; Alistair G. Rust; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B. Singer; Guy Slater; Arian Smit; Douglas R. Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P. Vinson; Andrew C. von Niederhausern; Claire M. Wade; Melanie Wall; Ryan J. Weber; Robert B. Weiss; Michael C. Wendl; Anthony P. West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K. Wilson; Eitan Winter; Kim C. Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M. Zdobnov; Michael C. Zody; Eric S. Lander; Chris P. Ponting; Matthias S. Schwartz

2002-01-01

131

The diploid genome sequence of an Asian individual  

Microsoft Academic Search

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the

Jun Wang; Wei Wang; Ruiqiang Li; Yingrui Li; Geng Tian; Laurie Goodman; Wei Fan; Junqing Zhang; Jun Li; Juanbin Zhang; Yiran Guo; Binxiao Feng; Heng Li; Yao Lu; Xiaodong Fang; Huiqing Liang; Zhenglin Du; Dong Li; Yiqing Zhao; Yujie Hu; Zhenzhen Yang; Hancheng Zheng; Ines Hellmann; Michael Inouye; John Pool; Xin Yi; Jing Zhao; Jinjie Duan; Yan Zhou; Junjie Qin; Lijia Ma; Guoqing Li; Zhentao Yang; Guojie Zhang; Bin Yang; Chang Yu; Fang Liang; Wenjie Li; Shaochuan Li; Dawei Li; Peixiang Ni; Jue Ruan; Qibin Li; Hongmei Zhu; Dongyuan Liu; Zhike Lu; Ning Li; Guangwu Guo; Jianguo Zhang; Jia Ye; Lin Fang; Qin Hao; Quan Chen; Yu Liang; Yeyang Su; A. San; Cuo Ping; Shuang Yang; Fang Chen; Li Li; Ke Zhou; Hongkun Zheng; Yuanyuan Ren; Ling Yang; Guohua Yang; Zhuo Li; Xiaoli Feng; Karsten Kristiansen; Gane Ka-Shu Wong; Rasmus Nielsen; Richard Durbin; Lars Bolund; Xiuqing Zhang; Songgang Li; Huanming Yang; Jian Wang

2008-01-01

132

Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux  

PubMed Central

We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ~20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

2012-01-01

133

Ancient human genome sequence of an extinct Palaeo-Eskimo  

E-print Network

ARTICLES Ancient human genome sequence of an extinct Palaeo-Eskimo Morten Rasmussen1,2 *, Yingrui the genome sequence of an ancient human. Obtained from ~4,000-year-old permafrost-preserved hair, the genome, independent of that giving rise to the modern Native Americans and Inuit. Recent advances in DNA sequencing

Nielsen, Rasmus

134

Sequence and comparative analysis of the chicken genome provide unique  

E-print Network

Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution International Chicken Genome Sequencing Consortium* *Lists of participants and affiliations appear ... We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken

Edwards, Scott

135

Data structures and compression algorithms for genomic sequence data  

Microsoft Academic Search

Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function, and evolution, but also for the storage, navigation, and privacy of genomic data. Here we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and

Marty C. Brandon; Douglas C. Wallace; Pierre Baldi

2009-01-01

136

The Z curve database: a graphic representation of genome sequences  

Microsoft Academic Search

Motivation: Genome projects for many prokaryotic and eukaryotic species have been completed, and more new genome projects are currently underway. The availability of a large number of genomic sequences for researchers creates a need to find graphic tools to study genomes in a perceivable form. The Z curve is one such tool available for visualizing genomes.

Chun-ting Zhang; Ren Zhang; Hong-yu Ou

2003-01-01
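The Z curve mentioned in the record above maps a DNA sequence to a curve in three dimensions using cumulative base counts. The sketch below follows the commonly cited definition (x: purine vs. pyrimidine, y: amino vs. keto, z: weak vs. strong hydrogen bonding); that definition is not quoted in the record itself, and the function name and the choice to skip ambiguous bases are assumptions made here for illustration.

```python
from typing import List, Tuple


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the Z-curve points (x_n, y_n, z_n) for a DNA sequence.

    With cumulative counts A_n, C_n, G_n, T_n of each base up to position n:
      x_n = (A_n + G_n) - (C_n + T_n)   purines vs. pyrimidines
      y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto bases
      z_n = (A_n + T_n) - (C_n + G_n)   weak vs. strong hydrogen bonds
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points: List[Tuple[int, int, int]] = []
    for base in seq.upper():
        if base not in counts:
            # Assumption: ambiguous symbols (e.g. N) are skipped, so they
            # contribute no point to the curve.
            continue
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points
```

Plotting the returned points in order gives the curve. Because each component is a difference of cumulative counts, the coordinates change by exactly one unit per base.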

137

The Norway spruce genome sequence and conifer genome evolution.  

PubMed

Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long-terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding. PMID:23698360

Nystedt, Björn; Street, Nathaniel R; Wetterbom, Anna; Zuccolo, Andrea; Lin, Yao-Cheng; Scofield, Douglas G; Vezzi, Francesco; Delhomme, Nicolas; Giacomello, Stefania; Alexeyenko, Andrey; Vicedomini, Riccardo; Sahlin, Kristoffer; Sherwood, Ellen; Elfstrand, Malin; Gramzow, Lydia; Holmberg, Kristina; Hällman, Jimmie; Keech, Olivier; Klasson, Lisa; Koriabine, Maxim; Kucukoglu, Melis; Käller, Max; Luthman, Johannes; Lysholm, Fredrik; Niittylä, Totte; Olson, Ake; Rilakovic, Nemanja; Ritland, Carol; Rosselló, Josep A; Sena, Juliana; Svensson, Thomas; Talavera-López, Carlos; Theißen, Günter; Tuominen, Hannele; Vanneste, Kevin; Wu, Zhi-Qiang; Zhang, Bo; Zerbe, Philipp; Arvestad, Lars; Bhalerao, Rishikesh; Bohlmann, Joerg; Bousquet, Jean; Garcia Gil, Rosario; Hvidsten, Torgeir R; de Jong, Pieter; MacKay, John; Morgante, Michele; Ritland, Kermit; Sundberg, Björn; Thompson, Stacey Lee; Van de Peer, Yves; Andersson, Björn; Nilsson, Ove; Ingvarsson, Pär K; Lundeberg, Joakim; Jansson, Stefan

2013-05-30

138

Complete genome sequence of Pyrobaculum oguniense  

PubMed Central

Pyrobaculum oguniense TE7 is an aerobic hyperthermophilic crenarchaeon isolated from a hot spring in Japan. Here we describe its main chromosome of 2,436,033 bp, with three large-scale inversions and an extra-chromosomal element of 16,887 bp. We have annotated 2,800 protein-coding genes and 145 RNA genes in this genome, including nine H/ACA-like small RNA, 83 predicted C/D box small RNA, and 47 transfer RNA genes. Comparative analyses with the closest known relative, the anaerobe Pyrobaculum arsenaticum from Italy, reveal unexpectedly high synteny and nucleotide identity between these two geographically distant species. Deep sequencing of a mixture of genomic DNA from multiple cells has illuminated some of the genome dynamics potentially shared with other species in this genus. PMID:23407329

Bernick, David L.; Karplus, Kevin; Lui, Lauren M.; Coker, Joanna K. C.; Murphy, Julie N.; Chan, Patricia P.; Cozen, Aaron E.

2012-01-01

139

The genome sequence of Schizosaccharomyces pombe  

Microsoft Academic Search

We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended

R. Gwilliam; M.-A. Rajandream; M. Lyne; R. Lyne; A. Stewart; J. Sgouros; N. Peat; J. Hayles; S. Baker; D. Basham; S. Bowman; K. Brooks; D. Brown; S. Brown; T. Chillingworth; C. Churcher; M. Collins; R. Connor; A. Cronin; P. Davis; T. Feltwell; A. Fraser; S. Gentles; A. Goble; N. Hamlin; D. Harris; J. Hidalgo; G. Hodgson; S. Holroyd; T. Hornsby; S. Howarth; E. J. Huckle; S. Hunt; K. Jagels; K. James; L. Jones; M. Jones; S. Leather; S. McDonald; J. McLean; P. Mooney; S. Moule; K. Mungall; L. Murphy; D. Niblett; C. Odell; K. Oliver; S. O'Neil; D. Pearson; M. A. Quail; E. Rabbinowitsch; K. Rutherford; S. Rutter; D. Saunders; K. Seeger; S. Sharp; J. Skelton; M. Simmonds; R. Squares; S. Squares; K. Stevens; K. Taylor; R. G. Taylor; A. Tivey; S. Walsh; T. Warren; S. Whitehead; J. Woodward; G. Volckaert; R. Aert; J. Robben; B. Grymonprez; I. Weltjens; E. Vanstreels; M. Rieger; M. Schäfer; S. Müller-Auer; C. Gabel; M. Fuchs; C. Fritzc; E. Holzer; D. Moestl; H. Hilbert; K. Borzym; I. Langer; A. Beck; H. Lehrach; R. Reinhardt; T. M. Pohl; P. Eger; W. Zimmermann; H. Wedler; R. Wambutt; B. Purnelle; A. Goffeau; E. Cadieu; S. Dréano; S. Gloux; V. Lelaure; S. Mottier; F. Galibert; S. J. Aves; Z. Xiang; C. Hunt; K. Moore; S. M. Hurst; M. Lucas; M. Rochet; C. Gaillardin; V. A. Tallada; A. Garzon; G. Thode; R. R. Daga; L. Cruzado; J. Jimenez; M. Sánchez; F. del Rey; J. Benito; A. Domínguez; J. L. Revuelta; S. Moreno; J. Armstrong; S. L. Forsburg; L. Cerrutti; T. Lowe; W. R. McCombie; I. Paulsen; J. Potashkin; G. V. Shpakovski; D. Ussery; B. G. Barrell; P. Nurse

2002-01-01

140

Modeling alternate RNA structures in genomic sequences.  

PubMed

We introduce the concept of RNA multistructures, which is a formal grammar-based framework specifically designed to model a set of alternate RNA secondary structures. Such alternate structures can either be a set of suboptimal foldings, or distinct stable folding states, or variants within an RNA family. We provide several such examples and propose an efficient algorithm to search for RNA multistructures within a genomic sequence. PMID:25768235

Saffarian, Azadeh; Giraud, Mathieu; Touzet, Hélène

2015-03-01

141

Draft Genome Sequence of Rubrivivax gelatinosus CBS  

SciTech Connect

Rubrivivax gelatinosus CBS, a purple nonsulfur photosynthetic bacterium, can grow photosynthetically using CO and N₂ as the sole carbon and nitrogen nutrients, respectively. R. gelatinosus CBS is of particular interest due to its ability to metabolize CO and yield H₂. We present the 5-Mb draft genome sequence of R. gelatinosus CBS with the goal of providing genetic insight into the metabolic properties of this bacterium.

Hu, P. S.; Lang, J.; Wawrousek, K.; Yu, J. P.; Maness, P. C.; Chen, J.

2012-06-01

142

Genome sequence of Halobacterium species NRC-1  

PubMed Central

We report the complete sequence of an extreme halophile, Halobacterium sp. NRC-1, harboring a dynamic 2,571,010-bp genome containing 91 insertion sequences representing 12 families and organized into a large chromosome and 2 related minichromosomes. The Halobacterium NRC-1 genome codes for 2,630 predicted proteins, 36% of which are unrelated to any previously reported. Analysis of the genome sequence shows the presence of pathways for uptake and utilization of amino acids, active sodium-proton antiporter and potassium uptake systems, sophisticated photosensory and signal transduction pathways, and DNA replication, transcription, and translation systems resembling more complex eukaryotic organisms. Whole proteome comparisons show the definite archaeal nature of this halophile with additional similarities to the Gram-positive Bacillus subtilis and other bacteria. The ease of culturing Halobacterium and the availability of methods for its genetic manipulation in the laboratory, including construction of gene knockouts and replacements, indicate this halophile can serve as an excellent model system among the archaea. PMID:11016950

Ng, Wailap Victor; Kennedy, Sean P.; Mahairas, Gregory G.; Berquist, Brian; Pan, Min; Shukla, Hem Dutt; Lasky, Stephen R.; Baliga, Nitin S.; Thorsson, Vesteinn; Sbrogna, Jennifer; Swartzell, Steven; Weir, Douglas; Hall, John; Dahl, Timothy A.; Welti, Russell; Goo, Young Ah; Leithauser, Brent; Keller, Kim; Cruz, Randy; Danson, Michael J.; Hough, David W.; Maddocks, Deborah G.; Jablonski, Peter E.; Krebs, Mark P.; Angevine, Christine M.; Dale, Heather; Isenbarger, Thomas A.; Peck, Ronald F.; Pohlschroder, Mechthild; Spudich, John L.; Jung, Kwang-Hwan; Alam, Maqsudul; Freitas, Tracey; Hou, Shaobin; Daniels, Charles J.; Dennis, Patrick P.; Omer, Arina D.; Ebhardt, Holger; Lowe, Todd M.; Liang, Ping; Riley, Monica; Hood, Leroy; DasSarma, Shiladitya

2000-01-01

143

Why Assembling Plant Genome Sequences Is So Challenging  

PubMed Central

In spite of the biological and economic importance of plants, relatively few plant species have been sequenced. Only the genome sequence of plants with relatively small genomes, most of them angiosperms, in particular eudicots, has been determined. The arrival of next-generation sequencing technologies has allowed the rapid and efficient development of new genomic resources for non-model or orphan plant species. But the sequencing pace of plants is far from that of animals and microorganisms. This review focuses on the typical challenges of plant genomes that can explain why plant genomics is less developed than animal genomics. Explanations about the impact of some confounding factors emerging from the nature of plant genomes are given. As a result of these challenges and confounding factors, the correct assembly and annotation of plant genomes is hindered, genome drafts are produced, and advances in plant genomics are delayed. PMID:24832233

Claros, Manuel Gonzalo; Bautista, Rocío; Guerrero-Fernández, Darío; Benzerki, Hicham; Seoane, Pedro; Fernández-Pozo, Noé

2012-01-01

144

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence  

Microsoft Academic Search

BACKGROUND: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS)

Frank M You; Naxin Huo; Karin R Deal; Yong Q Gu; Ming-Cheng Luo; Patrick E McGuire; Jan Dvorak; Olin D Anderson

2011-01-01

145

Whole genome sequence (WGS) analysis for exploring plant relationships  

Microsoft Academic Search

Shotgun sequencing plant genomic DNA preparations generates large quantities of sequence data in a single run. Using the Illumina GAII, whole genome shotgun sequence (WGS) data were generated for Oryza sativa cv. Nipponbare, and the rice wild relatives Oryza meridionalis and Oryza australiensis. Two other grass species were also sequenced, Potamophila parviflora, from the Oryzeae tribe and Microlaena stipoides from

Nicole F Rice; Giovanni M Cordeiro; Catherine J Nock; Daniel LE Waters; Stirling Bowen; Robert J Henry

2010-01-01

146

Advances in understanding cancer genomes through second-generation sequencing  

Microsoft Academic Search

Cancers are caused by the accumulation of genomic alterations. Therefore, analyses of cancer genome sequences and structures provide insights for understanding cancer biology, diagnosis and therapy. The application of second-generation DNA sequencing technologies (also known as next-generation sequencing) — through whole-genome, whole-exome and whole-transcriptome approaches — is allowing substantial advances in cancer genomics. These methods are facilitating an increase in

Stacey Gabriel; Gad Getz; Matthew Meyerson

2010-01-01

147

Genome, Epigenome and RNA sequences of Monozygotic Twins Discordant for Multiple Sclerosis  

SciTech Connect

Neil Miller, Deputy Director of Software Engineering at the National Center for Genome Resources, discusses a monozygotic twin study on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

Miller, Neil [National Center for Genome Resources

2010-06-02

148

Whole Chloroplast Genome Sequencing in Fragaria Using Deep Sequencing: A Comparison of Three Methods  

Technology Transfer Automated Retrieval System (TEKTRAN)

Chloroplast sequences previously investigated in Fragaria revealed low amounts of variation. Deep sequencing technologies enable economical sequencing of complete chloroplast genomes. These sequences can potentially provide robust phylogenetic resolution, even at low taxonomic levels within plant gr...

149

NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins  

Microsoft Academic Search

The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) provides a non-redundant collection of sequences representing genomic data, transcripts and proteins. Although the goal is to provide a comprehensive dataset representing the complete sequence information for any given species, the database pragmatically includes sequence data that are currently publicly available in the archival databases. The database

Kim D. Pruitt; Tatiana A. Tatusova; Donna R. Maglott

2005-01-01

150

Ten years of bacterial genome sequencing: comparative-genomics-based discoveries  

Microsoft Academic Search

It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: “What have we learned from this vast amount of

Tim T. Binnewies; Yair Motro; Peter F. Hallin; Ole Lund; David Dunn; Tom La; David J. Hampson; Matthew Bellgard; Trudy M. Wassenaar; David W. Ussery

2006-01-01

151

Genome Sequence of the Pea Aphid Acyrthosiphon pisum  

E-print Network

Genome Sequence of the Pea Aphid Acyrthosiphon pisum. The International Aphid Genomics Consortium. ... we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple

Paris-Sud XI, Université de

152

The International Rice Genome Sequencing Project: progress and prospects  

Microsoft Academic Search

The rice genome sequencing project has been pursued as a national project in Japan since 1998. At the same time, a desire to accelerate the sequencing of the entire rice genome led to the formation of the International Rice Genome Sequencing Project (IRGSP), initially comprising five countries. The sequencing strategy is the conventional clone-by-clone shotgun method using P1-derived

T. Sasaki; T. Matsumoto; T. Baba; K. Yamamoto; J. Wu; Y. Katayose; K. Sakata

153

Initial sequencing and comparative analysis of the mouse genome  

SciTech Connect

The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

Waterston, Robert H.; Lindblad-Toh, Kerstin; Birney, Ewan; Rogers, Jane; Abril, Josep F.; Agarwal, Pankaj; Agarwala, Richa; Ainscough, Rachel; Alexandersson, Marina; An, Peter; Antonarakis, Stylianos E.; Attwood, John; Baertsch, Robert; Bailey, Jonathon; Barlow, Karen; Beck, Stephan; Berry, Eric; Birren, Bruce; Bloom, Toby; Bork, Peer; Botcherby, Marc; Bray, Nicolas; Brent, Michael R.; Brown, Daniel G.; Brown, Stephen D.; Bult, Carol; Burton, John; Butler, Jonathan; Campbell, Robert D.; Carninci, Piero; Cawley, Simon; Chiaromonte, Francesca; Chinwalla, Asif T.; Church, Deanna M.; Clamp, Michele; Clee, Christopher; Collins, Francis S.; Cook, Lisa L.; Copley, Richard R.; Coulson, Alan; Couronne, Olivier; Cuff, James; Curwen, Val; Cutts, Tim; Daly, Mark; David, Robert; Davies, Joy; Delehaunty, Kimberly D.; Deri, Justin; Dermitzakis, Emmanouil T.; Dewey, Colin; Dickens, Nicholas J.; Diekhans, Mark; Dodge, Sheila; Dubchak, Inna; Dunn, Diane M.; Eddy, Sean R.; Elnitski, Laura; Emes, Richard D.; Eswara, Pallavi; Eyras, Eduardo; Felsenfeld, Adam; Fewell, Ginger A.; Flicek, Paul; Foley, Karen; Frankel, Wayne N.; Fulton, Lucinda A.; Fulton, Robert S.; Furey, Terrence S.; Gage, Diane; Gibbs, Richard A.; Glusman, Gustavo; Gnerre, Sante; Goldman, Nick; Goodstadt, Leo; Grafham, Darren; Graves, Tina A.; Green, Eric D.; Gregory, Simon; Guigo, Roderic; Guyer, Mark; Hardison, Ross C.; Haussler, David; Hayashizaki, Yoshihide; Hillier, LaDeana W.; Hinrichs, Angela; Hlavina, Wratko; Holzer, Timothy; Hsu, Fan; Hua, Axin; Hubbard, Tim; Hunt, Adrienne; Jackson, Ian; Jaffe, David B.; Johnson, L. Steven; Jones, Matthew; Jones, Thomas A.; Joy, Ann; Kamal, Michael; Karlsson, Elinor K.; Karolchik, Donna; Kasprzyk, Arkadiusz; Kawai, Jun; Keibler, Evan; Kells, Cristyn; Kent, W. James; Kirby, Andrew; Kolbe, Diana L.; Korf, Ian; Kucherlapati, Raju S.; Kulbokas III, Edward J.; Kulp, David; Landers, Tom; Leger, J.P.; Leonard, Steven; Letunic, Ivica; Levine, Rosie; et al.

2002-12-15

154

A comparison of whole genome sequencing with exome sequencing for family-based association studies  

PubMed Central

As the cost of DNA sequencing decreases, association studies based on whole genome sequencing are now becoming feasible. It is still unclear, however, how much more we could gain from whole genome sequencing compared to exome sequencing, which has been widely used to study a variety of diseases. In this project, we performed a comparison between whole genome sequencing and exome sequencing for family-based association analysis using data from Genetic Analysis Workshop 18. Whole genome sequencing was able to identify several significant hits within intergenic regions. However, the increased cost of multiple testing counteracted the benefits and resulted in a higher false discovery rate. Our results suggest that exome sequencing is a cost-effective way to identify disease-related variants. With the decreasing sequencing cost and accumulating knowledge of the human genome, whole genome sequencing has the potential to identify important variants in regulatory regions typically inaccessible for exome sequencing. PMID:25519383

2014-01-01

155

[Science, communication and policy: sequencing the rice genome].  

PubMed

Nearly 4 years after the launch of the International Rice Genome Sequencing Project (IRGSP), the rice genome sequence is almost complete. This is the second plant genome after Arabidopsis thaliana, and one expects it to be more representative of other cereal genomes. Indeed, no more than 4 sequences have been independently reported as a result of tough competition between economy, politics and media. The efficiency and impact of this way of managing a large-scale project is questionable. This paper reports the various phases in sequencing the rice genome as well as what we are starting to learn. PMID:12836228

Delseny, Michel

2003-04-01

156

Statistical Properties of Open Reading Frames in Complete Genome Sequences  

Microsoft Academic Search

Some statistical properties of open reading frames in all currently available complete genome sequences are analyzed (17 prokaryotic genomes, and 16 chromosome sequences from the yeast genome). The size distribution of open reading frames is characterized by various techniques, such as quantile tables, QQ-plots, rank-size plots (Zipf's plots), and spatial densities. The issue of the influence of CG% on

Wentian Li

1999-01-01

157

Draft Genome Sequence of Geotrichum candidum Strain 3C  

PubMed Central

We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

Bobrov, Kirill S.; Eneyskaya, Elena V.; Kulminskaya, Anna A.

2014-01-01

158

Combined Evidence Annotation of Transposable Elements in Genome Sequences  

Microsoft Academic Search

Transposable elements (TEs) are mobile, repetitive sequences that make up significant fractions of metazoan genomes. Despite their near ubiquity and importance in genome and chromosome biology, most efforts to annotate TEs in genome sequences rely on the results of a single computational program, RepeatMasker. In contrast, recent advances in gene annotation indicate that high-quality gene models can be produced from

Hadi Quesneville; Olivier Andrieu; Delphine Autard; Danielle Nouaud; Michael Ashburner; Dominique Anxolabehere

2005-01-01

159

Genome Sequence of Pediococcus pentosaceus Strain IE-3  

PubMed Central

We report the 1.8-Mb genome sequence of Pediococcus pentosaceus strain IE-3, isolated from a dairy effluent sample. The whole-genome sequence of this strain will aid in comparative genomics of Pediococcus pentosaceus strains of diverse ecological origins and their biotechnological applications. PMID:22843596

Midha, Samriti; Ranjan, Manish; Sharma, Vikas; Kumari, Annu; Singh, Pradip Kumar

2012-01-01

160

Draft Genome Sequence of Geotrichum candidum Strain 3C.  

PubMed

We report here the draft genome sequence of Geotrichum candidum strain 3C, which is a filamentous yeast-like fungus that holds great promise for biotechnology. The genome was sequenced using Ion Torrent and 454 platforms. The estimated genome size was 41.4 Mb, and 14,579 protein-coding genes were predicted ab initio. PMID:25278525

Polev, Dmitrii E; Bobrov, Kirill S; Eneyskaya, Elena V; Kulminskaya, Anna A

2014-01-01

161

Complete Genome Sequence of the Embu Virus Strain SPAn880  

PubMed Central

We report the complete genome sequence of the Embu virus. The genome consists of 185,139 bp and is nearly identical to that of the Cotia virus. This is the first report on the Embu virus genome sequence, which has been considered an unclassified poxvirus until now. PMID:25477400

Antwerpen, Markus; Georgi, Enrico; Vette, Philipp; Zoeller, Gudrun; Meyer, Hermann

2014-01-01

162

Simple sequence repeats in bryophyte mitochondrial genomes.  

PubMed

Simple sequence repeats (SSRs) are thought to be common in plant mitochondrial (mt) genomes, but have yet to be fully described for bryophytes. We screened the mt genomes of two liverworts (Marchantia polymorpha and Pleurozia purpurea), two mosses (Physcomitrella patens and Anomodon rugelii) and two hornworts (Phaeoceros laevis and Nothoceros aenigmaticus), and detected 475 SSRs. Some SSRs are conserved during evolution; of these, one occurs in both liverworts and mosses, while all others are shared only by the two liverworts, the two mosses or the two hornworts. SSRs are known as DNA tracts with high mutation rates; however, according to our observations, they can still evolve slowly. The conservation of these SSRs suggests that they are under strong selection and could play critical roles in maintaining gene functions. PMID:24491104

Zhao, Chao-Xian; Zhu, Rui-Liang; Liu, Yang

2014-02-01

163

Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences  

Microsoft Academic Search

During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have

Robert K. Jansen; Linda A. Raubeson; Jeffrey L. Boore; Claude W. dePamphilis; Timothy W. Chumley; Rosemarie C. Haberle; Stacia K. Wyman; Andrew J. Alverson; Rhiannon Peery; Sallie J. Herman; H. Matthew Fourcade; Jennifer V. Kuehl; Joel R. McNeal; James Leebens-Mack; Liying Cui

2005-01-01

164

Initial sequencing and comparative analysis of the mouse genome  

E-print Network

... and knockin techniques 17-22. For these and other reasons, the Human Genome Project (HGP) recognized from its ... The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome ... comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from

Eddy, Sean

165

Genomic Sequence Comparisons, 1987-2003 Final Report  

SciTech Connect

This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988), and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

George M. Church

2004-07-29

166

University of Tokyo-Institute of Medical Science: Human Genome Center  

NSDL National Science Digital Library

The Human Genome Center was established in 1991 at the University of Tokyo's Institute of Medical Science. In pursuit of progress in the areas of human disease diagnosis, care, and prevention, the Center conducts genome research in Japan and participates in "international activities in database construction, mapping, and sequencing of the human genome." The Genome Center website contains links to its nine Laboratories which conduct research in the following areas: Genome Structure, Sequence Analysis, Molecular Medicine, and DNA Information Analysis, to name a few. Laboratory pages contain information about research, publications, staff, and services. The Center site also links to a number of databases and software tools including a database of Japanese Single Nucleotide Polymorphisms (JSNP), Microbial Genome Database for Comparative Analysis (MBGD), PSI-BLAST, TFBIND (software for searching transcription factor binding sites), and more.

167

Current challenges in de novo plant genome sequencing and assembly  

PubMed Central

Genome sequencing is now affordable, but assembling plant genomes de novo remains challenging. We assess the state of the art of assembly and review the best practices for the community. PMID:22546054

2012-01-01

168

Initial impact of the sequencing of the human genome  

E-print Network

The sequence of the human genome has dramatically accelerated biomedical research. Here I explore its impact, in the decade since its publication, on our understanding of the biological functions encoded in the genome, on ...

Massachusetts Institute of Technology. Department of Biology; Broad Institute of MIT and Harvard; Lander, Eric S.

169

Genome sequencing of the important oilseed crop Sesamum indicum L  

PubMed Central

The Sesame Genome Working Group (SGWG) has been formed to sequence and assemble the sesame (Sesamum indicum L.) genome. The status of this project and our planned analyses are described. PMID:23369264

2013-01-01

170

Fast and Sensitive Alignment of Large Genomic Sequences  

Microsoft Academic Search

Comparative analysis of syntenic genome sequences can be used to identify functional sites such as exons and regulatory elements. Here, the first step is to align two or several evolutionarily related sequences and, in recent years, a number of computer programs have been developed for alignment of large genomic sequences. Some of these programs are extremely fast but often time-efficiency

Michael Brudno; Burkhard Morgenstern

2002-01-01

171

Digital Signal Processing in the Analysis of Genomic Sequences  

Microsoft Academic Search

Digital Signal Processing (DSP) applications in Bioinformatics have received great attention in recent years, where new effective methods for genomic sequence analysis, such as the detection of coding regions, have been developed. The use of DSP principles to analyze genomic sequences requires defining an adequate representation of the nucleotide bases by numerical values, converting the nucleotide sequences into

Juan V. Lorenzo-Ginori; Aníbal Rodríguez-Fuentes; Ricardo Grau Ábalo; Robersy Sánchez

2009-01-01
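
The numerical-representation step described in the abstract above can be illustrated with binary indicator sequences, one widely used mapping. The abstract is truncated before it names any particular scheme, so this choice is an assumption, not necessarily the authors' method. The sketch below uses only the Python standard library; the function names and the toy sequences are hypothetical. It sums the spectral power of the four indicator sequences at period 3 (DFT index N/3), a quantity often used as a coding-region indicator.

```python
import cmath


def indicator_sequence(seq, base):
    """Binary indicator: 1.0 where seq has `base`, else 0.0 (assumed mapping)."""
    return [1.0 if c == base else 0.0 for c in seq.upper()]


def dft_coefficient(x, k):
    """Single DFT coefficient X[k] of the real sequence x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(x))


def period3_power(seq):
    """Sum over A, C, G, T of |X_base[N/3]|^2 for a sequence of length N."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    k = n // 3  # DFT index corresponding to a period of 3 bases
    return sum(abs(dft_coefficient(indicator_sequence(seq, b), k)) ** 2
               for b in "ACGT")


# A strictly 3-periodic toy sequence versus a homopolymer with no period-3 signal.
print(period3_power("ATG" * 10))
print(period3_power("A" * 30))
```

The periodic toy sequence gives a large value and the homopolymer a value near zero. Real analyses typically slide a window along the sequence and compare the period-3 power against the background spectrum.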

172

Next Generation Sequencing at the University of Chicago Genomics Core  

SciTech Connect

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

Faber, Pieter [University of Chicago

2013-04-24

173

Validation of rice genome sequence by optical mapping  

Microsoft Academic Search

BACKGROUND: Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data. RESULTS: To facilitate ongoing sequencing finishing and validation

Shiguo Zhou; Michael C Bechner; Chris P Churas; Louise Pape; Sally A Leong; Rod Runnheim; Dan K Forrest; Steve Goldstein; Miron Livny; David C Schwartz

2007-01-01

174

Genome variation discovery with high-throughput sequencing data  

E-print Network

... require extensive computational analysis in order to identify genomic variants present in the sequenced ... and to understand the regulation of genes by sequencing chromatin immunoprecipitation products (ChIP-Seq) [12 ... representative of the species (the reference), while an HTS technology is used to sequence reads from the genome

Toronto, University of

175

Genome Sequence and Analysis of a Propionibacterium acnes Bacteriophage  

PubMed Central

Cutaneous propionibacteria are important commensals of human skin and are implicated in a wide range of opportunistic infections. Propionibacterium acnes is also associated with inflammatory acne vulgaris. Bacteriophage PA6 is the first phage of P. acnes to be sequenced and demonstrates a high degree of similarity to many mycobacteriophages both morphologically and genetically. PA6 possesses an icosahedral head and long noncontractile tail characteristic of the Siphoviridae. The overall genome organization of PA6 resembled that of the temperate mycobacteriophages, although the genome was much smaller, 29,739 bp (48 predicted genes), compared to, for example, 50,550 bp (86 predicted genes) for the Bxb1 genome. PA6 infected only P. acnes and produced clear plaques with turbid centers, but it lacked any obvious genes for lysogeny. The host range of PA6 was restricted to P. acnes, but the phage was able to infect and lyse all P. acnes isolates tested. Sequencing of the PA6 genome makes an important contribution to the study of phage evolution and propionibacterial genetics. PMID:17400737

Farrar, Mark D.; Howson, Karen M.; Bojar, Richard A.; West, David; Towler, James C.; Parry, James; Pelton, Katharine; Holland, Keith T.

2007-01-01

176

Complete genome sequence of Arcanobacterium haemolyticum type strain (11018T)  

SciTech Connect

Vulcanisaeta distributa Itoh et al. 2002 belongs to the family Thermoproteaceae in the phylum Crenarchaeota. The genus Vulcanisaeta is characterized by a global distribution in hot and acidic springs. This is the first genome sequence from a member of the genus Vulcanisaeta and seventh genome sequence in the family Thermoproteaceae. The 2,374,137 bp long genome with its 2,544 protein-coding and 49 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Teshima, Hazuki [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany

2010-01-01

177

Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions fr...

178

Next-generation sequencing strategies for characterizing the turkey genome.  

PubMed

The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry. PMID:24570472

Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

2014-02-01

179

Applications of next-generation sequencing technologies in functional genomics  

Microsoft Academic Search

A new generation of sequencing technologies, from Illumina/Solexa, ABI/SOLiD, 454/Roche, and Helicos, has provided unprecedented opportunities for high-throughput functional genomic research. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing, discovery of transcription factor binding sites, and noncoding RNA expression profiling. This review discusses applications of next-generation sequencing technologies in functional genomics

Olena Morozova; Marco A. Marra

2008-01-01

180

Comparative DNA Sequence Analysis of Wheat and Rice Genomes  

PubMed Central

The use of DNA sequence-based comparative genomics for evolutionary studies and for transferring information from model species to crop species has revolutionized molecular genetics and crop improvement strategies. This study compared 4485 expressed sequence tags (ESTs) that were physically mapped in wheat chromosome bins, to the public rice genome sequence data from 2251 ordered BAC/PAC clones using BLAST. A rice genome view of homologous wheat genome locations based on comparative sequence analysis revealed numerous chromosomal rearrangements that will significantly complicate the use of rice as a model for cross-species transfer of information in nonconserved regions. PMID:12902377

Sorrells, Mark E.; La Rota, Mauricio; Bermudez-Kandianis, Catherine E.; Greene, Robert A.; Kantety, Ramesh; Munkvold, Jesse D.; Miftahudin; Mahmoud, Ahmed; Ma, Xuefeng; Gustafson, Perry J.; Qi, Lili L.; Echalier, Benjamin; Gill, Bikram S.; Matthews, David E.; Lazo, Gerard R.; Chao, Shiaoman; Anderson, Olin D.; Edwards, Hugh; Linkiewicz, Anna M.; Dubcovsky, Jorge; Akhunov, Eduard D.; Dvorak, Jan; Zhang, Deshui; Nguyen, Henry T.; Peng, Junhua; Lapitan, Nora L.V.; Gonzalez-Hernandez, Jose L.; Anderson, James A.; Hossain, Khwaja; Kalavacharla, Venu; Kianian, Shahryar F.; Choi, Dong-Woog; Close, Timothy J.; Dilbirligi, Muharrem; Gill, Kulvinder S.; Steber, Camille; Walker-Simmons, Mary K.; McGuire, Patrick E.; Qualset, Calvin O.

2003-01-01

181

Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus  

E-print Network

... genome sequences and make comparisons (within angiosperms, seed plants, ... genome sequence from Korean Ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. ... genomes of all other vascular plant taxa examined, a similar sequence

2007-01-01

182

Sequencing and Assembly of the 22-Gb Loblolly Pine Genome  

PubMed Central

Conifers are the predominant gymnosperm. The size and complexity of their genomes has presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer “super-reads,” rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp. PMID:24653210

Zimin, Aleksey; Stevens, Kristian A.; Crepeau, Marc W.; Holtz-Morris, Ann; Koriabine, Maxim; Marçais, Guillaume; Puiu, Daniela; Roberts, Michael; Wegrzyn, Jill L.; de Jong, Pieter J.; Neale, David B.; Salzberg, Steven L.; Yorke, James A.; Langley, Charles H.

2014-01-01

183

Mapping the Human Reference Genome’s Missing Sequence by Three-Way Admixture in Latino Genomes  

PubMed Central

A principal obstacle to completing maps and analyses of the human genome involves the genome’s “inaccessible” regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)—a substantial fraction of the human genome’s remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

Genovese, Giulio; Handsaker, Robert E.; Li, Heng; Kenny, Eimear E.; McCarroll, Steven A.

2013-01-01

184

BorreliaBase: a phylogeny-centered browser of Borrelia genomes  

PubMed Central

Background: The bacterial genus Borrelia (phylum Spirochaetes) consists of two groups of pathogens represented respectively by B. burgdorferi, the agent of Lyme borreliosis, and B. hermsii, the agent of tick-borne relapsing fever. The number of publicly available Borrelia genomic sequences is growing rapidly with the discovery and sequencing of Borrelia strains worldwide. There is, however, a lack of dedicated online databases to facilitate comparative analyses of Borrelia genomes. Description: We have developed BorreliaBase, an online database for comparative browsing of Borrelia genomes. The database is currently populated with sequences from 35 genomes of eight Lyme-borreliosis (LB) group Borrelia species and 7 Relapsing-fever (RF) group Borrelia species. Distinct from genome repositories and aggregator databases, BorreliaBase serves manually curated comparative-genomic data including genome-based phylogeny, genome synteny, and sequence alignments of orthologous genes and intergenic spacers. Conclusions: With a genome phylogeny at its center, BorreliaBase allows online identification of hypervariable lipoprotein genes, potential regulatory elements, and recombination footprints by providing evolution-based expectations of sequence variability at each genomic locus. The phylo-centric design of BorreliaBase (http://borreliabase.org) is a novel model for interactive browsing and comparative analysis of bacterial genomes online. PMID:24994456

2014-01-01

185

Genome scanning : an AFM-based DNA sequencing technique  

E-print Network

Genome Scanning is a powerful new technique for DNA sequencing. The method presented in this thesis uses an atomic force microscope with a functionalized cantilever tip to sequence single stranded DNA immobilized to a mica ...

Elmouelhi, Ahmed (Ahmed M.), 1979-

2003-01-01

186

Insights from 20 years of bacterial genome sequencing.  

PubMed

Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genome sequences is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families.
Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them. PMID:25722247

Land, Miriam; Hauser, Loren; Jun, Se-Ran; Nookaew, Intawat; Leuze, Michael R; Ahn, Tae-Hyuk; Karpinets, Tatiana; Lund, Ole; Kora, Guruprased; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W

2015-03-01

187

Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome  

Microsoft Academic Search

Background: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D.

Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

2002-01-01

188

De Novo Next Generation Sequencing of Plant Genomes  

Microsoft Academic Search

The genome sequencing of all major food and bioenergy crops is of critical importance in the race to improve crop production to meet the future food and energy security needs of the world. Next generation sequencing technologies have brought about great improvements in sequencing throughput and cost, but do not yet allow for de novo sequencing of large repetitive genomes

Steve Rounsley; Pradeep Reddy Marri; Yeisoo Yu; Ruifeng He; Nick Sisneros; Jose Luis Goicoechea; So Jeong Lee; Angelina Angelova; Dave Kudrna; Meizhong Luo; Jason Affourtit; Brian Desany; James Knight; Faheem Niazi; Michael Egholm; Rod A. Wing

2009-01-01

189

Volatiles from nineteen recently genome sequenced actinomycetes.  

PubMed

The volatiles released by agar plate cultures of nineteen actinomycetes whose genomes were recently sequenced were collected by use of a closed-loop stripping apparatus (CLSA) and analysed by GC/MS. In total, 178 compounds from various classes were identified. The most interesting findings were the detection of the insect pheromone frontalin in Streptomyces varsoviensis, and the emission of the unusual plant metabolite 1-nitro-2-phenylethane. Its biosynthesis from phenylalanine was investigated in isotopic labelling experiments. Furthermore, the identified terpenes were correlated to the information about terpene cyclase homologs encoded in the investigated strains. The analytical data were in line with functionally characterised bacterial terpene cyclases and particularly corroborated the recently suggested function of a terpene cyclase from Streptomyces violaceusniger by the identification of a functional homolog in Streptomyces rapamycinicus. PMID:25585196

Citron, Christian A; Barra, Lena; Wink, Joachim; Dickschat, Jeroen S

2015-02-18

190

Genome Project Standards in a New Era of Sequencing  

SciTech Connect

For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft', however these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences.
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.

GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

2009-06-01

191

Ten years of bacterial genome sequencing: comparative-genomics-based discoveries.  

PubMed

It has been more than 10 years since the first bacterial genome sequence was published. Hundreds of bacterial genome sequences are now available for comparative genomics, and searching a given protein against more than a thousand genomes will soon be possible. The subject of this review will address a relatively straightforward question: "What have we learned from this vast amount of new genomic data?" Perhaps one of the most important lessons has been that genetic diversity, at the level of large-scale variation amongst even genomes of the same species, is far greater than was thought. The classical textbook view of evolution relying on the relatively slow accumulation of mutational events at the level of individual bases scattered throughout the genome has changed. One of the most obvious conclusions from examining the sequences from several hundred bacterial genomes is the enormous amount of diversity--even in different genomes from the same bacterial species. This diversity is generated by a variety of mechanisms, including mobile genetic elements and bacteriophages. An examination of the 20 Escherichia coli genomes sequenced so far dramatically illustrates this, with the genome size ranging from 4.6 to 5.5 Mbp; much of the variation appears to be of phage origin. This review also addresses mobile genetic elements, including pathogenicity islands and the structure of transposable elements. There are at least 20 different methods available to compare bacterial genomes. Metagenomics offers the chance to study genomic sequences found in ecosystems, including genomes of species that are difficult to culture. It has become clear that a genome sequence represents more than just a collection of gene sequences for an organism and that information concerning the environment and growth conditions for the organism are important for interpretation of the genomic data. 
The newly proposed Minimal Information about a Genome Sequence standard has been developed to obtain this information. PMID:16773396

Binnewies, Tim T; Motro, Yair; Hallin, Peter F; Lund, Ole; Dunn, David; La, Tom; Hampson, David J; Bellgard, Matthew; Wassenaar, Trudy M; Ussery, David W

2006-07-01

192

Finishing The Euchromatic Sequence Of The Human Genome  

SciTech Connect

The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

Rubin, Edward M.; Lucas, Susan; Richardson, Paul; Rokhsar, Daniel; Pennacchio, Len

2004-09-07

193

Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes.  

PubMed

A principal obstacle to completing maps and analyses of the human genome involves the genome's "inaccessible" regions: sequences (often euchromatic and containing genes) that are isolated from the rest of the euchromatic genome by heterochromatin and other repeat-rich sequence. We describe a way to localize these sequences by using ancestry linkage disequilibrium in populations that derive ancestry from at least three continents, as is the case for Latinos. We used this approach to map the genomic locations of almost 20 megabases of sequence unlocalized or missing from the current human genome reference (NCBI Genome GRCh37)-a substantial fraction of the human genome's remaining unmapped sequence. We show that the genomic locations of most sequences that originated from fosmids and larger clones can be admixture mapped in this way, by using publicly available whole-genome sequence data. Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions. PMID:23932108

Genovese, Giulio; Handsaker, Robert E; Li, Heng; Kenny, Eimear E; McCarroll, Steven A

2013-09-01

194

Research ethics and the challenge of whole-genome sequencing  

Microsoft Academic Search

The recent completion of the first two individual whole-genome sequences is a research milestone. As personal genome research advances, investigators and international research bodies must ensure ethical research conduct. We identify three major ethical considerations that have been implicated in whole-genome research: the return of research results to participants; the obligations, if any, that are owed to participants' relatives; and

Amy L. McGuire; Mildred K. Cho; Timothy Caulfield

2007-01-01

195

Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum.  

PubMed

Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches. PMID:25883287

Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

2015-01-01

196

Genome Sequence of Aedes aegypti, a Major Arbovirus Vector  

Microsoft Academic Search

We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at ~1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of ~4 to

Vishvanath Nene; Jennifer R. Wortman; Daniel Lawson; Brian Haas; Chinnappa Kodira; Z. Tu; Brendan Loftus; Zhiyong Xi; Karyn Megy; Manfred Grabherr; Quinghu Ren; E. M. Zdobnov; N. F. Lobo; K. S. Campbell; S. E. Brown; M. F. Bonaldo; Jingsong Zhu; S. P. Sinkins; D. G. Hogenkamp; Paolo Amedeo; Peter Arensburger; P. W. Atkinson; Shelby Bidwell; Jim Biedler; Ewan Birney; Robert V. Bruggner; Javier Costas; M. R. Coy; Jonathan Crabtree; Matt Crawford; Becky deBruyn; David DeCaprio; Karin Eiglmeier; Eric Eisenstadt; Hamza El-Dorry; W. M. Gelbart; S. L. Gomes; Martin Hammond; Linda I. Hannick; M. H. Holmes; J. R. Hogan; David Jaffe; J. S. Johnston; R. C. Kennedy; Hean Koo; Saul Kravitz; Evgenia V. Kriventseva; David Kulp; Kurt LaButti; Eduardo Lee; Song Li; Diane D. Lovin; Chunhong Mao; Evan Mauceli; C. F. M. Menck; J. R. Miller; Philip Montgomery; Akio Mori; A. L. Nascimento; H. F. Naveira; Chad Nusbaum; S. O'Leary; Joshua Orvis; Mihaela Pertea; Hadi Quesneville; K. R. Reidenbach; Yu-Hui Rogers; C. W. Roth; J. R. Schneider; Michael Schatz; Martin Shumway; Mario Stanke; E. O. Stinson; J. M. C. Tubio; J. P. VanZee; Sergio Verjovski-Almeida; Doreen Werner; Owen White; Stefan Wyder; Qiandong Zeng; Qi Zhao; Yongmei Zhao; C. A. Hill; A. S. Raikhel; M. B. Soares; D. L. Knudson; N. H. Lee; James Galagan; S. L. Salzberg; I. T. Paulsen; George Dimopoulos; F. H. Collins; Bruce Birren; C. M. Fraser-Liggett; D. W. Severson

2007-01-01

197

Complete Genome Sequence of the Mesoplasma florum W37 Strain  

PubMed Central

Mesoplasma florum is a small-genome fast-growing mollicute that is an attractive model for systems and synthetic genomics studies. We report the complete 825,824-bp genome sequence of a second representative of this species, M. florum strain W37, which contains 733 predicted open reading frames and 35 stable RNAs. PMID:24285658

Baby, Vincent; Matteau, Dominick; Knight, Thomas F.

2013-01-01

198

Genome Sequence of Brevibacillus laterosporus Strain GI-9  

PubMed Central

We report the 5.18-Mb genome sequence of Brevibacillus laterosporus strain GI-9, isolated from a subsurface soil sample during a screen for novel strains producing antimicrobial compounds. The draft genome of this strain will aid in biotechnological exploitation and comparative genomics of Brevibacillus laterosporus strains. PMID:22328768

Sharma, Vikas; Singh, Pradip K.; Midha, Samriti; Ranjan, Manish

2012-01-01

199

Genome Sequence of Xanthomonas axonopodis pv. punicae Strain LMG 859  

PubMed Central

We report the 4.94-Mb genome sequence of Xanthomonas axonopodis pv. punicae strain LMG 859, the causal agent of bacterial leaf blight disease in pomegranate. The draft genome will aid in comparative genomics, epidemiological studies, and quarantine of this devastating phytopathogen. PMID:22493202

Sharma, Vikas; Midha, Samriti; Ranjan, Manish; Pinnaka, Anil Kumar

2012-01-01

200

Genome Sequence of the Rice Pathogen Pseudomonas fuscovaginae CB98818  

PubMed Central

Pseudomonas fuscovaginae is a phytopathogenic bacterium causing bacterial sheath brown rot of cereal crops. Here, we present the draft genome sequence of P. fuscovaginae CB98818, originally isolated from a diseased rice plant in China. The draft genome will aid in epidemiological studies, comparative genomics, and quarantine of this broad-host-range pathogen. PMID:22965098

Xie, Guanlin; Cui, Zhouqi; Tao, Zhongyun; Qiu, Hui; Liu, He; Zhu, Bo; Jin, Gulei; Sun, Guochang; Almoneafy, Abdulwareth

2012-01-01

201

Genome sequence of the rice pathogen Pseudomonas fuscovaginae CB98818.  

PubMed

Pseudomonas fuscovaginae is a phytopathogenic bacterium causing bacterial sheath brown rot of cereal crops. Here, we present the draft genome sequence of P. fuscovaginae CB98818, originally isolated from a diseased rice plant in China. The draft genome will aid in epidemiological studies, comparative genomics, and quarantine of this broad-host-range pathogen. PMID:22965098

Xie, Guanlin; Cui, Zhouqi; Tao, Zhongyun; Qiu, Hui; Liu, He; Ibrahim, Muhammad; Zhu, Bo; Jin, Gulei; Sun, Guochang; Almoneafy, Abdulwareth; Li, Bin

2012-10-01

202

Identification of Candidate Drosophila Olfactory Receptors from Genomic DNA Sequence  

Microsoft Academic Search

We have taken advantage of the availability of a large amount of Drosophila genomic DNA sequence in the Berkeley Drosophila Genome Project database (~1/5 of the genome) to identify a family of novel seven transmembrane domain encoding genes that are putative Drosophila olfactory receptors. Members of the family are expressed in distinct subsets of olfactory neurons, and certain family members

Qian Gao; Andrew Chess

1999-01-01

203

Finished Genome Sequence of Collimonas arenae Cal35.  

PubMed

We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

Wu, Je-Jia; de Jager, Victor C L; Deng, Wen-Ling; Leveau, Johan H J

2015-01-01

204

SEQUENCING THE PIG GENOME USING A BAC BY BAC APPROACH  

Technology Transfer Automated Retrieval System (TEKTRAN)

We have generated a highly contiguous physical map covering >98% of the pig genome in just 176 contigs. The map is localized to the genome through integration with the UIVC RH map as well as BAC end sequence alignments to the human genome. Over 265k HindIII restriction digest fingerprints totaling 16.2...

205

On the sequencing of the human genome Robert H. Waterston*  

E-print Network

... The international Human Genome Project (HGP) used the hierarchical shotgun approach, whereas Celera Genomics ... One was the product of the international Human Genome Project (HGP), and the other was the product ... On the sequencing of the human genome. Robert H. Waterston*, Eric S. Lander, and John E. Sulston

Batzoglou, Serafim

206

Finished Genome Sequence of Collimonas arenae Cal35  

PubMed Central

We announce the finished genome sequence of soil forest isolate Collimonas arenae Cal35, which comprises a 5.6-Mbp chromosome and 41-kb plasmid. The Cal35 genome is the second one published for the bacterial genus Collimonas and represents the first opportunity for high-resolution comparison of genome content and synteny among collimonads. PMID:25573943

Wu, Je-Jia; de Jager, Victor C. L.; Deng, Wen-Ling

2015-01-01

207

Whole-genome sequencing and variant discovery in C. elegans  

Microsoft Academic Search

Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the data and to evaluate coverage

LaDeana W Hillier; Gabor T Marth; Aaron R Quinlan; David Dooling; Ginger Fewell; Derek Barnett; Paul Fox; Jarret I Glasscock; Matthew Hickenbotham; Weichun Huang; Vincent J Magrini; Ryan J Richt; Sacha N Sander; Donald A Stewart; Michael Stromberg; Eric F Tsung; Todd Wylie; Tim Schedl; Richard K Wilson; Elaine R Mardis

2008-01-01

208

Accurate whole human genome sequencing using reversible terminator chemistry  

Microsoft Academic Search

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation.

David R. Bentley; Shankar Balasubramanian; Harold P. Swerdlow; Geoffrey P. Smith; John Milton; Clive G. Brown; Kevin P. Hall; Dirk J. Evers; Colin L. Barnes; Helen R. Bignell; Jonathan M. Boutell; Jason Bryant; Richard J. Carter; R. Keira Cheetham; Anthony J. Cox; Darren J. Ellis; Michael R. Flatbush; Niall A. Gormley; Sean J. Humphray; Leslie J. Irving; Mirian S. Karbelashvili; Scott M. Kirk; Heng Li; Xiaohai Liu; Klaus S. Maisinger; Lisa J. Murray; Bojan Obradovic; Tobias Ost; Michael L. Parkinson; Mark R. Pratt; Isabelle M. J. Rasolonjatovo; Mark T. Reed; Roberto Rigatti; Chiara Rodighiero; Mark T. Ross; Andrea Sabot; Subramanian V. Sankar; Aylwyn Scally; Gary P. Schroth; Mark E. Smith; Vincent P. Smith; Anastassia Spiridou; Peta E. Torrance; Svilen S. Tzonev; Eric H. Vermaas; Klaudia Walter; Xiaolin Wu; Lu Zhang; Mohammed D. Alam; Carole Anastasi; Ify C. Aniebo; David M. D. Bailey; Iain R. Bancarz; Saibal Banerjee; Selena G. Barbour; Primo A. Baybayan; Vincent A. Benoit; Kevin F. Benson; Claire Bevis; Phillip J. Black; Asha Boodhun; Joe S. Brennan; John A. Bridgham; Rob C. Brown; Andrew A. Brown; Dale H. Buermann; Abass A. Bundu; James C. Burrows; Nigel P. Carter; Nestor Castillo; Maria Chiara E. Catenazzi; Simon Chang; R. Neil Cooley; Natasha R. Crake; Olubunmi O. Dada; Konstantinos D. Diakoumakos; Belen Dominguez-Fernandez; David J. Earnshaw; Ugonna C. Egbujor; David W. Elmore; Sergey S. Etchin; Mark R. Ewan; Milan Fedurco; Louise J. Fraser; Karin V. Fuentes Fajardo; W. Scott Furey; David George; Kimberley J. Gietzen; Colin P. Goddard; George S. Golda; Philip A. Granieri; David L. Gustafson; Nancy F. Hansen; Kevin Harnish; Christian D. Haudenschild; Narinder I. Heyer; Matthew M. Hims; Johnny T. Ho; Adrian M. Horgan; Katya Hoschler; Steve Hurwitz; Denis V. Ivanov; Maria Q. Johnson; Terena James; T. A. Huw Jones; Gyoung-Dong Kang; Tzvetana H. Kerelska; Alan D. Kersey; Irina Khrebtukova; Alex P. Kindwall; Zoya Kingsbury; Paula I. 
Kokko-Gonzales; Anil Kumar; Marc A. Laurent; Cynthia T. Lawley; Sarah E. Lee; Xavier Lee; Arnold K. Liao; Jennifer A. Loch; Mitch Lok; Shujun Luo; Radhika M. Mammen; John W. Martin; Patrick G. McCauley; Paul McNitt; Parul Mehta; Keith W. Moon; Joe W. Mullens; Taksina Newington; Zemin Ning; Bee Ling Ng; Sonia M. Novo; Mark A. Osborne; Andrew Osnowski; Omead Ostadan; Lambros L. Paraschos; Lea Pickering; Andrew C. Pike; D. Chris Pinkard; Daniel P. Pliskin; Joe Podhasky; Victor J. Quijano; Come Raczy; Vicki H. Rae; Stephen R. Rawlings; Ana Chiva Rodriguez; Phyllida M. Roe; John Rogers; Maria C. Rogert Bacigalupo; Nikolai Romanov; Anthony Romieu; Rithy K. Roth; Natalie J. Rourke; Silke T. Ruediger; Eli Rusman; Raquel M. Sanches-Kuiper; Martin R. Schenker; Josefina M. Seoane; Richard J. Shaw; Mitch K. Shiver; Steven W. Short; Ning L. Sizto; Johannes P. Sluis; Melanie A. Smith; Jean Ernest Sohna Sohna; Eric J. Spence; Kim Stevens; Neil Sutton; Lukasz Szajkowski; Carolyn L. Tregidgo; Gerardo Turcatti; Stephanie vandeVondele; Yuli Verhovsky; Selene M. Virk; Suzanne Wakelin; Gregory C. Walcott; Jingwen Wang; Graham J. Worsley; Juying Yan; Ling Yau; Mike Zuerlein; Jane Rogers; James C. Mullikin; Matthew E. Hurles; Nick J. McCooke; John S. West; Frank L. Oaks; Peter L. Lundberg; David Klenerman; Richard Durbin; Anthony J. Smith

2008-01-01

209

Genome Wide Characterization of Simple Sequence Repeats in Cucumber  

Technology Transfer Automated Retrieval System (TEKTRAN)

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

210

PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly  

E-print Network

PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly. Xing Liu, Student Member ... subsequences from the reads. With Next Generation Sequencing (NGS) technologies, assembly software needs ... sequence. We focus here on de novo assembly, where no reference sequence aids the reconstruction. Next ...

Bader, David A.

211

On the current status of Phakopsora pachyrhizi genome sequencing  

PubMed Central

Recent advances in sequencing technologies and bioinformatics allow more rapid access to the genomes of non-model organisms at falling costs. Accordingly, draft genomes of several economically important cereal rust fungi have been released in the last 3 years. Aside from the very recent flax rust and poplar rust draft assemblies, there are no genomic data available for other dicot-infecting rust fungi. In this article we outline rust fungus sequencing efforts and comment on the current status of Phakopsora pachyrhizi (Asian soybean rust) genome sequencing. PMID:25221558

Loehrer, Marco; Vogel, Alexander; Huettel, Bruno; Reinhardt, Richard; Benes, Vladimir; Duplessis, Sébastien; Usadel, Björn; Schaffrath, Ulrich

2014-01-01

212

Draft Genome Sequence of Bordetella trematum Strain HR18  

PubMed Central

Members of the genus Bordetella have been reported as human or animal pathogens and as environmental microbes. We report the draft genome sequence of Bordetella trematum strain HR18, which was isolated from the rumen of Korean native cattle (Hanwoo; Bos taurus coreanae). It is the first genome sequence of a Bordetella sp. isolated from the rumen of cattle. PMID:25573930

Chang, Dong-Ho; Jin, Tae-Eun; Rhee, Moon-Soo; Jeong, Haeyoung; Kim, Seil

2015-01-01

213

Draft Genome Sequence of Kocuria rhizophila P7-4  

PubMed Central

We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

2011-01-01

214

Complete Genome Sequence of Magnetospirillum gryphiswaldense MSR-1  

PubMed Central

We report the complete genomic sequence of Magnetospirillum gryphiswaldense MSR-1 (DSM 6361), a type strain of the genus Magnetospirillum belonging to the Alphaproteobacteria. Compared to the reported draft sequence, extensive rearrangements and differences were found, indicating high genomic flexibility and “domestication” by accelerated evolution of the strain upon repeated passaging. PMID:24625872

Wang, Xu; Wang, Qing; Zhang, Weijia; Wang, Yinjia; Li, Li; Wen, Tong; Zhang, Tongwei; Zhang, Yang; Xu, Jun; Hu, Junying; Li, Shuqi; Liu, Lingzi; Liu, Jinxin; Jiang, Wei; Tian, Jiesheng; Wang, Lei; Li, Jilun

2014-01-01

215

Draft Genome Sequence of Neurospora crassa Strain FGSC 73  

PubMed Central

We report the elucidation of the complete genome of the Neurospora crassa (Shear and Dodge) strain FGSC 73, a mat-a, trp-3 mutant strain. The genome sequence around the idiotypic mating type locus represents the only publicly available sequence for a mat-a strain. 40.42 Megabases are assembled into 358 scaffolds carrying 11,978 gene models. PMID:25838471

Schackwitz, Wendy; Lipzen, Anna; Martin, Joel; Haridas, Sajeet; LaButti, Kurt; Grigoriev, Igor V.; Simmons, Blake A.

2015-01-01

216

Draft Genome Sequence of the Fish Pathogen Piscirickettsia salmonis  

PubMed Central

Piscirickettsia salmonis is a Gram-negative intracellular fish pathogen that has a significant impact on the salmon industry. Here, we report the genome sequence of P. salmonis strain LF-89. This is the first draft genome sequence of P. salmonis, and it reveals interesting attributes, including flagellar genes, despite this bacterium being considered nonmotile. PMID:24201203

Eppinger, Mark; McNair, Katelyn; Zogaj, Xhavit; Dinsdale, Elizabeth A.; Edwards, Robert A.

2013-01-01

217

Draft Genome Sequence of Raoultella planticola, Isolated from River Water  

PubMed Central

We isolated Raoultella planticola from a river water sample, which was phenotypically indistinguishable from Escherichia coli on MI agar. The genome sequence of R. planticola was determined to gain information about its metabolic functions contributing to its false positive appearance of E. coli on MI agar. We report the first whole genome sequence of Raoultella planticola. PMID:25323725

Kahler, Amy; Strockbine, Nancy; Gladney, Lori; Hill, Vincent R.

2014-01-01

218

Draft Genome Sequence of Xanthomonas sacchari Strain LMG 476  

PubMed Central

We report the high-quality draft genome sequence of Xanthomonas sacchari strain LMG 476, isolated from sugarcane. The genome comparison of this strain with a previously sequenced X. sacchari strain isolated from a distinct environmental source should provide further insights into the adaptation of this species to different habitats and its evolution. PMID:25792064

Pieretti, Isabelle; Bolot, Stéphanie; Carrère, Sébastien; Barbe, Valérie; Cociancich, Stéphane; Rott, Philippe

2015-01-01

219

Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages  

PubMed Central

Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

Sheflo, Michael A.; Gardner, Adam V.; Merrill, Bryan D.; Fisher, Joshua N. B.; Lunt, Bryce L.; Breakwell, Donald P.; Grose, Julianne H.

2013-01-01

220

Complete Genome Sequences of Five Paenibacillus larvae Bacteriophages.  

PubMed

Paenibacillus larvae is a pathogen of honeybees that causes American foulbrood (AFB). We isolated bacteriophages from soil containing bee debris collected near beehives in Utah. We announce five high-quality complete genome sequences, which represent the first completed genome sequences submitted to GenBank for any P. larvae bacteriophage. PMID:24233582

Sheflo, Michael A; Gardner, Adam V; Merrill, Bryan D; Fisher, Joshua N B; Lunt, Bryce L; Breakwell, Donald P; Grose, Julianne H; Burnett, Sandra H

2013-01-01

221

Complete genome sequence of Chinese strain of ‘Candidatus Liberibacter asiaticus’  

Technology Transfer Automated Retrieval System (TEKTRAN)

The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...

222

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii  

PubMed Central

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named “wSuzi” that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

223

Draft Genome Sequence of the Wolbachia Endosymbiont of Drosophila suzukii.  

PubMed

Wolbachia is one of the most successful and abundant symbiotic bacteria in nature, infecting more than 40% of the terrestrial arthropod species. Here we report the draft genome sequence of a novel Wolbachia strain named "wSuzi" that was retrieved from the genome sequencing of its host, the invasive pest Drosophila suzukii. PMID:23472225

Siozios, Stefanos; Cestaro, Alessandro; Kaur, Rupinder; Pertot, Ilaria; Rota-Stabelli, Omar; Anfora, Gianfranco

2013-01-01

224

Draft genome sequence of Kocuria rhizophila P7-4.  

PubMed

We report the draft genome sequence of Kocuria rhizophila P7-4, which was isolated from the intestine of Siganus doliatus caught in the Pacific Ocean. The 2.83-Mb genome sequence consists of 75 large contigs (>100 bp in size) and contains 2,462 predicted protein-coding genes. PMID:21685281

Kim, Woo-Jin; Kim, Young-Ok; Kim, Dae-Soo; Choi, Sang-Haeng; Kim, Dong-Wook; Lee, Jun-Seo; Kong, Hee Jeong; Nam, Bo-Hye; Kim, Bong-Seok; Lee, Sang-Jun; Park, Hong-Seog; Chae, Sung-Hwa

2011-08-01

225

The Prospects for Sequencing the Western Corn Rootworm Genome  

Technology Transfer Automated Retrieval System (TEKTRAN)

Historically, obtaining the complete sequence of eukaryotic genomes has been an expensive and complex task. For this reason, efforts to sequence insect genomes have largely been confined to model organisms, species that are important to human health, and representative species from a few insect orde...

226

Initial sequencing and analysis of the human genome  

Microsoft Academic Search

The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

Eric S. Lander; Lauren M. Linton; Bruce Birren; Chad Nusbaum; Michael C. Zody; Jennifer Baldwin; Keri Devon; Ken Dewar; Michael Doyle; William FitzHugh; Roel Funke; Diane Gage; Katrina Harris; Andrew Heaford; John Howland; Lisa Kann; Jessica Lehoczky; Rosie LeVine; Paul McEwan; Kevin McKernan; James Meldrim; Jill P. Mesirov; Cher Miranda; William Morris; Jerome Naylor; Christina Raymond; Mark Rosetti; Ralph Santos; Andrew Sheridan; Carrie Sougnez; Nicole Stange-Thomann; Nikola Stojanovic; Aravind Subramanian; Dudley Wyman; Jane Rogers; John Sulston; Rachael Ainscough; Stephan Beck; David Bentley; John Burton; Christopher Clee; Nigel Carter; Alan Coulson; Rebecca Deadman; Panos Deloukas; Andrew Dunham; Ian Dunham; Richard Durbin; Lisa French; Darren Grafham; Simon Gregory; Tim Hubbard; Sean Humphray; Adrienne Hunt; Matthew Jones; Christine Lloyd; Amanda McMurray; Lucy Matthews; Simon Mercer; Sarah Milne; James C. Mullikin; Andrew Mungall; Robert Plumb; Mark Ross; Ratna Shownkeen; Sarah Sims; Robert H. Waterston; Richard K. Wilson; LaDeana W. Hillier; John D. McPherson; Marco A. Marra; Elaine R. Mardis; Lucinda A. Fulton; Asif T. Chinwalla; Kymberlie H. Pepin; Warren R. Gish; Stephanie L. Chissoe; Michael C. Wendl; Kim D. Delehaunty; Tracie L. Miner; Andrew Delehaunty; Jason B. Kramer; Lisa L. Cook; Robert S. Fulton; Douglas L. Johnson; Patrick J. Minx; Sandra W. Clifton; Trevor Hawkins; Elbert Branscomb; Paul Predki; Paul Richardson; Sarah Wenning; Tom Slezak; Norman Doggett; Jan-Fang Cheng; Anne Olsen; Susan Lucas; Christopher Elkin; Edward Uberbacher; Marvin Frazier; Richard A. Gibbs; Donna M. Muzny; Steven E. Scherer; John B. Bouck; Erica J. Sodergren; Kim C. Worley; Catherine M. Rives; James H. Gorrell; Michael L. Metzker; Susan L. Naylor; Raju S. Kucherlapati; David L. Nelson; George M. Weinstock; Yoshiyuki Sakaki; Asao Fujiyama; Masahira Hattori; Tetsushi Yada; Atsushi Toyoda; Takehiko Itoh; Chiharu Kawagoe; Hidemi Watanabe; Yasushi Totoki; Todd Taylor; Jean Weissenbach; Roland Heilig; William Saurin; Francois Artiguenave; Philippe Brottier; Thomas Bruls; Eric Pelletier; Catherine Robert; Patrick Wincker; Douglas R. Smith; Lynn Doucette-Stamm; Marc Rubenfield; Keith Weinstock; Hong Mei Lee; JoAnn Dubois; André Rosenthal; Matthias Platzer; Gerald Nyakatura; Stefan Taudien; Andreas Rump; Huanming Yang; Jun Yu; Jian Wang; Guyang Huang; Jun Gu; Leroy Hood; Lee Rowen; Anup Madan; Shizen Qin; Ronald W. Davis; Nancy A. Federspiel; A. Pia Abola; Michael J. Proctor; Richard M. Myers; Jeremy Schmutz; Mark Dickson; Jane Grimwood; David R. Cox; Maynard V. Olson; Rajinder Kaul; Christopher Raymond; Nobuyoshi Shimizu; Kazuhiko Kawasaki; Shinsei Minoshima; Glen A. Evans; Maria Athanasiou; Roger Schultz; Bruce A. Roe; Feng Chen; Huaqin Pan; Juliane Ramser; Hans Lehrach; Richard Reinhardt; W. Richard McCombie; Melissa de la Bastide; Neilay Dedhia; Helmut Blöcker; Klaus Hornischer; Gabriele Nordsiek; Richa Agarwala; L. Aravind; Jeffrey A. Bailey; Serafim Batzoglou; Ewan Birney; Peer Bork; Daniel G. Brown; Christopher B. Burge; Lorenzo Cerutti; Hsiu-Chuan Chen; Deanna Church; Michele Clamp; Richard R. Copley; Tobias Doerks; Sean R. Eddy; Evan E. Eichler; Terrence S. Furey; James Galagan; James G. R. Gilbert; Cyrus Harmon; Yoshihide Hayashizaki; David Haussler; Henning Hermjakob; Karsten Hokamp; Wonhee Jang; L. Steven Johnson; Thomas A. Jones; Simon Kasif; Arek Kaspryzk; Scot Kennedy; W. James Kent; Paul Kitts; Eugene V. Koonin; Ian Korf; David Kulp; Doron Lancet; Todd M. Lowe; Aoife McLysaght; Tarjei Mikkelsen; John V. Moran; Nicola Mulder; Victor J. Pollara; Chris P. Ponting; Greg Schuler; Jörg Schultz; Guy Slater; Arian F. A. Smit; Elia Stupka; Joseph Szustakowki; Danielle Thierry-Mieg; Jean Thierry-Mieg; Lukas Wagner; John Wallis; Raymond Wheeler; Alan Williams; Yuri I. Wolf; Kenneth H. Wolfe; Shiaw-Pyng Yang; Ru-Fang Yeh; Francis Collins; Mark S. Guyer; Jane Peterson; Adam Felsenfeld; Kris A. Wetterstrand; Aristides Patrinos; Michael J. Morgan

2001-01-01

227

Draft Genome Sequence of Tolypothrix boutellei Strain VB521301  

PubMed Central

We report here the draft genome sequence of the filamentous nitrogen-fixing cyanobacterium Tolypothrix boutellei strain VB521301. The organism is lipid rich and hydrophobic and produces polyunsaturated fatty acids which can be harnessed for industrial purpose. The draft genome sequence assembled into 11,572,263 bp with 70 scaffolds and 7,777 protein coding genes. PMID:25700407

Chandrababunaidu, Mathu Malar; Singh, Deeksha; Sen, Diya; Bhan, Sushma; Das, Subhadeep; Gupta, Akash

2015-01-01

228

Genome sequence of Pasteurella multocida subsp. gallicida Anand1_poultry.  

PubMed

We report the finished and annotated genome sequence of Pasteurella multocida gallicida strain Anand1_poultry, which was isolated from the liver of a diseased adult female chicken. The strain causes a disease called "fowl cholera," which is a contagious disease in birds. We compared it with the published genome sequence of Pasteurella multocida Pm70. PMID:21914901

Ahir, V B; Roy, A; Jhala, M K; Bhanderi, B B; Mathakiya, R A; Bhatt, V D; Padiya, K B; Jakhesara, S J; Koringa, P G; Joshi, C G

2011-10-01

229

Genome Sequence of Pasteurella multocida subsp. gallicida Anand1_poultry  

PubMed Central

We report the finished and annotated genome sequence of Pasteurella multocida gallicida strain Anand1_poultry, which was isolated from the liver of a diseased adult female chicken. The strain causes a disease called “fowl cholera,” which is a contagious disease in birds. We compared it with the published genome sequence of Pasteurella multocida Pm70. PMID:21914901

Ahir, V. B.; Roy, A.; Jhala, M. K.; Bhanderi, B. B.; Mathakiya, R. A.; Bhatt, V. D.; Padiya, K. B.; Jakhesara, S. J.; Koringa, P. G.; Joshi, C. G.

2011-01-01

230

Unexpected cross-species contamination in genome sequencing projects  

PubMed Central

The raw data from a genome sequencing project sometimes contains DNA from contaminating organisms, which may be introduced during sample collection or sequence preparation. In some instances, these contaminants remain in the sequence even after assembly and deposition of the genome into public databases. As a result, searches of these databases may yield erroneous and confusing results. We used efficient microbiome analysis software to scan the draft assembly of domestic cow, Bos taurus, and identify 173 small contigs that appeared to derive from microbial contaminants. In the course of verifying these findings, we discovered that one genome, Neisseria gonorrhoeae TCDC-NG08107, although putatively a complete genome, contained multiple sequences that actually derived from the cow and sheep genomes. Our findings illustrate the need to carefully validate findings of anomalous DNA that rely on comparisons to either draft or finished genomes. PMID:25426337

Merchant, Samier; Wood, Derrick E.

2014-01-01
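The contamination record above describes scanning a draft assembly for contigs that appear to derive from other organisms, without naming the software in this excerpt. The general idea can be sketched with a toy k-mer overlap screen in Python. This is an assumption-laden illustration, not the tool the authors used: the function names, the k-mer size, and the 50% threshold are all hypothetical, and reverse-complement matches are ignored for brevity.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def flag_contaminants(contigs, contaminant_refs, k=8, min_fraction=0.5):
    """Flag contigs whose k-mers are mostly found in contaminant references.

    contigs and contaminant_refs map names to sequences. The value of k
    and min_fraction are illustrative, not published settings.
    """
    ref_kmers = set()
    for ref in contaminant_refs.values():
        ref_kmers |= kmers(ref, k)
    flagged = {}
    for name, seq in contigs.items():
        contig_kmers = kmers(seq, k)
        if not contig_kmers:
            continue  # contig shorter than k
        fraction = len(contig_kmers & ref_kmers) / len(contig_kmers)
        if fraction >= min_fraction:
            flagged[name] = fraction
    return flagged
```

As the abstract itself cautions, any contig flagged this way would still need independent validation, because the reference genomes used for comparison may themselves be contaminated.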

231

Implications of the Plastid Genome Sequence of Typha (Typhaceae, Poales) for Understanding Genome Evolution in Poaceae  

Microsoft Academic Search

Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the

Mary M. Guisinger; Timothy W. Chumley; Jennifer V. Kuehl; Jeffrey L. Boore; Robert K. Jansen

2010-01-01

232

De novo assembly of a bell pepper endornavirus genome sequence using RNA sequencing data.  

PubMed

The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

Jo, Yeonhwa; Choi, Hoseng; Cho, Won Kyong

2015-01-01

233

De Novo Assembly of a Bell Pepper Endornavirus Genome Sequence Using RNA Sequencing Data  

PubMed Central

The genus Endornavirus is a double-stranded RNA virus that infects a wide range of hosts. In this study, we report on the de novo assembly of a bell pepper endornavirus genome sequence by RNA sequencing (RNA-Seq). Our result demonstrates the successful application of RNA-Seq to obtain a complete viral genome sequence from the transcriptome data. PMID:25792042

Jo, Yeonhwa; Choi, Hoseng

2015-01-01

234

Genome sequence of the human malaria parasite Plasmodium falciparum  

Microsoft Academic Search

The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date.

Malcolm J. Gardner; Neil Hall; Eula Fung; Owen White; Matthew Berriman; Richard W. Hyman; Jane M. Carlton; Arnab Pain; Sharen Bowman; Ian T. Paulsen; Keith James; Kim Rutherford; Steven L. Salzberg; Alister Craig; Sue Kyes; Man-Suen Chan; Vishvanath Nene; Shamira J. Shallom; Bernard Suh; Jeremy Peterson; Sam Angiuoli; Mihaela Pertea; Jonathan Allen; Jeremy Selengut; Daniel Haft; Michael W. Mather; Akhil B. Vaidya; Alan H. Fairlamb; Martin J. Fraunholz; David S. Roos; Stuart A. Ralph; Geoffrey I. McFadden; Leda M. Cummings; G. Mani Subramanian; Chris Mungall; J. Craig Venter; Daniel J. Carucci; Stephen L. Hoffman; Chris Newbold; Ronald W. Davis; Claire M. Fraser; Bart Barrell

2002-01-01

235

The human genome sequence: impact on health care  

Microsoft Academic Search

The recent sequencing of the human genome, resulting from two independent global efforts, is poised to revolutionize all aspects of human health. This landmark achievement has also vindicated two different methodologies that can now be used to target other important large genomes. The human genome sequence has revealed several novel/surprising features, notably the probable presence of a mere 30-35,000 genes.

M. D. Bashyam; S. E. Hasnain

2003-01-01

236

Draft sequences of the radish (Raphanus sativus L.) genome.  

PubMed

Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

2014-10-01
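The radish record above gives an assembled size (402 Mb), the share of the estimated genome that it covers (75.9%), and the length of scaffolds anchored to the linkage map (116.0 Mb). Two figures the abstract leaves implicit can be derived from these numbers. The short Python sketch below does only that arithmetic; the function names are hypothetical and the derived values are back-calculations, not numbers reported by the authors.

```python
def implied_genome_size(assembled_mb, fraction_covered):
    """Estimated genome size implied by an assembly covering a given fraction."""
    return assembled_mb / fraction_covered

def anchored_fraction(anchored_mb, assembled_mb):
    """Share of the assembly that was placed on the linkage map."""
    return anchored_mb / assembled_mb

# Figures quoted in the abstract.
genome_mb = implied_genome_size(402.0, 0.759)   # about 529.6 Mb
anchored = anchored_fraction(116.0, 402.0)      # about 28.9% of the assembly
```

Under these figures, roughly 530 Mb is the estimated genome size, and a little under a third of the assembled sequence is anchored to the map.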

237

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change  

Microsoft Academic Search

In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how

Tina T. Hu; Pedro Pattyn; Erica G. Bakker; Jun Cao; Jan-Fang Cheng; Richard M. Clark; Noah Fahlgren; Jeffrey A. Fawcett; Jane Grimwood; Heidrun Gundlach; Georg Haberer; Jesse D. Hollister; Stephan Ossowski; Robert P. Ottilar; Asaf A. Salamov; Korbinian Schneeberger; Manuel Spannagl; Xi Wang; Liang Yang; Mikhail E. Nasrallah; Joy Bergelson; James C. Carrington; Brandon S. Gaut; Jeremy Schmutz; Klaus F. X. Mayer; Yves Van de Peer; Igor V. Grigoriev; Magnus Nordborg; Detlef Weigel; Ya-Long Guo

2011-01-01

238

Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi  

SciTech Connect

Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain 31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

Schutzer, S. E.; Dunn, J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

2011-02-01

239

Scrutinizing Virus Genome Termini by High-Throughput Sequencing  

PubMed Central

Analysis of genomic terminal sequences has been a major step in studies on viral DNA replication and packaging mechanisms. However, traditional methods to study genome termini are challenging due to the time-consuming protocols and their inefficiency where critical details are lost easily. Recent advances in next generation sequencing (NGS) have enabled it to be a powerful tool to study genome termini. In this study, using NGS we sequenced one iridovirus genome and twenty phage genomes and confirmed for the first time that the high frequency sequences (HFSs) found in the NGS reads are indeed the terminal sequences of viral genomes. Further, we established a criterion to distinguish the type of termini and the viral packaging mode. We also obtained additional terminal details such as terminal repeats, multi-termini, asymmetric termini. With this approach, we were able to simultaneously detect details of the genome termini as well as obtain the complete sequence of bacteriophage genomes. Theoretically, this application can be further extended to analyze larger and more complicated genomes of plant and animal viruses. This study proposed a novel and efficient method for research on viral replication, packaging, terminase activity, transcription regulation, and metabolism of the host cell. PMID:24465717

Fan, Huahao; Jiang, Huanhuan; Chen, Yubao; Tong, Yigang

2014-01-01
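The record above reports that high-frequency sequences (HFSs) in NGS reads mark the termini of viral genomes, but the excerpt does not state the criterion used. One simple way to look for such over-represented read starts is to tally read prefixes, as in this Python sketch. The function name, the prefix length, and the use of raw read prefixes rather than mapped positions are all assumptions made for illustration.

```python
from collections import Counter

def high_frequency_starts(reads, prefix_len=20, top=5):
    """Return the most common read prefixes as (prefix, count, share) tuples.

    A prefix whose share is far above the others is a candidate terminal
    sequence. The published method and its thresholds are not given in
    the record, so nothing here should be read as the authors' procedure.
    """
    counts = Counter(
        read[:prefix_len].upper() for read in reads if len(read) >= prefix_len
    )
    total = sum(counts.values())
    return [(prefix, n, n / total) for prefix, n in counts.most_common(top)]
```

In a randomly fragmented library, read starts should be spread fairly evenly along the genome, so a prefix that accounts for a large share of reads stands out against that background.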

240

Standards for sequencing viral genomes in the era of high-throughput sequencing.  

PubMed

Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five "standard" categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

Ladner, Jason T; Beitzel, Brett; Chain, Patrick S G; Davenport, Matthew G; Donaldson, Eric F; Frieman, Matthew; Kugelman, Jeffrey R; Kuhn, Jens H; O'Rear, Jules; Sabeti, Pardis C; Wentworth, David E; Wiley, Michael R; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher; Palacios, Gustavo

2014-01-01

241

Standards for Sequencing Viral Genomes in the Era of High-Throughput Sequencing  

PubMed Central

ABSTRACT Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques. PMID:24939889

Beitzel, Brett; Chain, Patrick S. G.; Davenport, Matthew G.; Donaldson, Eric; Frieman, Matthew; Kugelman, Jeffrey; Kuhn, Jens H.; O’Rear, Jules; Sabeti, Pardis C.; Wentworth, David E.; Wiley, Michael R.; Yu, Guo-Yun; Sozhamannan, Shanmuga; Bradburne, Christopher

2014-01-01

242

Emerging Knowledge from Genome Sequencing of Crop Species  

Microsoft Academic Search

Extensive insights into genome composition, organization, and evolution have been gained from ongoing plant genome sequencing and annotation projects. The analysis of crop genomes has provided surprising evidence with important implications for plant origin and evolution: genome duplication, ancestral rearrangements, and unexpected polyploidization events have opened new doors to address fundamental questions related to species proliferation, adaptation, and functional modulation.

Delfina Barabaschi; Davide Guerra; Katia Lacrima; Paolo Laino; Vania Michelotti; Simona Urso; Giampiero Valè; Luigi Cattivelli

243

Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes  

SciTech Connect

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

2005-08-26

244

Generation of Physical Map Contig-Specific Sequences Useful for Whole Genome Sequence Scaffolding  

PubMed Central

Along with the rapid advances of the nextgen sequencing technologies, more and more species are added to the list of organisms whose whole genomes are sequenced. However, the assembled draft genome of many organisms consists of numerous small contigs, due to the short length of the reads generated by nextgen sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. Two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, in which the cost was dramatically reduced. A total of 94,111,841 100-bp reads and 315,277 assembled contigs are identified containing physical map contig-specific tags. The physical map contig-specific sequences along with the currently available BAC end sequences were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources to link physical map, genetic linkage map and draft whole genome sequences, consequently have the capability to improve the whole genome sequences assembly and scaffolding, and improve the genome-wide comparative analysis as well. The strategy developed in this study could also be adopted in other species whose whole genome assembly is still facing a challenge. PMID:24205335

Jiang, Yanliang; Ninwichian, Parichart; Liu, Shikai; Zhang, Jiaren; Kucuktas, Huseyin; Sun, Fanyue; Kaltenboeck, Ludmilla; Sun, Luyang; Bao, Lisui; Liu, Zhanjiang

2013-01-01

245

Genome Sequence of Tumebacillus flagellatus GST4, the First Genome Sequence of a Species in the Genus Tumebacillus  

PubMed Central

We present here the first genome sequence of a species in the genus Tumebacillus. The draft genome sequence of Tumebacillus flagellatus GST4 provides a genetic basis for future studies addressing the origins, evolution, and ecological role of Tumebacillus organisms, as well as a source of acid-resistant amylase-encoding genes for further studies. PMID:25395648

Wang, Qing-Yan; Huang, Yan-Yan; Song, Li-Fu; Du, Qi-Shi; Yu, Bo; Chen, Dong

2014-01-01

246

The Brachypodium genome sequence: a resource for oat genomics research  

Technology Transfer Automated Retrieval System (TEKTRAN)

Oat (Avena sativa) is an important cereal crop used as both an animal feed and for human consumption. Genetic and genomic research on oat is hindered because it is hexaploid and possesses a large (13 Gb) genome. Diploid Avena relatives have been employed for genetic and genomic studies, but only mod...

247

Single-Molecule DNA Sequencing of a Viral Genome  

Microsoft Academic Search

The full promise of human genomics will be realized only when the genomes of thousands of individuals can be sequenced for comparative analysis. A reference sequence enables the use of short read length. We report an amplification-free method for determining the nucleotide sequence of more than 280,000 individual DNA molecules simultaneously. A DNA polymerase adds labeled nucleotides to surface-immobilized primer-template

Timothy D. Harris; Phillip R. Buzby; Hazen Babcock; Eric Beer; Jayson Bowers; Ido Braslavsky; Marie Causey; Jennifer Colonell; James DiMeo; J. William Efcavitch; Eldar Giladi; Jaime Gill; John Healy; Mirna Jarosz; Dan Lapen; Keith Moulton; Stephen R. Quake; Kathleen Steinmann; Edward Thayer; Anastasia Tyurina; Rebecca Ward; Howard Weiss; Zheng Xie

2008-01-01

248

Automated de novo identification of repeat sequence families in sequenced genomes.  

PubMed

Repetitive sequences make up a major part of eukaryotic genomes. We have developed an approach for the de novo identification and classification of repeat sequence families that is based on extensions to the usual approach of single linkage clustering of local pairwise alignments between genomic sequences. Our extensions use multiple alignment information to define the boundaries of individual copies of the repeats and to distinguish homologous but distinct repeat element families. When tested on the human genome, our approach was able to properly identify and group known transposable elements. The program should be useful for first-pass automatic classification of repeats in newly sequenced genomes. PMID:12176934

Bao, Zhirong; Eddy, Sean R

2002-08-01
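The baseline technique named in the record above, single linkage clustering of pairwise alignments, can be illustrated with a minimal sketch. This is not the authors' program and omits their multiple-alignment extensions; the function name, element labels, and input pairs are hypothetical and assume that alignment hits have already been reduced to pairs of element identifiers.

```python
from collections import defaultdict


def single_linkage_clusters(elements, aligned_pairs):
    """Group elements into families: two elements end up in the same
    family if a chain of pairwise alignments connects them."""
    # Union-find forest: every element starts as its own root.
    parent = {e: e for e in elements}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each alignment hit merges the two families it touches.
    for a, b in aligned_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    families = defaultdict(set)
    for e in elements:
        families[find(e)].add(e)
    # Sort for a deterministic result.
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])


# Hypothetical repeat copies and alignment hits, for illustration only.
elements = ["r1", "r2", "r3", "r4", "r5"]
pairs = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
families = single_linkage_clusters(elements, pairs)
```

Because r1 aligns to r2 and r2 to r3, all three fall into one family even though r1 and r3 were never aligned directly. That chaining is the defining property of single linkage, and it is also why the abstract describes extra information being needed to separate homologous but distinct families.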

249

Strain-specific and pooled genome sequences for populations of Drosophila melanogaster from three continents.  

PubMed Central

To contribute to our general understanding of the evolutionary forces that shape variation in genome sequences in nature, we have sequenced genomes from 50 isofemale lines and six pooled samples from populations of Drosophila melanogaster on three continents. Analysis of raw and reference-mapped reads indicates the quality of these genomic sequence data is very high. Comparison of the predicted and experimentally-determined Wolbachia infection status of these samples suggests that strain or sample swaps are unlikely to have occurred in the generation of these data. Genome sequences are freely available in the European Nucleotide Archive under accession ERP009059. Isofemale lines can be obtained from the Drosophila Species Stock Center. PMID:25717372

Bergman, Casey M.; Haddrill, Penelope R.

2015-01-01

250

Complete genome sequence of Anaerococcus prevotii type strain (PC1).  

PubMed

Anaerococcus prevotii (Foubert and Douglas 1948) Ezaki et al. 2001 is the type species of the genus, and is of phylogenetic interest because of its arguable assignment to the provisionally arranged family 'Peptostreptococcaceae'. A. prevotii is an obligate anaerobic coccus, usually arranged in clumps or tetrads. The strain, whose genome is described here, was originally isolated from human plasma; other strains of the species were also isolated from clinical specimens. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the genus. Next to Finegoldia magna, A. prevotii is only the second species from the family 'Peptostreptococcaceae' for which a complete genome sequence is described. The 1,998,633 bp long genome (chromosome and one plasmid) with its 1852 protein-coding and 61 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304652

Labutti, Kurt; Pukall, Rudiger; Steenblock, Katja; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Nolan, Matt; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Chain, Patrick; Saunders, Elizabeth; Brettin, Thomas; Detter, John C; Han, Cliff; Göker, Markus; Bristow, Jim; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

2009-01-01

251

Marsupial Genome Sequences: Providing Insight into Evolution and Disease  

PubMed Central

Marsupials (metatherians), with their position in vertebrate phylogeny and their unique biological features, have been studied for many years by a dedicated group of researchers, but it has only been since the sequencing of the first marsupial genome that their value has been more widely recognised. We now have genome sequences for three distantly related marsupial species (the grey short-tailed opossum, the tammar wallaby, and Tasmanian devil), with the promise of many more genomes to be sequenced in the near future, making this a particularly exciting time in marsupial genomics. The emergence of a transmissible cancer, which is obliterating the Tasmanian devil population, has increased the importance of obtaining and analysing marsupial genome sequence for understanding such diseases as well as for conservation efforts. In addition, these genome sequences have facilitated studies aimed at answering questions regarding gene and genome evolution and provided insight into the evolution of epigenetic mechanisms. Here I highlight the major advances in our understanding of evolution and disease, facilitated by marsupial genome projects, and speculate on the future contributions to be made by such sequences. PMID:24278712

Deakin, Janine E.

2012-01-01

252

THE RICE GENOME: The Cereal of the World's Poor Takes Center Stage  

NSDL National Science Digital Library

Access to the article is free; however, registration and sign-in are required. The milestone publication of not one, but two, draft genome sequences of rice (Oryza sativa) brought the cereal crop of the world's poor to center stage. In their Perspectives, Cantrell and Reeves discuss the potential impacts of these sequences for humankind from the standpoints of food security and combating malnutrition.

Ronald P. Cantrell (International Rice Research Institute (IRRI))

2002-04-05

253

Looking to future of genome mapping, sequencing  

SciTech Connect

The human genome mapping and sequencing project is perhaps the prime example of an international project in medicine today. The project director, Nobelist James D. Watson, PhD, noted at the bicentennial conference that it may be possible to bring the cost down to as low as 50 cents a base pair without any enormous technological breakthroughs in the 10-nation effort. Another speaker, George Poste, PhD, DVM, DSc, head of research and development, Smith Kline & French Laboratories, Philadelphia, PA, predicted that completion of the genetic dictionary will lead to compilation of a protein dictionary for each cell type for use against disease. Anti-trust legislation, he said, is overtly ignored all the time in the defense industry because it is deemed to be in the national interest. However, Poste went on, the legislative bodies of the world do not yet understand the implications of the directions in which we are going in terms of Big Biology and the requirements for companies to be able to work together.

Kangilaski, J.

1989-07-21

254

Complete genome sequence of Thermomonospora curvata type strain (B9)  

SciTech Connect

Thermomonospora curvata Henssen 1957 is the type species of the genus Thermomonospora. This genus is of interest because members of this clade are sources of new antibiotics, enzymes, and products with pharmacological activity. In addition, members of this genus participate in the active degradation of cellulose. This is the first complete genome sequence of a member of the family Thermomonosporaceae. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 5,639,016 bp long genome with its 4,985 protein-coding and 76 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Chertkov, Olga [Los Alamos National Laboratory (LANL)]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Nolan, Matt [Joint Genome Institute, Walnut Creek, California]; Lapidus, Alla L. [Joint Genome Institute, Walnut Creek, California]; Lucas, Susan [Joint Genome Institute, Walnut Creek, California]; Glavina Del Rio, Tijana [Joint Genome Institute, Walnut Creek, California]; Tice, Hope [Joint Genome Institute, Walnut Creek, California]; Cheng, Jan-Fang [Joint Genome Institute, Walnut Creek, California]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [Joint Genome Institute, Walnut Creek, California]; Liolios, Konstantinos [Joint Genome Institute, Walnut Creek, California]; Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [Joint Genome Institute, Walnut Creek, California]; Palaniappan, Krishna [Joint Genome Institute, Walnut Creek, California]; Ngatchou, Olivier Duplex [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Brettin, Thomas S [ORNL]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J. Chris [Joint Genome Institute, Walnut Creek, California]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, California]; Bristow, James [Joint Genome Institute, Walnut Creek, California]; Eisen, Jonathan [Joint Genome Institute, Walnut Creek, California]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, California]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [Joint Genome Institute, Walnut Creek, California]

2011-01-01

255

Complete genome sequence of Gordonia bronchialis type strain (3410T)  

SciTech Connect

Gordonia bronchialis Tsukamura 1971 is the type species of the genus. G. bronchialis is a human-pathogenic organism that has been isolated from a large variety of human tissues. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Gordoniaceae. The 5,290,012 bp long genome with its 4,944 protein-coding and 55 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Ivanova, N [U.S. Department of Energy, Joint Genome Institute]; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Jando, Marlen [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute]; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute]; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute]; Tice, Hope [U.S. Department of Energy, Joint Genome Institute]; Copeland, A [U.S. Department of Energy, Joint Genome Institute]; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute]; Chen, Feng [U.S. Department of Energy, Joint Genome Institute]; Bruce, David [Los Alamos National Laboratory (LANL)]; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL)]; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute]; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute]; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute]; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute]; Chen, Amy [U.S. Department of Energy, Joint Genome Institute]; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute]; Land, Miriam L [ORNL]; Hauser, Loren John [ORNL]; Chang, Yun-Juan [ORNL]; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL)]; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)]; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL)]; Han, Cliff [Los Alamos National Laboratory (LANL)]; Detter, J C [U.S. Department of Energy, Joint Genome Institute]; Brettin, Thomas S [ORNL]; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany]; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Bristow, James [U.S. Department of Energy, Joint Genome Institute]; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute]; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute]; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute]; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute]

2010-01-01

256

Complete genome sequence of Acidimicrobium ferrooxidans type strain (ICPT)  

SciTech Connect

Acidimicrobium ferrooxidans (Clark and Norris 1996) is the sole and type species of the genus, which until recently was the only genus within the actinobacterial family Acidimicrobiaceae and in the order Acidimicrobiales. Rapid oxidation of iron pyrite during autotrophic growth in the absence of an enhanced CO2 concentration is characteristic for A. ferrooxidans. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first complete genome sequence of the order Acidimicrobiales, and the 2,158,157 bp long single replicon genome with its 2038 protein coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Clum, Alicia; Nolan, Matt; Lang, Elke; Glavina Del Rio, Tijana; Tice, Hope; Copeland, Alex; Cheng, Jan-Fang; Lucas, Susan; Chen, Feng; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Ivanova, Natalia; Mavrommatis, Konstantinos; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Goker, Markus; Spring, Stefan; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jefferies, Cynthia C.; Chain, Patrick; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Lapidus, Alla

2009-05-20

257

STATUS OF THE RB51 GENOME SEQUENCING PROJECT  

Technology Transfer Automated Retrieval System (TEKTRAN)

The shotgun sequencing of the genome of the B. abortus vaccine strain RB51 is nearly complete. Thus far, approximately 49,000 recombinant clones have been sequenced, generating approximately 34,300,000 bp of raw DNA sequence data. The resulting data have been compiled and aligned using the B. abortus st...

258

Recurrence time statistics: Versatile tools for genomic DNA sequence analysis  

E-print Network

Recurrence time statistics: Versatile tools for genomic DNA sequence analysis Yinhe Cao1, Wen from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant

Gao, Jianbo

259

INVESTIGATION Whole-Genome Sequencing of Sordaria macrospora  

E-print Network

prior mapping information. KEYWORDS next-generation sequencing developmental mutants Sordaria macrospora. In recent years, so-called next-generation sequencing techniques were developed that allow a massively INVESTIGATION Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes

Kück, Ulrich

260

The genus Burkholderia: analysis of 56 genomic sequences.  

PubMed

The genus Burkholderia consists of a number of very diverse species, both in terms of lifestyle (which varies from category B pathogens to apathogenic soil bacteria and plant colonizers) and their genetic contents. We have used 56 publicly available genomes to explore the genomic diversity within this genus, including genome sequences that are not completely finished, but are available from the NCBI database. Defining the pan- and core genomes of species results in insights in the conserved and variable fraction of genomes, and can verify (or question) historic, taxonomic groupings. We find only several hundred genes that are conserved across all Burkholderia genomes, whilst there are more than 40,000 gene families in the Burkholderia pan-genome. A BLAST matrix visualizes the fraction of conserved genes in pairwise comparisons. A BLAST atlas shows which genes are actually conserved in a number of genomes, located and visualized with reference to a chosen genome. Genomic islands are common in many Burkholderia genomes, and most of these can be readily visualized by DNA structural properties of the chromosome. Trees that are based on relatedness of gene family content yield different results depending on what genes are analyzed. Some of the differences can be explained by errors in incomplete genome sequences, but, as our data illustrate, the outcome of phylogenetic trees depends on the type of genes that are analyzed. PMID:19696499

Ussery, D W; Kiil, K; Lagesen, K; Sicheritz-Pontén, T; Bohlin, J; Wassenaar, T M

2009-01-01
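The pan- and core-genome definitions used in the record above reduce to set operations: the pan-genome is the union of gene families across genomes, and the core genome is their intersection. The sketch below shows only that arithmetic; the strain names and family identifiers are hypothetical, and it assumes genes have already been assigned to families (for example by BLAST-based clustering), which is the hard part in practice.

```python
def pan_and_core(genomes):
    """Return (pan, core) gene-family sets for a mapping of
    genome name -> iterable of gene-family identifiers."""
    family_sets = [set(families) for families in genomes.values()]
    # Pan-genome: every family seen in at least one genome.
    pan = set().union(*family_sets)
    # Core genome: families present in every genome.
    core = set.intersection(*family_sets) if family_sets else set()
    return pan, core


# Hypothetical gene-family content of three strains.
genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
pan, core = pan_and_core(genomes)
```

In this toy input the pan-genome has five families and the core has one. The same pattern, at a much larger scale, is what produces a small conserved core alongside a very large pan-genome when many diverse genomes are compared.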

261

Complete genome sequence of Thioalkalivibrio sp. K90mix  

PubMed Central

Thioalkalivibrio sp. K90mix is an obligately chemolithoautotrophic, natronophilic sulfur-oxidizing bacterium (SOxB) belonging to the family Ectothiorhodospiraceae within the Gammaproteobacteria. The strain was isolated from a mixture of sediment samples obtained from different soda lakes located in the Kulunda Steppe (Altai, Russia) based on its extreme potassium carbonate tolerance as an enrichment method. Here we report the complete genome sequence of strain K90mix and its annotation. The genome was sequenced within the Joint Genome Institute Community Sequencing Program, because of its relevance to the sustainable removal of sulfide from wastewater and gas streams. PMID:22675584

Muyzer, Gerard; Sorokin, Dimitry Y.; Mavromatis, Konstantinos; Lapidus, Alla; Foster, Brian; Sun, Hui; Ivanova, Natalia; Pati, Amrita; D'haeseleer, Patrik; Woyke, Tanja; Kyrpides, Nikos C.

2011-01-01

262

Genomic Treasure Troves: Complete Genome Sequencing of Herbarium and Insect Museum Specimens  

PubMed Central

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22–82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4–97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2–71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal. 
Furthermore, NGS of historical DNA enables recovering crucial genetic information from old type specimens that to date have remained mostly unutilized and, thus, opens up a new frontier for taxonomic research as well. PMID:23922691

Staats, Martijn; Erkens, Roy H. J.; van de Vossenberg, Bart; Wieringa, Jan J.; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E.; Bakker, Freek T.

2013-01-01

263

Center for Cell and Genome Sciences, Crocker Science Building  

E-print Network

chemistry Center for Cell and Genome Sciences genetic engineering building artificial life brain engineering photodiodes the Cell engineering the genome, imaging proteins Invitrogen at the intersection of chemistry

Tipple, Brett

264

Application of massive parallel sequencing to whole genome SNP discovery in the porcine genome  

Microsoft Academic Search

BACKGROUND: Although the Illumina 1 G Genome Analyzer generates billions of base pairs of sequence data, challenges arise in sequence selection due to the varying sequence quality. Therefore, in the framework of the International Porcine SNP Chip Consortium, this pilot study aimed to evaluate the impact of the quality level of the sequenced bases on mapping quality and identification of

Andreia J Amaral; Hendrik-Jan Megens; Hindrik HD Kerstens; Henri CM Heuven; Bert Dibbits; Richard PMA Crooijmans; Johan T den Dunnen; Martien AM Groenen

2009-01-01

265

A physical map of the papaya genome with integrated genetic map and genome sequence  

Microsoft Academic Search

BACKGROUND: Papaya is a major fruit crop in tropical and subtropical regions worldwide and has primitive sex chromosomes controlling sex determination in this trioecious species. The papaya genome was recently sequenced because of its agricultural importance, unique biological features, and successful application of transgenic papaya for resistance to papaya ringspot virus. As a part of the genome sequencing project, we

Qingyi Yu; Eric Tong; Rachel L Skelton; John E Bowers; Meghan R Jones; Jan E Murray; Shaobin Hou; Peizhu Guan; Ricelle A Acob; Ming-Cheng Luo; Paul H Moore; Maqsudul Alam; Andrew H Paterson; Ray Ming

2009-01-01

266

New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features  

Microsoft Academic Search

BACKGROUND: Human rhinoviruses (HRV), the most frequent cause of respiratory infections, include 99 different serotypes segregating into two species, A and B. Rhinoviruses share extensive genomic sequence similarity with enteroviruses and both are part of the picornavirus family. Nevertheless they differ significantly at the phenotypic level. The lack of HRV full-length genome sequences and the absence of analysis comparing picornaviruses

Caroline Tapparel; Thomas Junier; Daniel Gerlach; Samuel Cordey; Sandra Van Belle; Luc Perrin; Evgeny M Zdobnov; Laurent Kaiser

2007-01-01

267

De Novo Whole-Genome Sequence and Genome Annotation of Lichtheimia ramosa  

PubMed Central

We report the annotated draft genome sequence of Lichtheimia ramosa (JMRC FSU:6197). It has been reported to be a causative organism of mucormycosis, a rare but rapidly progressive infection in immunocompromised humans. The functionally annotated genomic sequence consists of 74 scaffolds with a total number of 11,510 genes. PMID:25212617

Linde, Jörg; Schwartze, Volker; Binder, Ulrike; Lass-Flörl, Cornelia

2014-01-01

268

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change  

SciTech Connect

In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

2011-04-29

269

Sequencing, Assembling, and Correcting Draft Genomes Using Recombinant Populations  

PubMed Central

Current de novo whole-genome sequencing approaches often are inadequate for organisms lacking substantial preexisting genetic data. Problems with these methods are manifest as: large numbers of scaffolds that are not ordered within chromosomes or assigned to individual chromosomes, misassembly of allelic sequences as separate loci when the individual(s) being sequenced are heterozygous, and the collapse of recently duplicated sequences into a single locus, regardless of levels of heterozygosity. Here we propose a new approach for producing de novo whole-genome sequences—which we call recombinant population genome construction—that solves many of the problems encountered in standard genome assembly and that can be applied in model and nonmodel organisms. Our approach takes advantage of next-generation sequencing technologies to simultaneously barcode and sequence a large number of individuals from a recombinant population. The sequences of all recombinants can be combined to create an initial de novo assembly, followed by the use of individual recombinant genotypes to correct assembly splitting/collapsing and to order and orient scaffolds within linkage groups. Recombinant population genome construction can rapidly accelerate the transformation of nonmodel species into genome-enabled systems by simultaneously producing a high-quality genome assembly and providing genomic tools (e.g., high-confidence single-nucleotide polymorphisms) for immediate applications. In populations segregating for important functional traits, this approach also enables simultaneous mapping of quantitative trait loci. We demonstrate our method using simulated Illumina data from a recombinant population of Caenorhabditis elegans and show that the method can produce a high-fidelity, high-quality genome assembly for both parents of the cross. PMID:24531727

Hahn, Matthew W.; Zhang, Simo V.; Moyle, Leonie C.

2014-01-01

270

Accurate whole human genome sequencing using reversible terminator chemistry.  

PubMed

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications. PMID:18987734

Bentley, David R; Balasubramanian, Shankar; Swerdlow, Harold P; Smith, Geoffrey P; Milton, John; Brown, Clive G; Hall, Kevin P; Evers, Dirk J; Barnes, Colin L; Bignell, Helen R; Boutell, Jonathan M; Bryant, Jason; Carter, Richard J; Keira Cheetham, R; Cox, Anthony J; Ellis, Darren J; Flatbush, Michael R; Gormley, Niall A; Humphray, Sean J; Irving, Leslie J; Karbelashvili, Mirian S; Kirk, Scott M; Li, Heng; Liu, Xiaohai; Maisinger, Klaus S; Murray, Lisa J; Obradovic, Bojan; Ost, Tobias; Parkinson, Michael L; Pratt, Mark R; Rasolonjatovo, Isabelle M J; Reed, Mark T; Rigatti, Roberto; Rodighiero, Chiara; Ross, Mark T; Sabot, Andrea; Sankar, Subramanian V; Scally, Aylwyn; Schroth, Gary P; Smith, Mark E; Smith, Vincent P; Spiridou, Anastassia; Torrance, Peta E; Tzonev, Svilen S; Vermaas, Eric H; Walter, Klaudia; Wu, Xiaolin; Zhang, Lu; Alam, Mohammed D; Anastasi, Carole; Aniebo, Ify C; Bailey, David M D; Bancarz, Iain R; Banerjee, Saibal; Barbour, Selena G; Baybayan, Primo A; Benoit, Vincent A; Benson, Kevin F; Bevis, Claire; Black, Phillip J; Boodhun, Asha; Brennan, Joe S; Bridgham, John A; Brown, Rob C; Brown, Andrew A; Buermann, Dale H; Bundu, Abass A; Burrows, James C; Carter, Nigel P; Castillo, Nestor; Chiara E Catenazzi, Maria; Chang, Simon; Neil Cooley, R; Crake, Natasha R; Dada, Olubunmi O; Diakoumakos, Konstantinos D; Dominguez-Fernandez, Belen; Earnshaw, David J; Egbujor, Ugonna C; Elmore, David W; Etchin, Sergey S; Ewan, Mark R; Fedurco, Milan; Fraser, Louise J; Fuentes Fajardo, Karin V; Scott Furey, W; George, David; Gietzen, Kimberley J; Goddard, Colin P; Golda, George S; Granieri, Philip A; Green, David E; Gustafson, David L; Hansen, Nancy F; Harnish, Kevin; Haudenschild, Christian D; Heyer, Narinder I; Hims, Matthew M; Ho, Johnny T; Horgan, Adrian M; Hoschler, Katya; Hurwitz, Steve; Ivanov, Denis V; Johnson, Maria Q; James, Terena; Huw Jones, T A; Kang, Gyoung-Dong; Kerelska, Tzvetana H; Kersey, Alan D; Khrebtukova, Irina; Kindwall, Alex P; Kingsbury, Zoya; Kokko-Gonzales, Paula I; Kumar, Anil; Laurent, Marc A; Lawley, Cynthia T; Lee, Sarah E; Lee, Xavier; Liao, Arnold K; Loch, Jennifer A; Lok, Mitch; Luo, Shujun; Mammen, Radhika M; Martin, John W; McCauley, Patrick G; McNitt, Paul; Mehta, Parul; Moon, Keith W; Mullens, Joe W; Newington, Taksina; Ning, Zemin; Ling Ng, Bee; Novo, Sonia M; O'Neill, Michael J; Osborne, Mark A; Osnowski, Andrew; Ostadan, Omead; Paraschos, Lambros L; Pickering, Lea; Pike, Andrew C; Pike, Alger C; Chris Pinkard, D; Pliskin, Daniel P; Podhasky, Joe; Quijano, Victor J; Raczy, Come; Rae, Vicki H; Rawlings, Stephen R; Chiva Rodriguez, Ana; Roe, Phyllida M; Rogers, John; Rogert Bacigalupo, Maria C; Romanov, Nikolai; Romieu, Anthony; Roth, Rithy K; Rourke, Natalie J; Ruediger, Silke T; Rusman, Eli; Sanches-Kuiper, Raquel M; Schenker, Martin R; Seoane, Josefina M; Shaw, Richard J; Shiver, Mitch K; Short, Steven W; Sizto, Ning L; Sluis, Johannes P; Smith, Melanie A; Ernest Sohna Sohna, Jean; Spence, Eric J; Stevens, Kim; Sutton, Neil; Szajkowski, Lukasz; Tregidgo, Carolyn L; Turcatti, Gerardo; Vandevondele, Stephanie; Verhovsky, Yuli; Virk, Selene M; Wakelin, Suzanne; Walcott, Gregory C; Wang, Jingwen; Worsley, Graham J; Yan, Juying; Yau, Ling; Zuerlein, Mike; Rogers, Jane; Mullikin, James C; Hurles, Matthew E; McCooke, Nick J; West, John S; Oaks, Frank L; Lundberg, Peter L; Klenerman, David; Durbin, Richard; Smith, Anthony J

2008-11-01
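The ">30x average depth" figure in the record above follows the standard relation: average depth equals the number of reads times the read length, divided by the genome size. The sketch below works that arithmetic through. The 35-base read length and the 30x target come from the abstract, while the 3 Gb genome size is an assumed round number for illustration and the function names are hypothetical.

```python
import math


def average_depth(n_reads, read_length, genome_size):
    """Average sequencing depth: total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_depth, read_length, genome_size):
    """Smallest whole number of reads that reaches the target depth."""
    return math.ceil(target_depth * genome_size / read_length)


GENOME_SIZE = 3_000_000_000  # assumed ~3 Gb human genome (illustrative)
READ_LENGTH = 35             # bases per read, as stated in the abstract

n = reads_needed(30, READ_LENGTH, GENOME_SIZE)
depth = average_depth(n, READ_LENGTH, GENOME_SIZE)
```

Under these assumptions, roughly 2.6 billion 35-base reads are required for 30x depth. The calculation ignores unmapped and duplicate reads, so real experiments need more raw reads than this lower bound suggests.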

271

Tandem repeats in complete bacterial genome sequences: sequence and structural analyses for comparative studies  

Microsoft Academic Search

A series of complete bacterial genome sequences have recently become available and powerful methods have been developed for the identification of tandem repeats on a very large scale. It is thus possible to derive extensive comparative descriptions of such repeats at the level of complete genomes, as illustrated here for three different bacterial genomes: Escherichia coli, Haemophilus influenzae, and Mycobacterium

Edouard Yeramian; Henri Buc

1999-01-01

272

Multiplex Sequencing of Seven Ocular Herpes Simplex Virus Type-1 Genomes: Phylogeny, Sequence Variability,  

E-print Network

PURPOSE. Little is known about the role of sequence variation in the pathology of HSV-1 ... is feasible for simultaneously sequencing seven HSV-1 ocular strains. METHODS. A genome sequencer was used to sequence the HSV-1 ocular isolates TFT401, 134, CJ311, CJ360, CJ394, CJ970, and OD4, in a single lane

Craven, Mark

273

Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae  

Microsoft Academic Search

The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the other completely sequenced genomes identified genes specific to the streptococci and to S. agalactiae. These in silico analyses, combined

Hervé Tettelin; Vega Masignani; Michael J. Cieslewicz; Jonathan A. Eisen; Scott Peterson; Michael R. Wessels; Ian T. Paulsen; Karen E. Nelson; Immaculada Margarit; Timothy D. Read; Lawrence C. Madoff; Alex M. Wolf; Maureen J. Beanan; Lauren M. Brinkac; Sean C. Daugherty; Robert T. Deboy; A. Scott Durkin; James F. Kolonay; Ramana Madupu; Matthew R. Lewis; Diana Radune; Nadezhda B. Fedorova; David Scanlan; Hoda Khouri; Stephanie Mulligan; Heather A. Carty; Robin T. Cline; Susan E. van Aken; John Gill; Maria Scarselli; Marirosa Mora; Emilia T. Iacobini; Cecilia Brettoni; Giuliano Galli; Massimo Mariani; Filippo Vegni; Domenico Maione; Daniela Rinaudo; Rino Rappuoli; John L. Telford; Dennis L. Kasper; Guido Grandi; Claire M. Fraser

2002-01-01

274

BAC-pool 454-sequencing: A rapid and efficient approach to sequence complex tetraploid cotton genomes  

Technology Transfer Automated Retrieval System (TEKTRAN)

New and emerging next-generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. The large and highly repetitive genome of G. hirsutum (~2.5 Gb) is less amenable and cost-intensive with traditional BAC-by...

275

Complete genome sequence of Cellulomonas flavigena type strain (134T)  

PubMed Central

Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21304688

Abt, Birte; Foster, Brian; Lapidus, Alla; Clum, Alicia; Sun, Hui; Pukall, Rüdiger; Lucas, Susan; Glavina Del Rio, Tijana; Nolan, Matt; Tice, Hope; Cheng, Jan-Fang; Pitluck, Sam; Liolios, Konstantinos; Ivanova, Natalia; Mavromatis, Konstantinos; Ovchinnikova, Galina; Pati, Amrita; Goodwin, Lynne; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D.; Rohde, Manfred; Göker, Markus; Woyke, Tanja; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter

2010-01-01

276

Genome sequencing and analysis of the model grass Brachypodium distachyon  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae, Panicoideae and Pooideae, provide the bulk of human nutrition and are poised to become major sources of renewable energy. Here we describe the genome sequence of the wild grass Brachypodium distachyon (Brachypodium), which is, to our knowledge, the first member of the Pooideae subfamily to be sequenced. Comparison of the Brachypodium, rice and sorghum genomes shows a precise history of genome evolution across a broad diversity of the grasses, and establishes a template for analysis of the large genomes of economically important pooid grasses such as wheat. The high-quality genome sequence, coupled with ease of cultivation and transformation, small size and rapid life cycle, will help Brachypodium reach its potential as an important model system for developing new energy and food crops.

Yang, Xiaohan [ORNL]; Kalluri, Udaya C [ORNL]; Tuskan, Gerald A [ORNL]

2010-01-01

277

Complete genome sequence of Cellulomonas flavigena type strain (134T)  

SciTech Connect

Cellulomonas flavigena (Kellerman and McBeth 1912) Bergey et al. 1923 is the type species of the genus Cellulomonas of the actinobacterial family Cellulomonadaceae. Members of the genus Cellulomonas are of special interest for their ability to degrade cellulose and hemicellulose, particularly with regard to the use of biomass as an alternative energy source. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a member of the genus Cellulomonas, and next to the human pathogen Tropheryma whipplei the second complete genome sequence within the actinobacterial family Cellulomonadaceae. The 4,123,179 bp long single replicon genome with its 3,735 protein-coding and 53 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Foster, Brian [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Clum, Alicia [U.S. Department of Energy, Joint Genome Institute; Sun, Hui [U.S. Department of Energy, Joint Genome Institute; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. 
Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany]

2010-01-01

278

The Release 6 reference sequence of the Drosophila melanogaster genome.  

PubMed

Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

2015-03-01

279

The Release 6 reference sequence of the Drosophila melanogaster genome  

PubMed Central

Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. PMID:25589440

Carlson, Joseph W.; Wan, Kenneth H.; Park, Soo; Mendez, Ivonne; Galle, Samuel E.; Booth, Benjamin W.; Pfeiffer, Barret D.; George, Reed A.; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V.; Andreyeva, Evgeniya N.; Boldyreva, Lidiya V.; Marra, Marco; Carvalho, A. Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F.; Rubin, Gerald M.; Karpen, Gary H.

2015-01-01

280

Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.  

PubMed

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

2014-01-01

281

Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species  

PubMed Central

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ~200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with the N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

2014-01-01

282

Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano.  

PubMed

Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

Alenezi, Faizah N; Weitz, Hedda J; Belbahri, Lassaad; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

2015-01-01

283

Genome sequence of the fish pathogen Flavobacterium columnare ATCC 49512  

Technology Transfer Automated Retrieval System (TEKTRAN)

Flavobacterium columnare is a Gram-negative, rod shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. ...

284

Genome Sequence of Mycoplasma hyorhinis Strain DBS 1050  

PubMed Central

Mycoplasma hyorhinis is known as one of the most prevalent contaminants of mammalian cell and tissue cultures worldwide. Here, we present the complete genome sequence of the fastidious M. hyorhinis strain DBS 1050. PMID:24604646

Soika, Valerii; Volokhov, Dmitriy; Simonyan, Vahan; Chizhikov, Vladimir

2014-01-01

285

Genome Sequence of the Fish Pathogen Flavobacterium columnare ATCC 49512  

PubMed Central

Flavobacterium columnare is a Gram-negative, rod-shaped, motile, and highly prevalent fish pathogen causing columnaris disease in freshwater fish worldwide. Here, we present the complete genome sequence of F. columnare strain ATCC 49512. PMID:22535941

Tekedar, Hasan C.; Karsi, Attila; Gillaspy, Allison F.; Dyer, David W.; Benton, Nicole R.; Zaitshik, Jeremy; Vamenta, Stefanie; Banes, Michelle M.; Gülsoy, Nagihan; Aboko-Cole, Mary; Waldbieser, Geoffrey C.

2012-01-01

286

Science Originals: Sequencing Cancer Genomes: Targeted Cancer Therapies  

NSDL National Science Digital Library

Applying DNA sequencing to cancer genomes is providing insights that have allowed researchers to turn some cancers into chronic diseases rather than deadly ones. Still, the ultimate goal is to kill the cancer.

Robert Frederick (AAAS)

2011-03-25

287

Draft Genome Sequence of Aneurinibacillus migulanus Strain Nagano  

PubMed Central

Aneurinibacillus migulanus is characterized by inhibition of growth of a range of plant-pathogenic bacteria and fungi. Here, we report the high-quality draft genome sequences of A. migulanus Nagano. PMID:25838487

Alenezi, Faizah N.; Weitz, Hedda J.; Ben Rebah, Hassen; Luptakova, Lenka; Jaspars, Marcel; Woodward, Stephen

2015-01-01

288

Genome Sequence of Microcystis aeruginosa Strain NIES-44.  

PubMed

Microcystis aeruginosa is a typical algal bloom-forming cyanobacterium. This report describes the whole-genome sequence of a non-microcystin-producing strain of Microcystis aeruginosa, NIES-44, which was isolated from a Japanese lake. PMID:25792056

Okano, Kunihiro; Miyata, Naoyuki; Ozaki, Yasuo

2015-01-01

289

Complete Genome Sequence of Rahnella aquatilis CIP 78.65  

SciTech Connect

Rahnella aquatilis CIP 78.65 is a gammaproteobacterium isolated from a drinking water source in Lille, France. Here we report the complete genome sequence of Rahnella aquatilis CIP 78.65, the type strain of R. aquatilis.

Martinez, Robert J [University of Alabama, Tuscaloosa; Bruce, David [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Held, Brittany [Los Alamos National Laboratory (LANL); Land, Miriam L [ORNL; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Pennacchio, Len [U.S. Department of Energy, Joint Genome Institute; Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Sobeckya, Patricia A. [University of Alabama, Tuscaloosa]

2012-01-01

290

The genome sequence of the filamentous fungus Neurospora crassa   

E-print Network

Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase ...

Read, Nick D; et al

2003-04-24

291

The genome sequence of the filamentous fungus Neurospora crassa  

E-print Network

The genome sequence of the filamentous fungus Neurospora crassa. James E. Galagan, Sarah E. Calvo ... is a multicellular filamentous fungus, it has also provided a system to study cellular differentiation

Kellis, Manolis

292

Fulfilling the Promise of a Sequenced Human Genome – Part II  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 2 of 2

Green, Eric [National Human Genome Research Institute]

2009-05-27

293

Fulfilling the Promise of a Sequenced Human Genome – Part I  

SciTech Connect

Eric Green, scientific director of the National Human Genome Research Institute (NHGRI), gives the opening keynote speech at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM on May 27, 2009. Part 1 of 2

Green, Eric [National Human Genome Research Institute]

2009-05-27

294

Initial genome sequencing and analysis of multiple myeloma  

E-print Network

Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. ...

Lander, Eric S.

295

Genome sequence of vanilla distortion mosaic virus infecting Coriandrum sativum.  

PubMed

The 9573-nucleotide genome of a potyvirus was sequenced from a Coriandrum sativum plant from India with viral symptoms. On analysis, this virus was shown to have greater than 85 % nucleotide sequence identity to vanilla distortion mosaic virus (VDMV). Analysis of the putative coat protein sequence confirmed that this virus was in fact VDMV, with greater than 91 % amino acid sequence identity. The genome appears to encode a 3083-amino-acid polyprotein potentially cleaved into the 10 mature proteins expected in potyviruses. Phylogenetic analysis confirmed that VDMV is a distinct but ungrouped member of the genus Potyvirus. PMID:25252813

Adams, I P; Rai, S; Deka, M; Harju, V; Hodges, T; Hayward, G; Skelton, A; Fox, A; Boonham, N

2014-12-01

296

A non-radioactive multiprime sequencing method for HIV genomes  

Microsoft Academic Search

A manual non-radioactive DNA sequencing protocol was developed for rapid analysis of variable HIV-1 genomes. Sets of up to ten primers were used in one sequencing reaction. After polyacrylamide gel electrophoresis and blotting onto nylon membranes the individual sequences were detected by hybridization with digoxigenin-labelled oligonucleotides and chemiluminescence. The method is applicable to any sequencing project where numerous variants of

Jutta Huber; Wolfgang Hell; Hans Wolf

1995-01-01

297

Intra-species sequence comparisons for annotating genomes  

SciTech Connect

Analysis of sequence variation among members of a single species offers a potential approach to identify functional DNA elements responsible for biological features unique to that species. Due to its high rate of allelic polymorphism and ease of genetic manipulability, we chose the sea squirt, Ciona intestinalis, to explore intra-species sequence comparisons for genome annotation. A large number of C. intestinalis specimens were collected from four continents and a set of genomic intervals amplified, resequenced and analyzed to determine the mutation rates at each nucleotide in the sequence. We found that regions with low mutation rates efficiently demarcated functionally constrained sequences: these include a set of noncoding elements, which we showed in C. intestinalis transgenic assays to act as tissue-specific enhancers, as well as the location of coding sequences. This illustrates that comparisons of multiple members of a species can be used for genome annotation, suggesting a path for the annotation of the sequenced genomes of organisms occupying uncharacterized phylogenetic branches of the animal kingdom and raises the possibility that the resequencing of a large number of Homo sapiens individuals might be used to annotate the human genome and identify sequences defining traits unique to our species. The sequence data from this study has been submitted to GenBank under accession nos. AY667278-AY667407.

Boffelli, Dario; Weer, Claire V.; Weng, Li; Lewis, Keith D.; Shoukry, Malak I.; Pachter, Lior; Keys, David N.; Rubin, Edward M.

2004-07-15

298

Alfresco - A Workbench for Comparative Genomic Sequence Analysis  

Microsoft Academic Search

Comparative analysis of genomic sequences provides a powerful tool for identifying regions of potential biologic function; by comparing corresponding regions of genomes from suitable species, protein coding or regulatory regions can be identified by their homology. This requires the use of several specific types of computational analysis tools. Many programs exist for these types of analysis; not many exist for

Niclas Jareborg; Richard Durbin

2000-01-01

299

A Cryptographic Approach to Securely Share and Query Genomic Sequences  

Microsoft Academic Search

To support large-scale biomedical research projects, organizations need to share person-specific genomic sequences without violating the privacy of their data subjects. In the past, organizations protected subjects' identities by removing identifiers, such as name and social security number; however, recent investigations illustrate that deidentified genomic data can be "reidentified" to named individuals using simple automated methods. In this paper, we

Murat Kantarcioglu; Ying Liu; Bradley Malin

2008-01-01

300

Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis  

Microsoft Academic Search

We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ~160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated

J. M. Carlton; R. P. Hirt; J. C. Silva; A. L. Delcher; Michael Schatz; Qi Zhao; J. R. Wortman; S. L. Bidwell; U. C. M. Alsmark; Sébastien Besteiro; Thomas Sicheritz-Ponten; C. J. Noel; J. B. Dacks; P. G. Foster; Cedric Simillion; Y. Van de Peer; Diego Miranda-Saavedra; G. J. Barton; G. D. Westrop; S. Muller; Daniele Dessi; P. L. Fiori; Qinghu Ren; Ian Paulsen; Hanbang Zhang; F. D. Bastida-Corcuera; Augusto Simoes-Barbosa; M. T. Brown; R. D. Hayes; Mandira Mukherjee; C. Y. Okumura; Rachel Schneider; A. J. Smith; Stepanka Vanacova; Maria Villalvazo; B. J. Haas; Mihaela Pertea; Tamara V. Feldblyum; T. R. Utterback; Chung-Li Shu; Kazutoyo Osoegawa; P. J. de Jong; Ivan Hrdy; Lenka Horvathova; Zuzana Zubacova; Pavel Dolezal; Shehre-Banoo Malik; J. M. Logsdon; Katrin Henze; Arti Gupta; Ching C. Wang; R. L. Dunne; J. A. Upcroft; Peter Upcroft; Owen White; S. L. Salzberg; Petrus Tang; Cheng-Hsun Chiu; Ying-Shiung Lee; T. M. Embley; G. H. Coombs; J. C. Mottram; Jan Tachezy; C. M. Fraser-Liggett; P. J. Johnson

2007-01-01

301

Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...

302

Draft Genome Sequence of Mycobacterium austroafricanum DSM 44191  

PubMed Central

We announce the draft genome sequence of Mycobacterium austroafricanum DSM 44191T (= E9789-SA12441T), a non-tuberculosis species responsible for opportunistic infection. The genome described here has a size of 6,772,357 bp with a G+C content of 66.79% and contains 6,419 protein-coding genes and 112 RNA genes. PMID:24744336

Croce, Olivier; Robert, Catherine; Raoult, Didier

2014-01-01

303

Draft Genome Sequence of Enterobacter cloacae Strain JD6301  

PubMed Central

Enterobacter cloacae strain JD6301 was isolated from a mixed culture with wastewater collected from a municipal treatment facility and oleaginous microorganisms. A draft genome sequence of this organism indicates that it has a genome size of 4,772,910 bp, an average G+C content of 53%, and 4,509 protein-coding genes. PMID:24874669

Wilson, Jessica G.; French, William T.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Woyke, Tanja; Shapiro, Nicole; Bullard, James W.; Champlin, Franklin R.

2014-01-01

304

Draft genome sequences of 10 strains of the genus Exiguobacterium.  

PubMed

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Vishnivetskaya, Tatiana A; Chauhan, Archana; Layton, Alice C; Pfiffner, Susan M; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C; Markowitz, Victor M; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W; Pati, Amrita; Stamatis, Dimitrios; Reddy, T B K; Shapiro, Nicole; Nordberg, Henrik P; Cantor, Michael N; Hua, X Susan; Woyke, Tanja

2014-01-01

305

Draft Genome Sequences of 10 Strains of the Genus Exiguobacterium  

PubMed Central

High-quality draft genome sequences were determined for 10 Exiguobacterium strains in order to provide insight into their evolutionary strategies for speciation and environmental adaptation. The selected genomes include psychrotrophic and thermophilic species from a range of habitats, which will allow for a comparison of metabolic pathways and stress response genes. PMID:25323723

Chauhan, Archana; Layton, Alice C.; Pfiffner, Susan M.; Huntemann, Marcel; Copeland, Alex; Chen, Amy; Kyrpides, Nikos C.; Markowitz, Victor M.; Palaniappan, Krishna; Ivanova, Natalia; Mikhailova, Natalia; Ovchinnikova, Galina; Andersen, Evan W.; Pati, Amrita; Stamatis, Dimitrios; Reddy, T. B. K.; Shapiro, Nicole; Nordberg, Henrik P.; Cantor, Michael N.; Hua, X. Susan; Woyke, Tanja

2014-01-01

306

Draft Genome Sequence of Entomopathogenic Serratia liquefaciens Strain FK01  

PubMed Central

In the present study, we determined the draft genome sequence of the entomopathogenic bacterium Serratia liquefaciens FK01, which is highly virulent to the silkworm. The draft genome is ~5.28 Mb in size, and the G+C content is 55.8%. PMID:24970828

Taira, Erika; Mon, Hiroaki; Mori, Kazuki; Akasaka, Taiki; Tashiro, Kousuke; Yasunaga-Aoki, Chisa; Lee, Jae Man; Kusakabe, Takahiro

2014-01-01

307

Genome Sequence of Xanthomonas citri pv. mangiferaeindicae Strain LMG 941  

PubMed Central

We report the 5.1-Mb genome sequence of Xanthomonas citri pv. mangiferaeindicae strain LMG 941, the causal agent of bacterial black spot in mango. Apart from evolutionary studies, the draft genome will be a valuable resource for the epidemiological studies and quarantine of this phytopathogen. PMID:22582385

Midha, Samriti; Ranjan, Manish; Sharma, Vikas; Pinnaka, Anil Kumar

2012-01-01

308

The genome sequence and structure of rice chromosome 1  

Microsoft Academic Search

The rice species Oryza sativa is considered to be a model plant because of its small genome size, extensive genetic map, relative ease of transformation and synteny with other cereal crops. Here we report the essentially complete sequence of chromosome 1, the longest chromosome in the rice genome. We summarize characteristics of the chromosome structure and the biological insight gained

Takuji Sasaki; Takashi Matsumoto; Kimiko Yamamoto; Katsumi Sakata; Tomoya Baba; Yuichi Katayose; Jianzhong Wu; Yoshihito Niimura; Zhukuan Cheng; Yoshiaki Nagamura; Baltazar A. Antonio; Hiroyuki Kanamori; Satomi Hosokawa; Masatoshi Masukawa; Koji Arikawa; Yoshino Chiden; Mika Hayashi; Masako Okamoto; Tsuyu Ando; Hiroyoshi Aoki; Kohei Arita; Masao Hamada; Chizuko Harada; Saori Hijishita; Mikiko Honda; Yoko Ichikawa; Atsuko Idonuma; Masumi Iijima; Michiko Ikeda; Maiko Ikeno; Sachie Ito; Tomoko Ito; Yuichi Ito; Yukiyo Ito; Aki Iwabuchi; Kozue Kamiya; Wataru Karasawa; Satoshi Katagiri; Ari Kikuta; Noriko Kobayashi; Izumi Kono; Kayo Machita; Tomoko Maehara; Hiroshi Mizuno; Tatsumi Mizubayashi; Yoshiyuki Mukai; Hideki Nagasaki; Marina Nakashima; Yuko Nakama; Yumi Nakamichi; Mari Nakamura; Nobukazu Namiki; Manami Negishi; Isamu Ohta; Nozomi Ono; Shoko Saji; Kumiko Sakai; Michie Shibata; Takanori Shimokawa; Ayahiko Shomura; Jianyu Song; Yuka Takazaki; Kimihiro Terasawa; Kumiko Tsuji; Kazunori Waki; Harumi Yamagata; Hiroko Yamane; Shoji Yoshiki; Rie Yoshihara; Kazuko Yukawa; Huisun Zhong; Hisakazu Iwama; Toshinori Endo; Hidetaka Ito; Jang Ho Hahn; Ho-Il Kim; Moo-Young Eun; Masahiro Yano; Jiming Jiang; Takashi Gojobori

2002-01-01

309

Draft Genome Sequence of Necropsobacter rosorum Strain P709T  

PubMed Central

Necropsobacter is a recently described genus that contains a single species, N. rosorum, and belongs to the family Pasteurellaceae. Here, we present the draft genome of N. rosorum strain P709T, which is the first genome sequence from this species. PMID:25301642

Padmanabhan, Roshan; Robert, Catherine; Fenollar, Florence; Raoult, Didier

2014-01-01

310

Draft Genome Sequence of "Candidatus Liberibacter asiaticus" from California.  

PubMed

We report here the draft genome sequence of "Candidatus Liberibacter asiaticus" strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

Zheng, Z; Deng, X; Chen, J

2014-01-01

311

Draft Genome Sequence of “Candidatus Liberibacter asiaticus” from California  

PubMed Central

We report here the draft genome sequence of “Candidatus Liberibacter asiaticus” strain HHCA, collected from a lemon tree in California. The HHCA strain has a genome size of 1,150,620 bp, 36.5% G+C content, 1,119 predicted open reading frames, and 51 RNA genes. PMID:25278540

Zheng, Z.

2014-01-01

312

Genome sequence of the cultivated cotton Gossypium arboreum  

Technology Transfer Automated Retrieval System (TEKTRAN)

Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...

313

The tomato genome sequence provides insight into fleshy fruit evolution  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...

314

Complete Genome Sequence of Marinobacter sp. BSs20148.  

PubMed

Marinobacter sp. BSs20148 was isolated from marine sediment collected from the Arctic Ocean at a water depth of 3,800 m. Here we report the complete genome sequence of Marinobacter sp. BSs20148. This genomic information will facilitate the study of the physiological metabolism, ecological roles, and evolution of the Marinobacter species. PMID:23682144

Song, Lai; Ren, Lufeng; Li, Xingang; Yu, Dan; Yu, Yong; Wang, Xumin; Liu, Guiming

2013-01-01

315

Tandem Clusters of Membrane Proteins in Complete Genome Sequences  

E-print Network

... of genes coding for membrane proteins was investigated in 16 complete genomes: 4 archaea, 11 bacteria ... of isolated ATP-binding protein components in the ABC transporters. Possible implications of tandem cluster ... Tandem Clusters of Membrane Proteins in Complete Genome Sequences. Daisuke Kihara and Minoru ...

Kihara, Daisuke

316

Draft Genome Sequence of Highly Nematicidal Bacillus thuringiensis DB27  

PubMed Central

Here, we report the genome sequence of nematicidal Bacillus thuringiensis DB27, which provides first insights into the genetic determinants of its pathogenicity to nematodes. The genome consists of a 5.7-Mb chromosome and seven plasmids, three of which contain genes encoding nematicidal proteins. PMID:24558243

Corton, Craig; Pickard, Derek J.; Dougan, Gordon

2014-01-01

317

Draft Genome Sequence of Highly Nematicidal Bacillus thuringiensis DB27.  

PubMed

Here, we report the genome sequence of nematicidal Bacillus thuringiensis DB27, which provides first insights into the genetic determinants of its pathogenicity to nematodes. The genome consists of a 5.7-Mb chromosome and seven plasmids, three of which contain genes encoding nematicidal proteins. PMID:24558243

Iatsenko, Igor; Corton, Craig; Pickard, Derek J; Dougan, Gordon; Sommer, Ralf J

2014-01-01

318

Genome Sequence of a Thermophilic Bacillus, Geobacillus thermodenitrificans DSM465  

PubMed Central

Geobacillus thermodenitrificans NG80-2 encodes a LadA-mediated alkane degradation pathway, while G. thermodenitrificans DSM465 cannot utilize alkanes. Here, we report the draft genome sequence of G. thermodenitrificans DSM465, which may help reveal the genomic differences between these two strains in regards to the biodegradation of alkanes. PMID:24336381

Yao, Nana; Ren, Yi

2013-01-01

319

Complete Genome Sequence of Pronghorn Virus, a Pestivirus  

PubMed Central

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

Ridpath, Julia F.; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01
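
The figures in the pronghorn virus record above (an 11,694-base open reading frame encoding 3,897 amino acids) can be cross-checked with simple codon arithmetic. The sketch below assumes the reported ORF length includes the terminating stop codon; the abstract does not say so, and the helper name `orf_to_protein_length` is hypothetical.

```python
def orf_to_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of `orf_nt` bases.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and contributes no amino acid.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3  # each codon is three bases
    return codons - 1 if includes_stop else codons


# ORF length taken from the record above.
pronghorn_aa = orf_to_protein_length(11_694)
```

Under that assumption the 11,694 bases form 3,898 codons, of which the last is the stop codon, which is consistent with the 3,897 amino acids stated in the abstract.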

320

Complete Genome Sequence of the Soil Actinomycete Kocuria rhizophila  

Microsoft Academic Search

The soil actinomycete Kocuria rhizophila belongs to the suborder Micrococcineae, a divergent bacterial group for which only a limited amount of genomic information is currently available. K. rhizophila is also important in industrial applications; e.g., it is commonly used as a standard quality control strain for antimicrobial susceptibility testing. Sequencing and annotation of the genome of K. rhizophila DC2201 (NBRC

Hiromi Takarada; Mitsuo Sekine; Hiroki Kosugi; Yasunori Matsuo; Takatomo Fujisawa; Seiha Omata; Emi Kishi; Ai Shimizu; Naofumi Tsukatani; Satoshi Tanikawa; Nobuyuki Fujita; Shigeaki Harayama

2008-01-01

321

Complete genome sequence of pronghorn virus, a pestivirus.  

PubMed

The complete genome sequence of pronghorn virus, a member of the Pestivirus genus of the family Flaviviridae, was determined here. The virus, originally isolated from a pronghorn antelope, has a genome of 12,273 nucleotides, with a single open reading frame of 11,694 bases encoding 3,897 amino acids. PMID:24926058

Neill, John D; Ridpath, Julia F; Fischer, Nicole; Grundhoff, Adam; Postel, Alexander; Becher, Paul

2014-01-01

322

Complete genome sequence of Pronghorn Virus, a Pestivirus  

Technology Transfer Automated Retrieval System (TEKTRAN)

The complete genome sequence of Pronghorn virus, a member of the Pestivirus genus of the Flaviviridae, was determined. The virus, originally isolated from a pronghorn antelope, had a genome of 12,287 nucleotides with a single open reading frame of 11,694 bases encoding 3898 amino acids....

323

The genome sequence of the rice blast fungus Magnaporthe grisea  

Microsoft Academic Search

Magnaporthe grisea is the most destructive pathogen of rice worldwide and the principal model organism for elucidating the molecular basis of fungal disease of plants. Here, we report the draft sequence of the M. grisea genome. Analysis of the gene set provides an insight into the adaptations required by a fungus to cause disease. The genome encodes a large and

Ralph A. Dean; Nicholas J. Talbot; Daniel J. Ebbole; Mark L. Farman; Thomas K. Mitchell; Marc J. Orbach; Michael Thon; Resham Kulkarni; Jin-Rong Xu; Huaqin Pan; Nick D. Read; Yong-Hwan Lee; Ignazio Carbone; Doug Brown; Yeon Yee Oh; Nicole Donofrio; Jun Seop Jeong; Darren M. Soanes; Slavica Djonovic; Elena Kolomiets; Cathryn Rehmeyer; Weixi Li; Michael Harding; Soonok Kim; Marc-Henri Lebrun; Heidi Bohnert; Sean Coughlan; Jonathan Butler; Sarah Calvo; Li-Jun Ma; Robert Nicol; Seth Purcell; Chad Nusbaum; James E. Galagan; Bruce W. Birren

2005-01-01

324

Draft Genome Sequence of Pseudomonas sp. nov. H2  

PubMed Central

We report the draft genome sequence of Pseudomonas sp. nov. H2, isolated from creek sediment in Moscow, ID, USA. The strain is most closely related to Pseudomonas putida. However, it has a slightly smaller genome that appears to have been impacted by horizontal gene transfer and poorly maintains IncP-1 plasmids. PMID:25838493

Loftie-Eaton, Wesley; Suzuki, Haruo; Bashford, Kelsie; Heuer, Holger; Stragier, Pieter; De Vos, Paul; Settles, Matthew L.

2015-01-01

325

Genome Sequence of the Asiatic Species Borrelia persica  

PubMed Central

We report the complete genome sequence of Borrelia persica, the causative agent of tick-borne relapsing fever borreliosis on the Asian continent. Its genome of 1,784,979 bp contains 1,850 open reading frames, three ribosomal RNAs, and 32 tRNAs. One clustered regularly interspaced short palindromic repeat (CRISPR) was detected. PMID:24407639

Elbir, Haitham; Larsson, Pär; Normark, Johan; Upreti, Mukunda; Korenberg, Edward; Larsson, Christer

2014-01-01

326

Whole genome sequence of “Candidatus Liberibacter asiaticus” from Guangdong, China  

Technology Transfer Automated Retrieval System (TEKTRAN)

The draft genome sequence of “Candidatus Liberibacter asiaticus” strain A4, isolated from a mandarin citrus in Guangdong, P. R. China, is reported. The A4 strain has a genome size of 1,208,625 bp, G+C content of 36.4%, 1,107 predicted open reading frames, and 53 RNA genes....

327

Genomic and small RNA sequencing of  

E-print Network

... of sorghum as a reference genome sequence for Andropogoneae grasses ... Kankshita Swaminathan, Magdy ... origins of Mxg, and suggest that while the repeat content of Mxg differs from sorghum, the sorghum genome ... Included within the Andropogoneae are major crops such as maize, Sorghum bicolor (sorghum), sugarcane ...

Green, Pamela

328

A snapshot of the emerging tomato genome sequence  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of tomato (Solanum lycopersicum) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States) as part of a larger initiative called the ‘International Solanaceae Genome Proje...

329

MAIZE CHLOROTIC DWARF VIRUS GENOME SEQUENCE AND POLYPROTEIN CLEAVAGE  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genomic sequence (11.8 kb) of the severe Ohio Maize chlorotic dwarf virus isolate (MCDV-S, genus Waikavirus) was determined from overlapping cDNA clones. Approximately 400 kDa polyprotein encoded by the viral genome is post-translationally cleaved into several smaller functional proteins. Wher...

330

Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113  

PubMed Central

Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

2012-01-01

331

Completion of the Porcine Epidemic Diarrhoea Coronavirus (PEDV) Genome Sequence  

Microsoft Academic Search

The sequence of the replicase gene of porcine epidemic diarrhoea virus (PEDV) has been determined. This completes the sequence of the entire genome of strain CV777, which was found to be 28,033 nucleotides (nt) in length (excluding the poly A-tail). A cloning strategy, which involves primers based on conserved regions in the predicted ORF1 products from other coronaviruses whose genome

Rolf Kocherhans; Anne Bridgen; Mathias Ackermann; Kurt Tobler

2001-01-01

332

The Genome Sequence of the SARS-Associated Coronavirus  

Microsoft Academic Search

We sequenced the 29,751-base genome of the severe acute respiratory syndrome (SARS)-associated coronavirus known as the Tor2 isolate. The genome sequence reveals that this coronavirus is only moderately related to other known coronaviruses, including two human coronaviruses, HCoV-OC43 and HCoV-229E. Phylogenetic analysis of the predicted viral proteins indicates that the virus does not closely resemble any of the three previously

Marco A. Marra; Steven J. M. Jones; Caroline R. Astell; Robert A. Holt; Angela Brooks-Wilson; Yaron S. N. Butterfield; Jaswinder Khattra; Jennifer K. Asano; Sarah A. Barber; Susanna Y. Chan; Alison Cloutier; Shaun M. Coughlin; Doug Freeman; Noreen Girn; Obi L. Griffith; Stephen R. Leach; Michael Mayo; Helen McDonald; Stephen B. Montgomery; Pawan K. Pandoh; Anca S. Petrescu; A. Gordon Robertson; Jacqueline E. Schein; Asim Siddiqui; Duane E. Smailus; Jeff M. Stott; George S. Yang; Francis Plummer; Anton Andonov; Harvey Artsob; Nathalie Bastien; Kathy Bernard; Timothy F. Booth; Donnie Bowness; Michael Drebot; Lisa Fernando; Ramon Flick; Michael Garbutt; Michael Garbutt; Allen Grolla; Heinz Feldmann; Adrienne Meyers; Amin Kabani; Yan Li; Susan Normand; Ute Stroher; Graham A. Tipples; Shaun Tyler; Robert Vogrig; Diane Ward; Robert C. Brunham; Mel Krajden; Martin Petric; Danuta M. Skowronski; Chris Upton; Rachel L. Roper

2003-01-01

333

Complete Genome Sequence of the Methanogenic Archaeon, Methanococcus jannaschii  

Microsoft Academic Search

The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related

Carol J. Bult; Owen White; Gary J. Olsen; Lixin Zhou; Robert D. Fleischmann; Granger G. Sutton; Judith A. Blake; Lisa M. Fitzgerald; Rebecca A. Clayton; Jeannine D. Gocayne; Anthony R. Kerlavage; Brian A. Dougherty; Jean-Francois Tomb; Mark D. Adams; Claudia I. Reich; Ross Overbeek; Ewen F. Kirkness; Keith G. Weinstock; Joseph M. Merrick; Anna Glodek; John L. Scott; Neil S. M. Geoghagen; Janice F. Weidman; Joyce L. Fuhrmann; Dave Nguyen; Teresa R. Utterback; Jenny M. Kelley; Jeremy D. Peterson; Paul W. Sadow; Michael C. Hanna; Matthew D. Cotton; Kevin M. Roberts; Margaret A. Hurst; Brian P. Kaine; Mark Borodovsky; Hans-Peter Klenk; Claire M. Fraser; Hamilton O. Smith; Carl R. Woese; J. Craig Venter

1996-01-01

334

Use of information theory to study genome sequences  

NASA Astrophysics Data System (ADS)

The genome sequence carries information about life as an order of four bases. It is considered that this order indicates a special code structure. In this paper we discuss how the mutual entropy, the main concept in Shannon's communication theory, can be used to study genome sequences, and how a measure introduced in our previous paper [10] for the analysis of similarities of code structures is applied for examining the coding structure of several species, in particular, HIV-1.

Ohya, Masanori; Sato, Keiko

2000-12-01
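
The record above names mutual entropy from Shannon's communication theory as its main tool. The abstract does not give the authors' exact measure, so the following is only a minimal sketch of the textbook quantity, I(X;Y) = sum over (x, y) of p(x,y) log2[p(x,y) / (p(x) p(y))], estimated from the column-wise base pairs of two equal-length, already aligned sequences. The function name and the toy sequences are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(a: str, b: str) -> float:
    """Mutual information (in bits) between two aligned, equal-length sequences.

    Probabilities are plain frequency estimates over alignment columns;
    this is the textbook definition, not necessarily the paper's measure.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))  # counts of (base in a, base in b) per column
    count_a = Counter(a)        # marginal counts for sequence a
    count_b = Counter(b)        # marginal counts for sequence b
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_a[x] * count_b[y])
        mi += p_xy * math.log2(c * n / (count_a[x] * count_b[y]))
    return mi


identical = mutual_information("ACGT", "ACGT")    # one sequence fully determines the other
unrelated = mutual_information("AACC", "ACAC")    # columns are statistically independent
```

With these toy inputs, identical sequences give the full entropy of the uniform four-base source (2 bits), while the second pair gives zero because every base pairing is equally frequent.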

335

Reference genome sequence of the model plant Setaria  

Microsoft Academic Search

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ~400-Mb assembly covers ~80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We

Tuskan, Gerald A

2012-01-01

336

Salmonella serotype determination utilizing high-throughput genome sequencing data.  

PubMed

Serotyping forms the basis of national and international surveillance networks for Salmonella, one of the most prevalent foodborne pathogens worldwide (1-3). Public health microbiology is currently being transformed by whole-genome sequencing (WGS), which opens the door to serotype determination using WGS data. SeqSero (www.denglab.info/SeqSero) is a novel Web-based tool for determining Salmonella serotypes using high-throughput genome sequencing data. SeqSero is based on curated databases of Salmonella serotype determinants (rfb gene cluster, fliC and fljB alleles) and is predicted to determine serotype rapidly and accurately for nearly the full spectrum of Salmonella serotypes (more than 2,300 serotypes), from both raw sequencing reads and genome assemblies. The performance of SeqSero was evaluated by testing (i) raw reads from genomes of 308 Salmonella isolates of known serotype; (ii) raw reads from genomes of 3,306 Salmonella isolates sequenced and made publicly available by GenomeTrakr, a U.S. national monitoring network operated by the Food and Drug Administration; and (iii) 354 other publicly available draft or complete Salmonella genomes. We also demonstrated Salmonella serotype determination from raw sequencing reads of fecal metagenomes from mice orally infected with this pathogen. SeqSero can help to maintain the well-established utility of Salmonella serotyping when integrated into a platform of WGS-based pathogen subtyping and characterization. PMID:25762776

Zhang, Shaokang; Yin, Yanlong; Jones, Marcus B; Zhang, Zhenzhen; Deatherage Kaiser, Brooke L; Dinsmore, Blake A; Fitzgerald, Collette; Fields, Patricia I; Deng, Xiangyu

2015-05-01

337

Large-Scale Sequencing: The Future of Genomic Sciences Colloquium  

SciTech Connect

Genetic sequencing and the various molecular techniques it has enabled have revolutionized the field of microbiology. Examining and comparing the genetic sequences borne by microbes - including bacteria, archaea, viruses, and microbial eukaryotes - provides researchers insights into the processes microbes carry out, their pathogenic traits, and new ways to use microorganisms in medicine and manufacturing. Until recently, sequencing entire microbial genomes has been laborious and expensive, and the decision to sequence the genome of an organism was made on a case-by-case basis by individual researchers and funding agencies. Now, thanks to new technologies, the cost and effort of sequencing is within reach for even the smallest facilities, and the ability to sequence the genomes of a significant fraction of microbial life may be possible. The availability of numerous microbial genomes will enable unprecedented insights into microbial evolution, function, and physiology. However, the current ad hoc approach to gathering sequence data has resulted in an unbalanced and highly biased sampling of microbial diversity. A well-coordinated, large-scale effort to target the breadth and depth of microbial diversity would result in the greatest impact. The American Academy of Microbiology convened a colloquium to discuss the scientific benefits of engaging in a large-scale, taxonomically-based sequencing project. A group of individuals with expertise in microbiology, genomics, informatics, ecology, and evolution deliberated on the issues inherent in such an effort and generated a set of specific recommendations for how best to proceed. The vast majority of microbes are presently uncultured and, thus, pose significant challenges to such a taxonomically-based approach to sampling genome diversity. However, we have yet to even scratch the surface of the genomic diversity among cultured microbes. 
A coordinated sequencing effort of cultured organisms is an appropriate place to begin, since not only are their genomes available, but they are also accompanied by data on environment and physiology that can be used to understand the resulting data. As single cell isolation methods improve, there should be a shift toward incorporating uncultured organisms and communities into this effort. Efforts to sequence cultivated isolates should target characterized isolates from culture collections for which biochemical data are available, as well as other cultures of lasting value from personal collections. The genomes of type strains should be among the first targets for sequencing, but creative culture methods, novel cell isolation, and sorting methods would all be helpful in obtaining organisms we have not yet been able to cultivate for sequencing. The data that should be provided for strains targeted for sequencing will depend on the phylogenetic context of the organism and the amount of information available about its nearest relatives. Annotation is an important part of transforming genome sequences into useful resources, but it represents the most significant bottleneck to the field of comparative genomics right now and must be addressed. Furthermore, there is a need for more consistency in both annotation and achieving annotation data. As new annotation tools become available over time, re-annotation of genomes should be implemented, taking advantage of advancements in annotation techniques in order to capitalize on the genome sequences and increase both the societal and scientific benefit of genomics work. Given the proper resources, the knowledge and ability exist to be able to select model systems, some simple, some less so, and dissect them so that we may understand the processes and interactions at work in them. 
Colloquium participants suggest a five-pronged, coordinated initiative to exhaustively describe six different microbial ecosystems, designed to describe all the gene diversity, across genomes. In this effort, sequencing should be complemented by other experimental data, particularly transcriptomics and metabolomics data, all of which

Margaret Riley; Merry Buckley

2009-01-01

338

Mulan: multiple-sequence alignment to predict functional elements in genomic sequences.  

PubMed

Multiple sequence alignment analysis is a powerful approach for translating the evolutionary selective power into phylogenetic relationships to localize functional coding and noncoding genomic elements. The tool Mulan (http://mulan.dcode.org/) has been designed to effectively perform multiple comparisons of genomic sequences necessary to facilitate bioinformatic-driven biological discoveries. The Mulan network server is capable of comparing both closely and distantly related genomes to identify conserved elements over a broad range of evolutionary time. Several novel algorithms are brought together in this tool: the tba multisequence aligner program used to rapidly identify local sequence conservation and the multiTF program to detect evolutionarily conserved transcription factor binding sites in alignments. Mulan is integrated with the ECR Browser and the UCSC Genome Browser for quick uploads of available sequences, and it supports two-way communication with the GALA database to overlay GALA functional genome annotation with sequence conservation profiles. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Recently, we have also introduced the ability to handle duplications to permit the reliable reconstruction of evolutionary events that underlie the genome sequence data. Here, we describe the main features of the Mulan tool that include the interactive modification of critical conservation parameters, visualization options, and dynamic access to sequence data from visual graphs for flexible and easy-to-perform analysis of differentially evolving genomic regions. PMID:17993678

Loots, Gabriela G; Ovcharenko, Ivan

2007-01-01

339

Mitochondrial Genome Sequence of the Legume Vicia faba  

PubMed Central

The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found, 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376

Negruk, Valentine

2013-01-01

340

Genome sequence of the date palm Phoenix dactylifera L  

PubMed Central

Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm’s unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants. PMID:23917264

Al-Mssallem, Ibrahim S.; Hu, Songnian; Zhang, Xiaowei; Lin, Qiang; Liu, Wanfei; Tan, Jun; Yu, Xiaoguang; Liu, Jiucheng; Pan, Linlin; Zhang, Tongwu; Yin, Yuxin; Xin, Chengqi; Wu, Hao; Zhang, Guangyu; Ba Abdullah, Mohammed M.; Huang, Dawei; Fang, Yongjun; Alnakhli, Yasser O.; Jia, Shangang; Yin, An; Alhuzimi, Eman M.; Alsaihati, Burair A.; Al-Owayyed, Saad A.; Zhao, Duojun; Zhang, Sun; Al-Otaibi, Noha A.; Sun, Gaoyuan; Majrashi, Majed A.; Li, Fusen; Tala; Wang, Jixiang; Yun, Quanzheng; Alnassar, Nafla A.; Wang, Lei; Yang, Meng; Al-Jelaify, Rasha F.; Liu, Kan; Gao, Shenghan; Chen, Kaifu; Alkhaldi, Samiyah R.; Liu, Guiming; Zhang, Meng; Guo, Haiyan; Yu, Jun

2013-01-01

341

Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences  

PubMed Central

As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348

Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

2002-01-01
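
The alignment-free mapping idea in the record above (find short fragments that occur exactly once in the genome, then use them to position query sequences) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it handles a single strand, requires exact k-mer matches, holds everything in memory, and uses k = 4 on a made-up 12-base "genome" instead of the 15-bp fragments and the human genome described in the abstract. All function and variable names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional


def unique_kmer_index(genome: str, k: int = 15) -> Dict[str, int]:
    """Map every k-mer that occurs exactly once in `genome` to its 0-based start."""
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)
    # Keep only k-mers seen at a single position; repeated k-mers are ambiguous.
    return {kmer: pos[0] for kmer, pos in positions.items() if len(pos) == 1}


def map_sequence(query: str, index: Dict[str, int], k: int = 15) -> Optional[int]:
    """Return the inferred genome start of `query`, or None if no unique k-mer hits."""
    for offset in range(len(query) - k + 1):
        hit = index.get(query[offset:offset + k])
        if hit is not None:
            # The unique k-mer sits `offset` bases into the query.
            return hit - offset
    return None


# Toy data (assumed for illustration): "ACGT" occurs twice, so it is not unique.
toy_genome = "ACGTACGTTTGA"
toy_index = unique_kmer_index(toy_genome, k=4)
toy_position = map_sequence("ACGTTT", toy_index, k=4)
```

In the toy run the query's first 4-mer, `ACGT`, is skipped because it is repeated in the genome; the next one, `CGTT`, is unique and anchors the query. A query carrying a SNP can still be placed this way as long as at least one unique k-mer in its flanking sequence is unaffected by the variant.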

342

Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani  

Microsoft Academic Search

The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

Holger Brüggemann; Gerhard Gottschalk

343

Insights in metabolism and toxin production from the complete genome sequence of Clostridium tetani  

Microsoft Academic Search

The decryption of prokaryotic genome sequences progresses rapidly and provides the scientific community with an enormous amount of information. Clostridial genome sequencing projects have been finished only recently, starting with the genome of the solvent-producing Clostridium acetobutylicum in 2001. A lot of attention has been devoted to the genomes of pathogenic clostridia. In 2002, the genome sequence of C. perfringens,

Holger Brüggemann; Gerhard Gottschalk

2004-01-01

344

Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution  

SciTech Connect

Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

Schulman, Al

2009-08-09

345

Complete genome sequence of Serratia plymuthica strain AS12  

SciTech Connect

A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

2012-01-01

346

Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities  

Microsoft Academic Search

The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of

Kevin Chen; Lior Pachter

2005-01-01

347

Massively parallel sequencing: the new frontier of hematologic genomics  

PubMed Central

Genomic technologies are becoming a routine part of human genetic analysis. The exponential growth in DNA sequencing capability has brought an unprecedented understanding of human genetic variation and the identification of thousands of variants that impact human health. In this review, we describe the different types of DNA variation and provide an overview of existing DNA sequencing technologies and their applications. As genomic technologies and knowledge continue to advance, they will become integral in clinical practice. To accomplish the goal of personalized genomic medicine for patients, close collaborations between researchers and clinicians will be essential to develop and curate deep databases of genetic variation and their associated phenotypes. PMID:24021669

Nickerson, Deborah A.; Reiner, Alex P.

2013-01-01

348

GIST: Genomic island suite of tools for predicting genomic islands in genomic sequences  

PubMed Central

Genomic Islands (GIs) are genomic regions that originate from other organisms through a process known as Horizontal Gene Transfer (HGT). Detection of GIs plays a significant role in biomedical research since such alien genomic regions usually contain important features, such as pathogenic genes. We have developed a user-friendly graphical user interface, Genomic Island Suite of Tools (GIST), which is a platform for scientific users to predict GIs. This software package includes five commonly used tools: AlienHunter, IslandPath, Colombo SIGI-HMM, INDeGenIUS and Pai-Ida. It also includes an optimization program, EGID, that ensembles the results of existing tools for more accurate prediction. The tools in GIST can be used either separately or sequentially. GIST also includes a download feature that facilitates collecting the input genomes automatically from the FTP server of the National Center for Biotechnology Information (NCBI). GIST was implemented in Java, and was compiled and executed on Linux/Unix operating systems. Availability: The database is available for free at http://www5.esu.edu/cpsc/bioinfo/software/GIST PMID:22419842

Hasan, Mohammad Shabbir; Liu, Qi; Wang, Han; Fazekas, John; Chen, Bernard; Che, Dongsheng

2012-01-01

349

Genome sequence of the cultivated cotton Gossypium arboreum.  

PubMed

The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells. PMID:24836287

Li, Fuguang; Fan, Guangyi; Wang, Kunbo; Sun, Fengming; Yuan, Youlu; Song, Guoli; Li, Qin; Ma, Zhiying; Lu, Cairui; Zou, Changsong; Chen, Wenbin; Liang, Xinming; Shang, Haihong; Liu, Weiqing; Shi, Chengcheng; Xiao, Guanghui; Gou, Caiyun; Ye, Wuwei; Xu, Xun; Zhang, Xueyan; Wei, Hengling; Li, Zhifang; Zhang, Guiyin; Wang, Junyi; Liu, Kun; Kohel, Russell J; Percy, Richard G; Yu, John Z; Zhu, Yu-Xian; Wang, Jun; Yu, Shuxun

2014-06-01
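
As a quick sanity check on the figures in the Gossypium arboreum abstract above, total sequence divided by fold coverage gives the genome size those two numbers imply. This is a minimal arithmetic sketch; the function name is hypothetical, and the resulting value is derived here from the abstract's numbers rather than quoted from the authors.

```python
def implied_genome_size_gb(total_sequence_gb: float, fold_coverage: float) -> float:
    """Genome size implied by total sequenced bases and average fold coverage."""
    if fold_coverage <= 0:
        raise ValueError("fold coverage must be positive")
    return total_sequence_gb / fold_coverage


# Values taken from the abstract: 193.6 Gb of clean sequence at 112.6-fold coverage.
size_gb = implied_genome_size_gb(193.6, 112.6)  # about 1.72 Gb
```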

350

A DRAFT SEQUENCE OF THE RICE GENOME (ORYZA SATIVA L. SSP. INDICA)  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the japonica subspecies of rice, an important cereal and model monocot, was sequenced and assembled by whole-genome shotgun sequencing. The assembled sequence covers 93% of the 420-megabase genome. Gene predictions on the assembled sequence suggest that the genome contains 32,000 to 50...

351

Draft genome sequence of adzuki bean, Vigna angularis.  

PubMed

Adzuki bean (Vigna angularis var. angularis) is a dietary legume crop in East Asia. The presumed progenitor (Vigna angularis var. nipponensis) is widely found in East Asia, suggesting speciation and domestication in these temperate climate regions. Here, we report a draft genome sequence of adzuki bean. The genome assembly covers 75% of the estimated genome and was mapped to 11 pseudo-chromosomes. Gene prediction revealed 26,857 high confidence protein-coding genes evidenced by RNAseq of different tissues. Comparative gene expression analysis with V. radiata showed that the tissue specificity of orthologous genes was highly conserved. Additional re-sequencing of wild adzuki bean, V. angularis var. nipponensis, and V. nepalensis, was performed to analyze the variations between cultivated and wild adzuki bean. The determined divergence time of adzuki bean and the wild species predated archaeology-based domestication time. The present genome assembly will accelerate the genomics-assisted breeding of adzuki bean. PMID:25626881

Kang, Yang Jae; Satyawan, Dani; Shim, Sangrea; Lee, Taeyoung; Lee, Jayern; Hwang, Won Joo; Kim, Sue K; Lestari, Puji; Laosatit, Kularb; Kim, Kil Hyun; Ha, Tae Joung; Chitikineni, Annapurna; Kim, Moon Young; Ko, Jong-Min; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Yeong-Ho; Park, Beom-Seok; Varshney, Rajeev K; Lee, Suk-Ha

2015-01-01

352

Draft Genome Sequences of Two Virulent Serotypes of Avian Pasteurella multocida  

PubMed Central

Here we report the draft genome sequences of two virulent avian strains of Pasteurella multocida. Comparative analyses of these genomes were done with the published genome sequence of avirulent P. multocida strain Pm70. PMID:23405337

Abrahante, Juan E.; Johnson, Timothy J.; Hunter, Samuel S.; Maheswaran, Samuel K.; Hauglund, Melissa J.; Bayles, Darrell O.; Tatum, Fred M.

2013-01-01

353

Complete Genome Sequence of Equine Herpesvirus Type 9  

PubMed Central

Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

Yamaguchi, Tsuyoshi; Yamada, Souichi

2012-01-01

354

Transcriptome and genome sequencing uncovers functional variation in humans  

PubMed Central

Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

2013-01-01

355

Finding diagnostic phenotypic features of Photobacterium in the genome sequences.  

PubMed

Photobacterium species are ubiquitous in the aquatic environment and can be found in association with animal hosts including pathogenic and mutualistic associations. The traditional phenotypic characterization of Photobacterium is expensive, time-consuming and restricted to a limited number of features. An alternative is to infer phenotypic information directly from whole genome sequences. The present study evaluates the usefulness of whole genome sequences as a source of phenotypic information and compares diagnostic phenotypes of the Photobacterium species from the literature with the predicted phenotypes obtained from whole genome sequences. All genes coding for the specific proteins involved in metabolic pathways responsible for positive phenotypes of the seventeen diagnostic features were found in the majority of the Photobacterium genomes. In the Photobacterium species that were negative for a given phenotype, at least one or several genes involved in the respective biochemical pathways were absent. PMID:25724129

Amaral, Gilda Rose S; Campeão, Mariana E; Swings, Jean; Thompson, Fabiano L; Thompson, Cristiane C

2015-05-01

356

Complete genome sequence of equine herpesvirus type 9.  

PubMed

Equine herpesvirus type 9 (EHV-9), which we isolated from a case of epizootic encephalitis in a herd of Thomson's gazelles (Gazella thomsoni) in 1993, has been known to cause fatal encephalitis in Thomson's gazelle, giraffe, and polar bear in natural infections. Our previous report indicated that EHV-9 was similar to the equine pathogen equine herpesvirus type 1 (EHV-1), which mainly causes abortion, respiratory infection, and equine herpesvirus myeloencephalopathy. We determined the genome sequence of EHV-9. The genome has a length of 148,371 bp and all 80 of the open reading frames (ORFs) found in the genome of EHV-1. The nucleotide sequences of the ORFs in EHV-9 were 86 to 95% identical to those in EHV-1. The whole genome sequence should help to reveal the neuropathogenicity of EHV-9. PMID:23166237

Fukushi, Hideto; Yamaguchi, Tsuyoshi; Yamada, Souichi

2012-12-01

357

Complete genome sequence of Streptobacillus moniliformis type strain (9901T)  

SciTech Connect

Streptobacillus moniliformis Levaditi et al. 1925 is the sole and type species of the genus, and is of phylogenetic interest because of its isolated location in the sparsely populated and neither taxonomically nor genomically much accessed family 'Leptotrichiaceae' within the phylum 'Fusobacteria'. S. moniliformis, a Gram-negative, non-motile and pleomorphic bacterium, is the etiologic agent of rat bite fever and Haverhill fever. Strain 9901T, the type strain of the species, was isolated from a patient with rat bite fever. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is only the second completed genome sequence of the order 'Fusobacteriales' and no more than the third sequence from the phylum 'Fusobacteria'. The 1,662,578 bp long chromosome and the 10,702 bp plasmid with a total of 1511 protein-coding and 55 RNA genes are part of the Genomic Encyclopedia of Bacteria and Archaea project.

Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Sims, David [Los Alamos National Laboratory (LANL); Meincke, Linda [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Sproer, Cathrin [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL)

2009-01-01

358

Widespread Endogenization of Genome Sequences of Non-Retroviral RNA Viruses into Plant Genomes  

Microsoft Academic Search

Non-retroviral RNA virus sequences (NRVSs) have been found in the chromosomes of vertebrates and fungi, but not plants. Here we report similarly endogenized NRVSs derived from plus-, negative-, and double-stranded RNA viruses in plant chromosomes. These sequences were found by searching public genomic sequence databases, and, importantly, most NRVSs were subsequently detected by direct molecular analyses of plant DNAs. The

Sotaro Chiba; Hideki Kondo; Akio Tani; Daisuke Saisho; Wataru Sakamoto; Satoko Kanematsu; Nobuhiro Suzuki

2011-01-01

359

Brucella microti: the genome sequence of an emerging pathogen  

Microsoft Academic Search

BACKGROUND: Using a combination of pyrosequencing and conventional Sanger sequencing, the complete genome sequence of the recently described novel Brucella species, Brucella microti, was determined. B. microti is a member of the genus Brucella within the Alphaproteobacteria, which consists of medically important highly pathogenic facultative intracellular bacteria. In contrast to all other Brucella species, B. microti is a fast growing

Stéphane Audic; Magali Lescot; Jean-Michel Claverie; Holger C Scholz

2009-01-01

360

MAP2: multiple alignment of syntenic genomic sequences  

Microsoft Academic Search

We describe a multiple alignment program named MAP2 based on a generalized pairwise global alignment algorithm for handling long, different intergenic and intragenic regions in genomic sequences. The MAP2 program produces an ordered list of local multiple alignments of similar regions among sequences, where different regions between local alignments are indicated by reporting only similar regions. We propose

Liang Ye; Xiaoqiu Huang

2005-01-01

361

Environmental Genome Shotgun Sequencing of the Sargasso Sea  

Microsoft Academic Search

We have applied "whole-genome shotgun sequencing" to microbial populations collected en masse on tangential flow and impact filters from seawater samples collected from the Sargasso Sea near Bermuda. A total of 1.045 billion base pairs of nonredundant sequence was generated, annotated, and analyzed to elucidate the gene content, diversity, and relative abundance of the organisms within these environmental samples. These

J. Craig Venter; Karin Remington; John F. Heidelberg; Aaron L. Halpern; Doug Rusch; Dongying Wu; Ian Paulsen; Karen E. Nelson; William Nelson; Derrick E. Fouts; Samuel Levy; Anthony H. Knap; Michael W. Lomas; Ken Nealson; Owen White; Jeremy Peterson; Jeff Hoffman; Rachel Parsons; Holly Baden-Tillson; Cynthia Pfannkoch; Yu-Hui Rogers; Hamilton O. Smith

2004-01-01

362

Characterization of microsatellites revealed by genomic sequencing of Populus trichocarpa  

Microsoft Academic Search

Microsatellites or simple sequence repeats (SSRs) are highly polymorphic, codominant markers that have great value for the construction of genetic maps, comparative mapping, population genetic surveys, and paternity analyses. Here, we report the development and testing of a set of SSR markers derived from shotgun sequencing from Populus trichocarpa Torr. & A. Gray, a nonenriched genomic DNA library, and

Gerald A. Tuskan; Lee E. Gunter; Zamin K. Yang; TongMing Yin; Mitchell M. Sewell; Stephen P. DiFazio

2004-01-01
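
To make concrete what a microsatellite looks like in sequence data, the sketch below scans a DNA string for short units repeated in tandem. It is only an illustration under stated assumptions: the function name `find_ssrs` and the thresholds (units of 2 to 6 bp, at least 4 copies) are hypothetical choices and are not the criteria used in the Populus study above.

```python
import re


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 6, min_repeats: int = 4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat found.

    The lazy quantifier makes the regex prefer the shortest repeating unit, and
    the backreference requires that unit to recur at least min_repeats - 1 more times.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy usage: a (GA)5 repeat embedded in non-repetitive flanking sequence.
ssrs = find_ssrs("TTCAT" + "GA" * 5 + "CTTGC")
```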

363

GENOMIC SEQUENCE ANALYSIS OF LEPTOSPIRA BORGPETERSENII SEROVAR HARDJO  

Technology Transfer Automated Retrieval System (TEKTRAN)

A genomic library from Leptospira borgpetersenii serovar hardjo strain JB197 was prepared by mechanically shearing the DNA and inserting it into a positive selection vector. DNA was prepared from approximately 22,000 random clones and used as templates for automated sequencing. Sequence data was c...

364

PHYTOPHTHORA GENOME SEQUENCES UNCOVER EVOLUTIONARY ORIGINS AND MECHANISMS OF PATHOGENESIS  

Technology Transfer Automated Retrieval System (TEKTRAN)

Draft genome sequences of the soybean pathogen Phytophthora sojae and the sudden oak death pathogen Phytophthora ramorum have been determined. Oomycetes such as these Phytophthora species share the kingdom Stramenopiles with photosynthetic algae such as diatoms, and the Phytophthora sequences sugges...

365

Sequencing the Genome of the Heirloom Watermelon Cultivar Charleston Gray  

Technology Transfer Automated Retrieval System (TEKTRAN)

The genome of the watermelon cultivar Charleston Gray, a major heirloom which has been used in breeding programs of many watermelon cultivars, was sequenced. Our strategy involved a hybrid approach using the Illumina and 454/Titanium next-generation sequencing technologies. For Illumina, shotgun g...

366

The impact of next-generation sequencing on genomics  

Microsoft Academic Search

This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also

Jun Zhang; Rod Chiodini; Ahmed Badr; Genfa Zhang

2011-01-01

367

Initial sequence and comparative analysis of the cat genome  

PubMed Central

The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ~65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172

Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

2007-01-01

368

Sequence Analysis of the Genome of Carnation (Dianthus caryophyllus L.)  

PubMed Central

The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. ‘Francesco’ was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568 887 315 bp, consisting of 45 088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16 644 bp and 60 737 bp, respectively, and the longest scaffold was 1 287 144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. PMID:24344172

Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

2014-01-01
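
The carnation abstract above reports N50 values for contigs and scaffolds. For readers unfamiliar with the statistic, the sketch below computes it under the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from longest to shortest until half of the total length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy usage: total length is 54, and 10 + 9 + 8 = 27 reaches half of it.
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```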

369

[Organization of simple sequences in the Drosophila melanogaster genome].  

PubMed

Fragments of Drosophila melanogaster DNA that intensively hybridize with simple sequences poly[(dG-dT).(dC-dA)], poly[(dA).(dT)] and poly[(dG-dA).(dC-dT)] were cloned. The first two types of simple sequences are organized in these clones as separated stretches of moderate length, repeated many times within 12-15 kb. Each cluster contains only one type of the simple sequences and originates from a unique in the genome. In contrast, poly[(dG-dA).(dC-dT)] occurs in the genome as several isolated motifs. PMID:2978049

Vashakidze, R P; Mamulashvili, N A; Kalandarishvili, K G; Kolchinskiĭ, A M; Zaalishvili, M M

1988-01-01

370

Genome sequence-independent identification of RNA editing sites.  

PubMed

RNA editing generates post-transcriptional sequence changes that can be deduced from RNA-seq data, but detection typically requires matched genomic sequence or multiple related expression data sets. We developed the GIREMI tool (genome-independent identification of RNA editing by mutual information; https://www.ibp.ucla.edu/research/xiao/GIREMI.html) to predict adenosine-to-inosine editing accurately and sensitively from a single RNA-seq data set of modest sequencing depth. Using GIREMI on existing data, we observed tissue-specific and evolutionary patterns in editing sites in the human population. PMID:25730491

Zhang, Qing; Xiao, Xinshu

2015-04-01

371

Whole Genome Sequencing Transcriptome identification and analysis by next generation sequencing  

E-print Network

Whole Genome Sequencing. Transcriptome identification and analysis by next generation sequencing. Open reading frames on a chromosome; Illumina sequencing reads; gene with low expression; gene with high expression. RNA-seq data: each horizontal track is from a different developmental stage. Figure 1

372

The complete mitochondrial genome sequence of Xenocypris davidi (Bleeker).  

PubMed

Xenocypris davidi is a member of Cyprinidae and is widely distributed in China. To understand the systematic status of this species, we sequenced the whole mitochondrial genome of Xenocypris davidi. The complete mitochondrial genome is 16,630 bp in length and has the typical structure of 22 tRNA genes, 2 rRNA genes, 13 protein-coding genes and the non-coding region. The major non-coding sequence is the control region, which contains 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F). The second non-coding sequence is the origin of light-strand replication (OL). This region has the potential to fold into a stem-loop secondary structure. The mitochondrial genomic sequence will help us to study the conservation genetics and evolution of Xenocypris. PMID:23815323

Liu, Yu

2014-10-01

373

The Diploid Genome Sequence of an Individual Human  

Microsoft Academic Search

Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison

Samuel Levy; Granger Sutton; Pauline C. Ng; Lars Feuk; Aaron L. Halpern; Brian P. Walenz; Nelson Axelrod; Jiaqi Huang; Ewen F. Kirkness; Gennady Denisov; Yuan Lin; Jeffrey R. MacDonald; Andy Wing Chun Pang; Mary Shago; Timothy B. Stockwell; Alexia Tsiamouri; Vineet Bafna; Vikas Bansal; Saul A. Kravitz; Dana A. Busam; Karen Y. Beeson; Tina C. McIntosh; Karin A. Remington; Josep F. Abril; John Gill; Jon Borman; Yu-Hui Rogers; Marvin E. Frazier; Stephen W. Scherer; Robert L. Strausberg; J. Craig Venter

2007-01-01

374

A strategy for sequencing the genome 5 years early  

SciTech Connect

In meetings over the past 6 weeks, two respected gene sequencers have been delivering the message that the chief goal of the Human Genome Project - obtaining a complete sequence of the 3 billion bases in human DNA - can be achieved as early as 2001, 5 years ahead of schedule. This assumes a basic shift from mapping to sequencing. The interest in and controversy surrounding this announcement are discussed in this article.

Marshall, E.

1995-02-10

375

A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)  

ScienceCinema

Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

FitzGerald, Michael [Broad Institute

2013-02-12

376

Sequence-Based Mapping of the Polyploid Wheat Genome  

PubMed Central

The emergence of new sequencing technologies has provided fast and cost-efficient strategies for high-resolution mapping of complex genomes. Although these approaches hold great promise to accelerate genome analysis, their application in studying genetic variation in wheat has been hindered by the complexity of its polyploid genome. Here, we applied the next-generation sequencing of a wheat doubled-haploid mapping population for high-resolution gene mapping and tested its utility for ordering shotgun sequence contigs of a flow-sorted wheat chromosome. A bioinformatical pipeline was developed for reliable variant analysis of sequence data generated for polyploid wheat mapping populations. The results of variant mapping were consistent with the results obtained using the wheat 9000 SNP iSelect assay. A reference map of the wheat genome integrating 2740 gene-associated single-nucleotide polymorphisms from the wheat iSelect assay, 1351 diversity array technology, 118 simple sequence repeat/sequence-tagged sites, and 416,856 genotyping-by-sequencing markers was developed. By analyzing the sequenced megabase-size regions of the wheat genome we showed that mapped markers are located within 40–100 kb from genes providing a possibility for high-resolution mapping at the level of a single gene. In our population, gene loci controlling a seed color phenotype cosegregated with 2459 markers including one that was located within the red seed color gene. We demonstrate that the high-density reference map presented here is a useful resource for gene mapping and linking physical and genetic maps of the wheat genome. PMID:23665877

Saintenac, Cyrille; Jiang, Dayou; Wang, Shichen; Akhunov, Eduard

2013-01-01

377

Corruption of genomic databases with anomalous sequence.  

PubMed Central

We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%. PMID:1614861

Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

1992-01-01
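
The identity figures in the abstract above (96.8% and 94.8%) are per-position match rates over aligned regions. A minimal Python sketch of that kind of calculation follows, assuming two already-aligned strings of equal length with "-" marking gaps. The function name and the toy sequences are illustrative assumptions, not taken from the paper, and counting gap columns as mismatches is one convention among several.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns where both sequences carry the same base.

    Gap columns ("-") count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: a database entry against a vector segment.
entry = "GATTACAGATTACA"
vector = "GATTACAGCTTACA"
print(round(percent_identity(entry, vector), 1))

# Number of entries implied by "0.23% of 20,000 sequences".
print(round(0.0023 * 20_000))
```

The last line only restates the abstract's proportion as a count, which comes to roughly 46 affected entries.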

378

Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)  

DOE Data Explorer

Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002

Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.
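
The tag-and-count idea in the GST description can be sketched in silico. The Python below is only an illustration under stated assumptions: the fragmenting-enzyme site is taken to be the string "CATG", the 21-bp tag is taken to start immediately downstream of that site, and only one strand is scanned. None of these details is specified in the record, and the real MmeI cut geometry is not modeled. The function names are hypothetical.

```python
from collections import Counter


def extract_tags(dna: str, site: str = "CATG", tag_len: int = 21) -> list[str]:
    """Return the tag_len bases immediately downstream of every site occurrence."""
    dna = dna.upper()
    tags = []
    start = dna.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = dna[tag_start:tag_start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the end of the sequence
            tags.append(tag)
        start = dna.find(site, start + 1)
    return tags


def tag_profile(samples: list[str], site: str = "CATG", tag_len: int = 21) -> Counter:
    """Count how often each tag occurs across a collection of DNA sequences."""
    counts: Counter = Counter()
    for dna in samples:
        counts.update(extract_tags(dna, site=site, tag_len=tag_len))
    return counts


# Hypothetical toy input: the same fragment seen twice yields a tag count of 2.
fragment = "CATG" + "ACGTACGTACGTACGTACGTA"
print(tag_profile([fragment, fragment]).most_common(1))
```

A `Counter` keeps the profile as a simple tag-to-abundance mapping, which mirrors the "tag sequences and abundances" wording of the record.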

379

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change.  

PubMed

We report the 207-Mb genome sequence of the North American Arabidopsis lyrata strain MN47 based on 8.3× dideoxy sequence coverage. We predict 32,670 genes in this outcrossing species compared to the 27,025 genes in the selfing species Arabidopsis thaliana. The much smaller 125-Mb genome of A. thaliana, which diverged from A. lyrata 10 million years ago, likely constitutes the derived state for the family. We found evidence for DNA loss from large-scale rearrangements, but most of the difference in genome size can be attributed to hundreds of thousands of small deletions, mostly in noncoding DNA and transposons. Analysis of deletions and insertions still segregating in A. thaliana indicates that the process of DNA loss is ongoing, suggesting pervasive selection for a smaller genome. The high-quality reference genome sequence for A. lyrata will be an important resource for functional, evolutionary and ecological studies in the genus Arabidopsis. PMID:21478890

Hu, Tina T; Pattyn, Pedro; Bakker, Erica G; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M; Fahlgren, Noah; Fawcett, Jeffrey A; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D; Ossowski, Stephan; Ottilar, Robert P; Salamov, Asaf A; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E; Bergelson, Joy; Carrington, James C; Gaut, Brandon S; Schmutz, Jeremy; Mayer, Klaus F X; Van de Peer, Yves; Grigoriev, Igor V; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

2011-05-01

380

Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner  

PubMed Central

We define a “threaded blockset,” which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for “threaded blockset aligner”) builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser. PMID:15060014

Blanchette, Mathieu; Kent, W. James; Riemer, Cathy; Elnitski, Laura; Smit, Arian F.A.; Roskin, Krishna M.; Baertsch, Robert; Rosenbloom, Kate; Clawson, Hiram; Green, Eric D.; Haussler, David; Miller, Webb

2004-01-01

381

Widespread mitovirus sequences in plant genomes  

PubMed Central

The exploration of the evolution of RNA viruses has been aided recently by the discovery of copies of fragments or complete genomes of non-retroviral RNA viruses (Non-retroviral Endogenous RNA Viral Elements, or NERVEs) in many eukaryotic nuclear genomes. Among the most prominent NERVEs are partial copies of the RNA dependent RNA polymerase (RdRP) of the mitoviruses in plant mitochondrial genomes. Mitoviruses are in the family Narnaviridae, which are the simplest viruses, encoding only a single protein (the RdRP) in their unencapsidated viral plus strand. Narnaviruses are known only in fungi, and the origin of plant mitochondrial mitovirus NERVEs appears to be horizontal transfer from plant pathogenic fungi. At least one mitochondrial mitovirus NERVE, but not its nuclear copy, is expressed. PMID:25870770

Warner, Benjamin E.; Yerramsetty, Pradeep

2015-01-01

382

Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies  

Microsoft Academic Search

With genome sequencing efforts increasing exponentially, valuable information accumulates on genomic content of the various organisms sequenced. Projector 2 uses (un)finished genomic sequences of an organism as a template to infer linkage information for a genome sequence assembly of a related organism being sequenced. The remaining gaps between contigs for which no linkage information is present can

Sacha A. F. T. Van Hijum; Aldert L. Zomer; Oscar P. Kuipers; Jan Kok

2005-01-01

383

Genome Sequence of the Pea Aphid Acyrthosiphon pisum  

PubMed Central

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

2010-01-01

384

Genome sequence of the pea aphid Acyrthosiphon pisum.  

PubMed

Aphids are important agricultural pests and also biological models for studies of insect-plant interactions, symbiosis, virus vectoring, and the developmental causes of extreme phenotypic plasticity. Here we present the 464 Mb draft genome assembly of the pea aphid Acyrthosiphon pisum. This first published whole genome sequence of a basal hemimetabolous insect provides an outgroup to the multiple published genomes of holometabolous insects. Pea aphids are host-plant specialists, they can reproduce both sexually and asexually, and they have coevolved with an obligate bacterial symbiont. Here we highlight findings from whole genome analysis that may be related to these unusual biological features. These findings include discovery of extensive gene duplication in more than 2000 gene families as well as loss of evolutionarily conserved genes. Gene family expansions relative to other published genomes include genes involved in chromatin modification, miRNA synthesis, and sugar transport. Gene losses include genes central to the IMD immune pathway, selenoprotein utilization, purine salvage, and the entire urea cycle. The pea aphid genome reveals that only a limited number of genes have been acquired from bacteria; thus the reduced gene count of Buchnera does not reflect gene transfer to the host genome. The inventory of metabolic genes in the pea aphid genome suggests that there is extensive metabolite exchange between the aphid and Buchnera, including sharing of amino acid biosynthesis between the aphid and Buchnera. The pea aphid genome provides a foundation for post-genomic studies of fundamental biological questions and applied agricultural problems. PMID:20186266

2010-02-01

385

Comprehensive Genome Sequence Analysis of a Breast Cancer Amplicon  

PubMed Central

Gene amplification occurs in most solid tumors and is associated with poor prognosis. Amplification of 20q13.2 is common to several tumor types including breast cancer. The 1 Mb of sequence spanning the 20q13.2 breast cancer amplicon is one of the most exhaustively studied segments of the human genome. These studies have included amplicon mapping by comparative genomic hybridization (CGH), fluorescent in-situ hybridization (FISH), array-CGH, quantitative microsatellite analysis (QUMA), and functional genomic studies. Together these studies revealed a complex amplicon structure suggesting the presence of at least two driver genes in some tumors. One of these, ZNF217, is capable of immortalizing human mammary epithelial cells (HMEC) when overexpressed. In addition, we now report the sequencing of this region in human and mouse, and on quantitative expression studies in tumors. Amplicon localization now is straightforward and the availability of human and mouse genomic sequence facilitates their functional analysis. However, comprehensive annotation of megabase-scale regions requires integration of vast amounts of information. We present a system for integrative analysis and demonstrate its utility on 1.2 Mb of sequence spanning the 20q13.2 breast cancer amplicon and 865 kb of syntenic murine sequence. We integrate tumor genome copy number measurements with exhaustive genome landscape mapping, showing that amplicon boundaries are associated with maxima in repetitive element density and a region of evolutionary instability. This integration of comprehensive sequence annotation, quantitative expression analysis, and tumor amplicon boundaries provide evidence for an additional driver gene prefoldin 4 (PFDN4), coregulated genes, conserved noncoding regions, and associate repetitive elements with regions of genomic instability at this locus. PMID:11381030

Collins, Colin; Volik, Stanislav; Kowbel, David; Ginzinger, David; Ylstra, Bauke; Cloutier, Thomas; Hawkins, Trevor; Predki, Paul; Martin, Christopher; Wernick, Meredith; Kuo, Wen-Lin; Alberts, Arthur; Gray, Joe W.

2001-01-01

386

The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences  

E-print Network

after the first plant genome sequence was completed [1], of the genome sequence of the flowering plant Arabidopsis genome reference sequence would fill a great evolutionary gap, but it * Correspondence: dbneale@ucdavis.edu Department of Plant

2010-01-01

387

The complete sequence of a human parainfluenzavirus 4 genome.  

PubMed

Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

2009-06-01

388

Plasmodium knowlesi Genome Sequences from Clinical Isolates Reveal Extensive Genomic Dimorphism  

PubMed Central

Plasmodium knowlesi is a newly described zoonosis that causes malaria in the human population that can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses, of even small numbers of difficult clinical malaria isolates, can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology. PMID:25830531

Millar, Scott B.; Sanderson, Theo; Otto, Thomas D.; Lu, Woon Chan; Krishna, Sanjeev; Rayner, Julian C.; Cox-Singh, Janet

2015-01-01

389

Human Genomic Sequences That Inhibit Splicing  

Microsoft Academic Search

Mammalian genes are characterized by relatively small exons surrounded by variable lengths of intronic sequence. Sequences similar to the splice signals that define the 5′ and 3′ boundaries of these exons are also present in abundance throughout the surrounding introns. What causes the real sites to be distinguished from the multitude of pseudosites in pre-mRNA is unclear. Much progress has

WILLIAM G. FAIRBROTHER; LAWRENCE A. CHASIN

2000-01-01

390

Inconsistencies in Neanderthal Genomic DNA Sequences  

Microsoft Academic Search

Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing

Jeffrey D. Wall; Sung K. Kim

2007-01-01

391

Building a model: developing genomic resources for common milkweed ( Asclepias syriaca ) with low coverage genome sequencing  

Microsoft Academic Search

Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development

Shannon CK Straub; Mark Fishbein; Tatyana Livshultz; Zachary Foster; Matthew Parks; Kevin Weitemier; Richard C Cronn; Aaron Liston

2011-01-01

392

Complete Chloroplast Genome Sequence of a Major Allogamous Forage Species, Perennial Ryegrass (Lolium perenne L.)  

PubMed Central

Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1–27 codons in comparison of L. perenne to other Poaceae and 1–68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT–PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan. PMID:19414502

Diekmann, Kerstin; Hodkinson, Trevor R.; Wolfe, Kenneth H.; van den Bekerom, Rob; Dix, Philip J.; Barth, Susanne

2009-01-01

393

Whole-genome haplotyping by dilution, amplification, and sequencing.  

PubMed

Standard whole-genome genotyping technologies are unable to determine haplotypes. Here we describe a method for rapid and cost-effective long-range haplotyping. Genomic DNA is diluted and distributed into multiple aliquots such that each aliquot receives a fraction of a haploid copy. The DNA template in each aliquot is amplified by multiple displacement amplification, converted into barcoded sequencing libraries using Nextera technology, and sequenced in multiplexed pools. To assess the performance of our method, we combined two male genomic DNA samples at equal ratios, resulting in a sample with diploid X chromosomes with known haplotypes. Pools of the multiplexed sequencing libraries were subjected to targeted pull-down of a 1-Mb contiguous region of the X-chromosome Duchenne muscular dystrophy gene. We were able to phase the Duchenne muscular dystrophy region into two contiguous haplotype blocks with a mean length of 494 kb. The haplotypes showed 99% agreement with the consensus base calls made by sequencing the individual DNAs. We subsequently used the strategy to haplotype two human genomes. Standard genomic sequencing to identify all heterozygous SNPs in the sample was combined with dilution-amplification-based sequencing data to resolve the phase of identified heterozygous SNPs. Using this procedure, we were able to phase >95% of the heterozygous SNPs from the diploid sequence data. The N50 for a Yoruba male DNA was 702 kb whereas the N50 for a European female DNA was 358 kb. Therefore, the strategy described here is suitable for haplotyping of a set of targeted regions as well as of the entire genome. PMID:23509297

Kaper, Fiona; Swamy, Sajani; Klotzle, Brandy; Munchel, Sarah; Cottrell, Joseph; Bibikova, Marina; Chuang, Han-Yu; Kruglyak, Semyon; Ronaghi, Mostafa; Eberle, Michael A; Fan, Jian-Bing

2013-04-01
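
The N50 values quoted for the haplotype blocks (702 kb and 358 kb) summarize a length distribution: N50 is the block length at which blocks of that length or longer account for at least half of the total phased length. A minimal Python sketch of the statistic follows. The function name and the toy block lengths are illustrative assumptions and do not come from the paper's data.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that blocks of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("need at least one block length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical haplotype block lengths in kb.
blocks_kb = [2, 3, 4, 5, 6]
print(n50(blocks_kb))
```

Because the statistic is weighted by length, a few long blocks raise N50 far more than many short blocks lower it, which is why it is commonly reported alongside the fraction of SNPs phased.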

394

The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics  

Microsoft Academic Search

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome

Lincoln D. Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R. Brent; Nansheng Chen; Asif Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H. A. Fitch; Lucinda A. Fulton; Robert E. Fulton; Sam Griffiths-Jones; Todd W. Harris; LaDeana W. Hillier; Ravi Kamath; Patricia E. Kuwabara; Elaine R. Mardis; Marco A. Marra; Tracie L. Miner; Patrick Minx; James C. Mullikin; Robert W. Plumb; Jane Rogers; Jacqueline E. Schein; Marc Sohrmann; John Spieth; Jason E. Stajich; Chaochun Wei; David Willey; Richard K. Wilson; Richard Durbin; Robert H. Waterston

2003-01-01

395

The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics  

Microsoft Academic Search

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome

Lincoln D Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R Brent; Nansheng Chen; Asif Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H. A Fitch; Lucinda A Fulton; Robert E Fulton; Sam Griffiths-Jones; Todd W Harris; LaDeana W Hillier; Ravi Kamath; Patricia E Kuwabara; Elaine R Mardis; Marco A Marra; Tracie L Miner; Patrick Minx; James C Mullikin; Robert W Plumb; Jane Rogers; Jacqueline E Schein; Marc Sohrmann; John Spieth; Jason E Stajich; Chaochun Wei; David Willey; Richard K Wilson; Richard Durbin; Robert H Waterston

2003-01-01

396

The power of EST sequence data: Relation to Acyrthosiphon pisum genome annotation and functional genomics initiatives  

Technology Transfer Automated Retrieval System (TEKTRAN)

Genes important to aphid biology, survival and reproduction were successfully identified by use of a genomics approach. We created and described the sequencing, compilation, and annotation of the approximately 525 Mb nuclear genome of the pea aphid, Acyrthosiphon pisum, which represents an important ...

397

The ClinSeq Project: Piloting large-scale genome sequencing for research in genomic medicine  

Microsoft Academic Search

ClinSeq is a pilot project to investigate the use of whole-genome sequencing as a tool for clinical research. By piloting the acquisition of large amounts of DNA sequence data from individual human subjects, we are fostering the development of hypothesis-generating approaches for performing research in genomic medicine, including the exploration of issues related to the genetic architecture of

Leslie G. Biesecker; James C. Mullikin; Flavia M. Facio; Clesson Turner; Praveen F. Cherukuri; Robert W. Blakesley; Gerard G. Bouffard; Peter S. Chines; Pedro Cruz; Nancy F. Hansen; Jamie K. Teer; Baishali Maskeri; Alice C. Young; Teri A. Manolio; Alexander F. Wilson; Toren Finkel; Paul Hwang; Andrew Arai; Alan T. Remaley; Vandana Sachdev; Robert Shamburek; Richard O. Cannon; Eric D. Green

2009-01-01

398

A draft sequence of the Neandertal Genome  

E-print Network

Closest relatives of Homo sapiens · Occupied Europe & Asia · Appeared in fossil record 400,000 years ago. Where were the Neandertals? How would you tell (if there was gene flow)? · Look for parts of the genome

Borenstein, Elhanan

399

The Illumina-solexa sequencing protocol for bacterial genomes.  

PubMed

Based on reversible dye-terminator technology, the Illumina-solexa sequencing platform enables rapid sequencing-by-synthesis (SBS) of large DNA stretches spanning entire genomes, with the latest instruments capable of producing hundreds of gigabases of data in a single sequencing run. Illumina's NGS instruments powerfully combine the flexibility of single reads with short- and long-insert paired-end reads, and enable a wide range of DNA sequencing applications. Here, we describe paired-end library preparation with average insert sizes of 470 bp, 2 kbp, and 6 kbp, together with the DNA cluster generation and sequencing procedure for the E. coli O104:H4 genome on the Illumina HiSeq 2000 platform. PMID:25343860

Hu, Zhenfei; Cheng, Lei; Wang, Hai

2015-01-01

400

Complete mitochondrial genome sequence of Cheirotonus jansoni (Coleoptera: Scarabaeidae).  

PubMed

We sequenced the complete mitochondrial genome (mitogenome) of Cheirotonus jansoni (Coleoptera: Scarabaeidae), an endangered insect species from Southeast Asia. This long legged scarab is widely collected and reared for sale, although it is rare and protected in the wild. The circular genome is 17,249 bp long and contains a typical gene complement: 13 protein-coding genes, 2 rRNA genes, 22 putative tRNA genes, and a non-coding AT-rich region. Its gene order and arrangement are identical to the common type found in most insect mitogenomes. As with all other sequenced coleopteran species, a 5-bp long TAGTA motif was detected in the intergenic space sequence located between trnS(UCN) and nad1. The atypical cox1 start codon is AAC, and the putative initiation codon for the atp8 gene appears to be GTC, instead of the frequently found ATN. By sequence comparison, the 2590-bp long non-coding AT-rich region is the second longest among the coleopterans, with two tandem repeat regions: one is 10 copies of an 88-bp sequence and the other is 2 copies of a 153-bp sequence. Additionally, the A+T content (64%) of the 13 protein-coding genes is the lowest among all sequenced coleopteran species. This newly sequenced genome aids in our understanding of the comparative biology of the mitogenomes of coleopteran species and supplies important data for the conservation of this species. PMID:24634126

Shao, L L; Huang, D Y; Sun, X Y; Hao, J S; Cheng, C H; Zhang, W; Yang, Q

2014-01-01
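
Two of the figures in the abstract above are simple to reproduce: the A+T content of a sequence, and the combined span of the two tandem repeat regions. The Python sketch below is a minimal illustration. The function name and the toy sequence are assumptions for demonstration only, and the repeat arithmetic assumes perfect, uninterrupted copies, which the abstract does not state.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T) in seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at = sum(1 for base in counted if base in "AT")
    return 100.0 * at / len(counted)


# Hypothetical toy sequence, not from the C. jansoni mitogenome.
print(round(at_content("AATTGC"), 1))

# Span of the two tandem repeat regions, assuming perfect copies:
# 10 copies of an 88-bp unit plus 2 copies of a 153-bp unit.
repeat_bp = 10 * 88 + 2 * 153
print(repeat_bp, "of 2590 bp in the AT-rich region")
```

Under that assumption the repeats would cover 1186 bp, a little under half of the 2590-bp non-coding region.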

401

Strong nucleosomes of mouse genome including recovered centromeric sequences.  

PubMed

Recently discovered strong nucleosomes (SNs) characterized by visibly periodical DNA sequences have been found to concentrate in centromeres of Arabidopsis thaliana and in transient meiotic centromeres of Caenorhabditis elegans. To find out whether such affiliation of SNs to centromeres is a more general phenomenon, we studied SNs of Mus musculus. The publicly available genome sequences of mouse, as well as of practically all other eukaryotes, do not include the centromere regions, which are difficult to assemble because of a large amount of repeat sequences in the centromeres and pericentromeric regions. We recovered those missing sequences using the data from MNase-seq experiments in mouse embryonic stem cells, where the sequence of DNA inside nucleosomes, including missing regions, was determined by 100-bp paired-end sequencing. Those nucleosome sequences that do not match the published genome sequence would largely belong to the centromeres. By evaluating SN densities in centromeres and in non-centromeric regions, we conclude that mouse SNs concentrate in the centromeres of telocentric mouse chromosomes, with a ~3.9-fold excess compared to their density in the rest of the genome. The remaining non-centromeric SNs are harbored mainly by introns and intergenic regions, by retrotransposons in particular. The centromeric involvement of the SNs opens new horizons for the chromosome and centromere structure studies. PMID:24998943

Salih, Bilal F; Teif, Vladimir B; Tripathi, Vijay; Trifonov, Edward N

2015-06-01

402

Whole genome sequencing in clinical and public health microbiology.  

PubMed

Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

2015-04-01

403

Whole genome sequencing in clinical and public health microbiology  

PubMed Central

Summary: Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.

2015-01-01

404

Genome rearrangements caused by interstitial telomeric sequences in yeast  

PubMed Central

Interstitial telomeric sequences (ITSs) are present in many eukaryotic genomes and are linked to genome instabilities and disease in humans. The mechanisms responsible for ITS-mediated genome instability are not understood in molecular detail. Here, we use a model Saccharomyces cerevisiae system to characterize genome instability mediated by yeast telomeric (Ytel) repeats embedded within an intron of a reporter gene inside a yeast chromosome. We observed a very high rate of small insertions and deletions within the repeats. We also found frequent gross chromosome rearrangements, including deletions, duplications, inversions, translocations, and formation of acentric minichromosomes. The inversions are a unique class of chromosome rearrangement involving an interaction between the ITS and the true telomere of the chromosome. Because we previously found that Ytel repeats cause strong replication fork stalling, we suggest that formation of double-stranded DNA breaks within the Ytel sequences might be responsible for these gross chromosome rearrangements. PMID:24191060

Aksenova, Anna Y.; Greenwell, Patricia W.; Dominska, Margaret; Shishkin, Alexander A.; Kim, Jane C.; Petes, Thomas D.; Mirkin, Sergei M.

2013-01-01

405

Complete Genome Sequence of Klebsiella pneumoniae Phage JD001  

PubMed Central

Klebsiella pneumoniae is a member of the family Enterobacteriaceae, opportunistic pathogens that are among the eight most prevalent infectious agents in hospitals. The emergence of multidrug-resistant strains of K. pneumoniae has become a public health problem globally. To develop an effective antimicrobial agent, we isolated a bacteriophage, named JD001, from seawater and sequenced its genome. Comparative genome analysis of phage JD001 with other K. pneumoniae bacteriophages revealed that phage JD001 has little similarity to previously published K. pneumoniae phages KP15, KP32, KP34, and phiKO2. Here we announce the complete genome sequence of JD001 and report major findings from the genomic analysis. PMID:23166250

Cui, Zelin; Shen, Wenbin; Wang, Zheng; Zhang, Haotian; Me, Rao; Wang, Yanchun; Zeng, Lingbin; Zhu, Yongzhang; Qin, Jinhong

2012-01-01

406

Complete genome sequence of Alicyclobacillus acidocaldarius type strain (104-IAT)  

SciTech Connect

Alicyclobacillus acidocaldarius (Darland and Brock 1971) is the type species of the larger of the two genera in the bacillal family Alicyclobacillaceae. A. acidocaldarius is a free-living and non-pathogenic organism, but may also be associated with food and fruit spoilage. Due to its acidophilic nature, several enzymes from this species have long been subjected to detailed molecular and biochemical studies. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of the family Alicyclobacillaceae. The 3,205,686 bp long genome (chromosome and three plasmids) with its 3,153 protein-coding and 82 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Chen, Feng [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Chang, Yun-Juan [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Meincke, Linda [Los Alamos National Laboratory (LANL); Sims, David [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Brettin, Tom [Los Alamos National Laboratory (LANL); Detter, J C [U.S. Department of Energy, Joint Genome Institute; Wahrenburg, Claudia [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Pukall, Rudiger [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

2010-01-01

407

Draft genome sequence of the rubber tree Hevea brasiliensis  

PubMed Central

Background: Hevea brasiliensis, a member of the Euphorbiaceae family, is the major commercial source of natural rubber (NR). NR is a latex polymer with high elasticity, flexibility, and resilience that has played a critical role in the world economy since 1876. Results: Here, we report the draft genome sequence of H. brasiliensis. The assembly spans ~1.1 Gb of the estimated 2.15 Gb haploid genome. Overall, ~78% of the genome was identified as repetitive DNA. Gene prediction shows 68,955 gene models, of which 12.7% are unique to Hevea. Most of the key genes associated with rubber biosynthesis, rubberwood formation, disease resistance, and allergenicity have been identified. Conclusions: The knowledge gained from this genome sequence will aid in the future development of high-yielding clones to keep up with the ever increasing need for natural rubber. PMID:23375136

2013-01-01

408

Complete genome sequence of Arcobacter nitrofigilis type strain (CIT)  

SciTech Connect

Arcobacter nitrofigilis (McClung et al. 1983) Vandamme et al. 1991 is the type species of the genus Arcobacter in the epsilonproteobacterial family Campylobacteraceae. The species was first described in 1983 as Campylobacter nitrofigilis [1] after its detection as a free-living, nitrogen-fixing Campylobacter species associated with Spartina alterniflora Loisel. roots [2]. It is of phylogenetic interest because of its lifestyle as a symbiotic organism in a marine environment in contrast to many other Arcobacter species which are associated with warm-blooded animals and tend to be pathogenic. Here we describe the features of this organism, together with the complete genome sequence, and annotation. This is the first complete genome sequence of a type strain of the genus Arcobacter. The 3,192,235 bp genome with its 3,154 protein-coding and 70 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.

Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Chertkov, Olga [Los Alamos National Laboratory (LANL); Bruce, David [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Jeffries, Cynthia [Oak Ridge National Laboratory (ORNL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute

2010-01-01

409

Complete genome sequence of Arthrobacter sp. strain FB24  

SciTech Connect

Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program.

Nakatsu, C. H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, T.; Han, Cliff F.; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

2013-09-30

410

Complete genome sequence of Arthrobacter sp. strain FB24  

PubMed Central

Arthrobacter sp. strain FB24 is a species in the genus Arthrobacter Conn and Dimmick 1947, in the family Micrococcaceae and class Actinobacteria. A number of Arthrobacter genome sequences have been completed because of their important role in soil, especially bioremediation. This isolate is of special interest because it is tolerant to multiple metals and it is extremely resistant to elevated concentrations of chromate. The genome consists of a 4,698,945 bp circular chromosome and three plasmids (96,488, 115,507, and 159,536 bp, a total of 5,070,478 bp), coding 4,536 proteins of which 1,257 are without known function. This genome was sequenced as part of the DOE Joint Genome Institute Program. PMID:24501649

Nakatsu, Cindy H.; Barabote, Ravi; Thompson, Sue; Bruce, David; Detter, Chris; Brettin, Thomas; Han, Cliff; Beasley, Federico; Chen, Weimin; Konopka, Allan; Xie, Gary

2013-01-01

411

959 Nematode Genomes: a semantic wiki for coordinating sequencing projects.  

PubMed

Genome sequencing has been democratized by second-generation technologies, and even small labs can sequence metazoan genomes now. In this article, we describe '959 Nematode Genomes'--a community-curated semantic wiki to coordinate the sequencing efforts of individual labs to collectively sequence 959 genomes spanning the phylum Nematoda. The main goal of the wiki is to track sequencing projects that have been proposed, are in progress, or have been completed. Wiki pages for species and strains are linked to pages for people and organizations, using machine- and human-readable metadata that users can query to see the status of their favourite worm. The site is based on the same platform that runs Wikipedia, with semantic extensions that allow the underlying taxonomy and data storage models to be maintained and updated with ease compared with a conventional database-driven web site. The wiki also provides a way to track and share preliminary data if those data are not polished enough to be submitted to the official sequence repositories. In just over a year, this wiki has already fostered new international collaborations and attracted newcomers to the enthusiastic community of nematode genomicists. www.nematodegenomes.org. PMID:22058131

Kumar, Sujai; Schiffer, Philipp H; Blaxter, Mark

2012-01-01

412

AACR 2014: NCI/NIH-Sponsored Session: Large-Scale Genomics Data for the Research Community through the NCI Center for Cancer Genomics  

Cancer.gov

The NCI’s Center for Cancer Genomics (CCG), which includes the Office of Cancer Genomics and The Cancer Genome Atlas Program Office, provides the research community access to large-scale molecular characterization data, which is largely sequence-based. CCG programs aim to improve patient outcome through identification of valid molecular targets and associated molecular markers (prognostic or diagnostic), in and across diseases investigated, which should ultimately lead to the rapid development of novel, more effective therapies.

413

Genome Sequence of the Cat Pathogen, Chlamydophila felis  

Microsoft Academic Search

Chlamydophila felis (Chlamydia psittaci feline pneumonitis agent) is a worldwide spread pathogen for pneumonia and conjunctivitis in cats. Herein, we determined the entire genomic DNA sequence of the Japanese C. felis strain Fe/C-56 to understand the mechanism of diseases caused by this pathogen. The C. felis genome is composed of a circular 1,166,239 bp chromosome encoding 1005 protein-coding

Yoshinao Azuma; Hideki Hirakawa; Atsushi Yamashita; Yan Cai; Mohd Akhlakur Rahman; Harumi Suzuki; Shigeki Mitaku; Hidehiro Toh; Susumu Goto; Tomoyuki Murakami; Kazuro Sugi; Hideo Hayashi; Hideto Fukushi; Masahira Hattori; Satoru Kuhara; Mutsunori Shirai

2006-01-01

414

Genome Sequence of a Baculovirus Pathogenic for Culex nigripalpus  

Microsoft Academic Search

In this report we describe the complete genome sequence of a nucleopolyhedrovirus that infects larval stages of the mosquito Culex nigripalpus (CuniNPV). The CuniNPV genome is a circular double-stranded DNA molecule of 108,252 bp and is predicted to contain 109 genes. Although 36 of these genes show homology to genes from other baculoviruses, their orientation and order exhibit little conservation

C. L. Afonso; E. R. Tulman; Z. Lu; C. A. Balinsky; B. A. Moser; J. J. Becnel; D. L. Rock; G. F. Kutish

2001-01-01

415

Genome Sequence of Corynebacterium ulcerans Strain FRC11  

PubMed Central

Here, we present the genome sequence of Corynebacterium ulcerans strain FRC11. The genome includes one circular chromosome of 2,442,826 bp (53.35% G+C content), and 2,210 genes were predicted, 2,146 of which are putative protein-coding genes, with 12 rRNAs and 51 tRNAs; 1 pseudogene was also identified. PMID:25767241

Benevides, Leandro de Jesus; Viana, Marcus Vinicius Canário; Mariano, Diego César Batista; Rocha, Flávia de Souza; Bagano, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Dorella, Fernanda Alves; Leal, Carlos Augusto Gomes; Carvalho, Alex Fiorini; Soares, Siomar de Castro; Carneiro, Adriana; Ramos, Rommel; Badell-Ocando, Edgar; Guiso, Nicole; Silva, Artur; Figueiredo, Henrique; Guimarães, Luis Carlos

2015-01-01
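
The feature counts in the Corynebacterium ulcerans FRC11 record above can be checked with simple arithmetic: the protein-coding genes, rRNAs, tRNAs, and the pseudogene should add up to the 2,210 predicted genes. A minimal sketch of that consistency check follows; the numbers come from the abstract, while the dictionary layout and the `tally` helper are assumptions made for illustration.

```python
# Feature counts as stated in the abstract for strain FRC11.
predicted_total = 2210
features = {
    "protein_coding": 2146,
    "rRNA": 12,
    "tRNA": 51,
    "pseudogene": 1,
}


def tally(counts: dict) -> int:
    """Sum the per-category feature counts (hypothetical helper)."""
    return sum(counts.values())


# True when the categories account for every predicted gene.
print(tally(features) == predicted_total)
```

The same pattern may be useful for spotting transcription errors in other genome announcements that list per-category counts alongside a total.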

416

Genome Sequence of Pseudomonas azelaica Strain Aramco J  

PubMed Central

We report here the draft genome sequence of Pseudomonas azelaica strain Aramco J (7.3 Mbp; GC content, 61.9%), one of the few bacteria that can completely mineralize different hydroxybiphenyls, e.g., 2-hydroxybiphenyl, 2,2'-dihydroxybiphenyl, and 3-hydroxybiphenyl. The findings obtained from its genome annotation suggest that this strain becomes a useful biocatalyst for aromatic bioconversions. PMID:25744991

El-Said Mohamed, Magdy; García, José L.; Martínez, Igor; del Cerro, Carlos; Nogales, Juan

2015-01-01

417

Genome Sequences of Equid Herpesviruses 2 and 5  

PubMed Central

We resequenced the genome of equid herpesvirus 2 (EHV2) strain 86/67 and sequenced the genomes of EHV2 strain G9/92 and equid herpesvirus 5 (EHV5) strain 2-141/67. The most prominent genetic differences are the dissimilar locations of the interleukin-10 (IL-10)-like genes and the presence of an OX-2-like gene in EHV5 only. PMID:25767243

Wilkie, Gavin S.; Kerr, Karen; Stewart, James P.; Studdert, Michael J.

2015-01-01

418

The genome sequence of the plant pathogen Xylella fastidiosa  

Microsoft Academic Search

Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis—a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and two plasmids of 51,158 bp and 1,285 bp. We can assign

A. J. G. Simpson; F. C. Reinach; P. Arruda; F. A. Abreu; M. Acencio; R. Alvarenga; L. M. C. Alves; J. E. Araya; G. S. Baia; C. S. Baptista; M. H. Barros; E. D. Bonaccorsi; S. Bordin; J. M. Bové; M. R. S. Briones; A. A. Camargo; L. E. A. Camargo; D. M. Carraro; H. Carrer; N. B. Colauto; C. Colombo; F. F. Costa; M. C. R. Costa; C. M. Costa-Neto; L. L. Coutinho; M. Cristofani; E. Dias-Neto; C. Docena; H. El-Dorry; A. P. Facincani; A. J. S. Ferreira; V. C. A. Ferreira; J. A. Ferro; J. S. Fraga; S. C. França; M. C. Franco; L. R. Furlan; M. Garnier; G. H. Goldman; M. H. S. Goldman; S. L. Gomes; A. Gruber; P. L. Ho; J. D. Hoheisel; M. L. Junqueira; E. L. Kemper; J. P. Kitajima; E. E. Kuramae; F. Laigret; M. R. Lambais; L. C. C. Leite; E. G. M. Lemos; M. V. F. Lemos; S. A. Lopes; C. R. Lopes; J. A. Machado; M. A. Machado; A. M. B. N. Madeira; H. M. F. Madeira; C. L. Marino; M. V. Marques; E. A. L. Martins; E. M. F. Martins; A. Y. Matsukuma; C. F. M. Menck; E. C. Miracca; C. Y. Miyaki; C. B. Monteiro-Vitorello; D. H. Moon; M. A. Nagai; A. L. T. O. Nascimento; L. E. S. Netto; A. Nhani; F. G. Nobrega; L. R. Nunes; M. A. Oliveira; M. C. de Oliveira; R. C. de Oliveira; D. A. Palmieri; B. R. Peixoto; G. A. G. Pereira; H. A. Pereira; J. B. Pesquero; R. B. Quaggio; P. G. Roberto; V. Rodrigues; A. J. de M. Rosa; V. E. de Rosa; R. G. de Sá; R. V. Santelli; H. E. Sawasaki; A. C. R. da Silva; F. R. da Silva; W. A. Silva; J. F. da Silveira; M. L. Z. Silvestri; W. J. Siqueira; A. A. de Souza; A. P. de Souza; M. F. Terenzi; D. Truffi; S. M. Tsai; M. H. Tsuhako; H. Vallada; M. A. Van Sluys; S. Verjovski-Almeida; A. L. Vettore; M. A. Zago; J. Meidanis; J. C. Setubal

2000-01-01

419

Genome Sequence of Pseudomonas azelaica Strain Aramco J.  

PubMed

We report here the draft genome sequence of Pseudomonas azelaica strain Aramco J (7.3 Mbp; GC content, 61.9%), one of the few bacteria that can completely mineralize different hydroxybiphenyls, e.g., 2-hydroxybiphenyl, 2,2'-dihydroxybiphenyl, and 3-hydroxybiphenyl. The findings obtained from its genome annotation suggest that this strain becomes a useful biocatalyst for aromatic bioconversions. PMID:25744991

El-Said Mohamed, Magdy; García, José L; Martínez, Igor; Del Cerro, Carlos; Nogales, Juan; Díaz, Eduardo

2015-01-01

420

Structure, sequence and expression of the hepatitis delta (δ) viral genome  

NASA Astrophysics Data System (ADS)

Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.

Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael

1986-10-01

421

The genome sequence of the filamentous fungus Neurospora crassa  

Microsoft Academic Search

Neurospora crassa is a central organism in the history of twentieth-century genetics, biochemistry and molecular biology. Here, we report a high-quality draft sequence of the N. crassa genome. The approximately 40-megabase genome encodes about 10,000 protein-coding genes-more than twice as many as in the fission yeast Schizosaccharomyces pombe and only about 25% fewer than in the fruitfly Drosophila melanogaster. Analysis

James E. Galagan; Sarah E. Calvo; Katherine A. Borkovich; Eric U. Selker; Nick D. Read; David Jaffe; William FitzHugh; Li-Jun Ma; Serge Smirnov; Seth Purcell; Bushra Rehman; Timothy Elkins; Reinhard Engels; Shunguang Wang; Cydney B. Nielsen; Jonathan Butler; Matthew Endrizzi; Dayong Qui; Peter Ianakiev; Deborah Bell-Pedersen; Mary Anne Nelson; Margaret Werner-Washburne; Claude P. Selitrennikoff; John A. Kinsey; Edward L. Braun; Alex Zelter; Ulrich Schulte; Gregory O. Kothe; Gregory Jedd; Werner Mewes; Chuck Staben; Edward Marcotte; David Greenberg; Alice Roy; Karen Foley; Jerome Naylor; Nicole Stange-Thomann; Robert Barrett; Sante Gnerre; Michael Kamal; Manolis Kamvysselis; Evan Mauceli; Cord Bielke; Stephen Rudd; Dmitrij Frishman; Svetlana Krystofova; Carolyn Rasmussen; Robert L. Metzenberg; David D. Perkins; Scott Kroken; Carlo Cogoni; Giuseppe Macino; David Catcheside; Weixi Li; Robert J. Pratt; Stephen A. Osmani; Colin P. C. DeSouza; Louise Glass; Marc J. Orbach; J. Andrew Berglund; Rodger Voelker; Oded Yarden; Michael Plamann; Stephan Seiler; Jay Dunlap; Alan Radford; Rodolfo Aramayo; Donald O. Natvig; Lisa A. Alex; Gertrud Mannhaupt; Daniel J. Ebbole; Michael Freitag; Ian Paulsen; Matthew S. Sachs; Eric S. Lander; Chad Nusbaum; Bruce Birren

2003-01-01

422

Genome Sequence of Corynebacterium ulcerans Strain FRC11.  

PubMed

Here, we present the genome sequence of Corynebacterium ulcerans strain FRC11. The genome includes one circular chromosome of 2,442,826 bp (53.35% G+C content), and 2,210 genes were predicted, 2,146 of which are putative protein-coding genes, with 12 rRNAs and 51 tRNAs; 1 pseudogene was also identified. PMID:25767241

Benevides, Leandro de Jesus; Viana, Marcus Vinicius Canário; Mariano, Diego César Batista; Rocha, Flávia de Souza; Bagano, Priscilla Carolinne; Folador, Edson Luiz; Pereira, Felipe Luiz; Dorella, Fernanda Alves; Leal, Carlos Augusto Gomes; Carvalho, Alex Fiorini; Soares, Siomar de Castro; Carneiro, Adriana; Ramos, Rommel; Badell-Ocando, Edgar; Guiso, Nicole; Silva, Artur; Figueiredo, Henrique; Azevedo, Vasco; Guimarães, Luis Carlos

2015-01-01

423

Contribution to Sequencing of the Deinococcus radiodurans Genome  

SciTech Connect

The stated goal of this project was to supply The Institute for Genomic Research (TIGR) with pure DNA from the bacterium Deinococcus radiodurans R1 for purposes of complete genomic sequencing by TIGR. We subsequently decided to expand this project to include a second goal; this second goal was the development of a NotI chromosomal map of D. radiodurans R1 using Pulsed Field Gel Electrophoresis (PFGE).

Minton, K.W.

1999-03-11

424

Complete Sequence of the Citrus Tristeza Virus RNA Genome  

Microsoft Academic Search

The sequence of the entire genome of citrus tristeza virus (CTV), Florida isolate T36, was completed. The 19,296-nt CTV genome encodes 12 open reading frames (ORFs) potentially coding for at least 17 protein products. The 5′-proximal ORF 1a starts at nucleotide 108 and encodes a large polyprotein with calculated MW of 349 kDa containing domains characteristic of (from 5′ to

A. V. Karasev; V. P. Boyko; S. Gowda; O. V. Nikolaeva; M. E. Hilf; E. V. Koonin; C. L. Niblett; K. Cline; D. J. Gumpf; R. F. Lee; S. M. Garnsey; D. J. Lewandowski; W. O. Dawson

1995-01-01

425

Mitochondrial genome sequence of the bluegill sunfish (Lepomis macrochirus).  

PubMed

The bluegill sunfish (Lepomis macrochirus) belongs to the genus Lepomis of the family Centrarchidae and is an economically important freshwater species in China. This study presents the complete mitochondrial genome of L. macrochirus, which is the first complete sequence from sunfish species. L. macrochirus mitochondrial DNA is 16,489 bp long, with the genome organization and gene order being identical to that of the typical vertebrate. PMID:22165836

Li, Sheng-Jie; Cai, Lei; Bai, Jun-Jie

2011-10-01

426

Complete genomic sequence of Pasteurella multocida, Pm70  

PubMed Central

We present here the complete genome sequence of a common avian clone of Pasteurella multocida, Pm70. The genome of Pm70 is a single circular chromosome 2,257,487 base pairs in length and contains 2,014 predicted coding regions, 6 ribosomal RNA operons, and 57 tRNAs. Genome-scale evolutionary analyses based on pairwise comparisons of 1,197 orthologous sequences between P. multocida, Haemophilus influenzae, and Escherichia coli suggest that P. multocida and H. influenzae diverged ≈270 million years ago and the γ subdivision of the proteobacteria radiated about 680 million years ago. Two previously undescribed open reading frames, accounting for ≈1% of the genome, encode large proteins with homology to the virulence-associated filamentous hemagglutinin of Bordetella pertussis. Consistent with the critical role of iron in the survival of many microbial pathogens, in silico and whole-genome microarray analyses identified more than 50 Pm70 genes with a potential role in iron acquisition and metabolism. Overall, the complete genomic sequence and preliminary functional analyses provide a foundation for future research into the mechanisms of pathogenesis and host specificity of this important multispecies pathogen. PMID:11248100

May, Barbara J.; Zhang, Qing; Li, Ling Ling; Paustian, Michael L.; Whittam, Thomas S.; Kapur, Vivek

2001-01-01

427

Low-pass sequencing for microbial comparative genomics  

PubMed Central

Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067

Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

2004-01-01

428

Draft genome sequence of the Tibetan antelope  

PubMed Central

The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation. PMID:23673643

Ge, Ri-Li; Cai, Qingle; Shen, Yong-Yi; San, A; Ma, Lan; Zhang, Yong; Yi, Xin; Chen, Yan; Yang, Lingfeng; Huang, Ying; He, Rongjun; Hui, Yuanyuan; Hao, Meirong; Li, Yue; Wang, Bo; Ou, Xiaohua; Xu, Jiaohui; Zhang, Yongfen; Wu, Kui; Geng, Chunyu; Zhou, Weiping; Zhou, Taicheng; Irwin, David M.; Yang, Yingzhong; Ying, Liu; Bao, Haihua; Kim, Jaebum; Larkin, Denis M.; Ma, Jian; Lewin, Harris A.; Xing, Jinchuan; Platt, Roy N.; Ray, David A.; Auvil, Loretta; Capitanu, Boris; Zhang, Xiufeng; Zhang, Guojie; Murphy, Robert W.; Wang, Jun; Zhang, Ya-Ping; Wang, Jian

2013-01-01

429

Draft genome sequence of the Tibetan antelope.  

PubMed

The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation. PMID:23673643

Ge, Ri-Li; Cai, Qingle; Shen, Yong-Yi; San, A; Ma, Lan; Zhang, Yong; Yi, Xin; Chen, Yan; Yang, Lingfeng; Huang, Ying; He, Rongjun; Hui, Yuanyuan; Hao, Meirong; Li, Yue; Wang, Bo; Ou, Xiaohua; Xu, Jiaohui; Zhang, Yongfen; Wu, Kui; Geng, Chunyu; Zhou, Weiping; Zhou, Taicheng; Irwin, David M; Yang, Yingzhong; Ying, Liu; Bao, Haihua; Kim, Jaebum; Larkin, Denis M; Ma, Jian; Lewin, Harris A; Xing, Jinchuan; Platt, Roy N; Ray, David A; Auvil, Loretta; Capitanu, Boris; Zhang, Xiufeng; Zhang, Guojie; Murphy, Robert W; Wang, Jun; Zhang, Ya-Ping; Wang, Jian

2013-01-01

430

The minimum information about a genome sequence (MIGS) specification.  

PubMed

With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases. PMID:18464787

Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; DePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; San Gil, Ingio; Wilson, Gareth; Wipat, Anil

2008-05-01

431

The minimum information about a genome sequence (MIGS) specification  

PubMed Central

With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases. PMID:18464787

Field, Dawn; Garrity, George; Gray, Tanya; Morrison, Norman; Selengut, Jeremy; Sterk, Peter; Tatusova, Tatiana; Thomson, Nicholas; Allen, Michael J; Angiuoli, Samuel V; Ashburner, Michael; Axelrod, Nelson; Baldauf, Sandra; Ballard, Stuart; Boore, Jeffrey; Cochrane, Guy; Cole, James; Dawyndt, Peter; De Vos, Paul; dePamphilis, Claude; Edwards, Robert; Faruque, Nadeem; Feldman, Robert; Gilbert, Jack; Gilna, Paul; Glöckner, Frank Oliver; Goldstein, Philip; Guralnick, Robert; Haft, Dan; Hancock, David; Hermjakob, Henning; Hertz-Fowler, Christiane; Hugenholtz, Phil; Joint, Ian; Kagan, Leonid; Kane, Matthew; Kennedy, Jessie; Kowalchuk, George; Kottmann, Renzo; Kolker, Eugene; Kravitz, Saul; Kyrpides, Nikos; Leebens-Mack, Jim; Lewis, Suzanna E; Li, Kelvin; Lister, Allyson L; Lord, Phillip; Maltsev, Natalia; Markowitz, Victor; Martiny, Jennifer; Methe, Barbara; Mizrachi, Ilene; Moxon, Richard; Nelson, Karen; Parkhill, Julian; Proctor, Lita; White, Owen; Sansone, Susanna-Assunta; Spiers, Andrew; Stevens, Robert; Swift, Paul; Taylor, Chris; Tateno, Yoshio; Tett, Adrian; Turner, Sarah; Ussery, David; Vaughan, Bob; Ward, Naomi; Whetzel, Trish; Gil, Ingio San; Wilson, Gareth; Wipat, Anil

2008-01-01

432

Complete genome sequence of Haliscomenobacter hydrossis type strain (OT)  

SciTech Connect

Haliscomenobacter hydrossis van Veen et al. 1973 is the type species of the genus Haliscomenobacter, which belongs to order 'Sphingobacteriales'. The species is of interest because of its isolated phylogenetic location in the tree of life, especially the so far genomically uncharted part of it, and because the organism grows in a thin, hardly visible hyaline sheath. Members of the species were isolated from fresh water of lakes and from ditch water. The genome of H. hydrossis is the first completed genome sequence reported from a member of the family 'Saprospiraceae'. The 8,771,651 bp long genome with its three plasmids of 92 kbp, 144 kbp and 164 kbp length contains 6,848 protein-coding and 60 RNA genes, and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Daligault, Hajnalka E. [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Zeytun, Ahmet [Los Alamos National Laboratory (LANL); Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Glavina Del Rio, Tijana [U.S. Department of Energy, Joint Genome Institute; Tice, Hope [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Verbarg, Susanne [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. 
Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute

2011-01-01

433

The Bacillus subtilis genome sequence: the molecular blueprint of a soil bacterium  

Microsoft Academic Search

The rate at which entire microbial genomes are being sequenced has accelerated rapidly over the past two years, promising to revolutionise our understanding of microbial molecular biology and genetics. The Bacillus subtilis genome sequence is the first complete genome of a free-living soil and rhizosphere bacterium. Data derived from the genome sequence and the systematic functional analysis programme, together with

Anil Wipat; Colin R Harwood

1999-01-01

434

The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)  

E-print Network

The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade, and shotgun sequencing to 8× depth coverage to obtain the complete chloroplast genome sequence. The genome

Olmstead, Richard

435

Analysis of the genome sequence of the flowering plant Arabidopsis thaliana  

E-print Network

Analysis of the genome sequence of the flowering plant Arabidopsis thaliana The Arabidopsis Genome multicellular eukaryotes. This is the first complete genome sequence of a plant and provides the foundations, but these genome sequences represent a limited survey of multicellular organisms. Flowering plants have unique

Dangl, Jeff

436

Closed Genome Sequence of Noninvasive Streptococcus pyogenes M/emm3 Strain STAB902  

E-print Network

Closed Genome Sequence of Noninvasive Streptococcus pyogenes M/emm3 Strain STAB902 Nicolas Soriano Rennes 1, Rennes, Franced We report a closed genome sequence of group A Streptococcus genotype emm3 (GAS. Closed genome sequence of noninvasive Streptococcus pyogenes M/emm3 strain STAB902. Genome Announc. 2

Paris-Sud XI, Université de

437

Supplementary Information Genome Sequence and Assembly  

E-print Network

distachyon (Brachypodium) Bd21 plants derived by single-seed descent for 8 generations to reduce potential genome coverage. 1 BAC libraries DH and DB are described in 2-4 . Details of BAC libraries BD_CBa and BD 1 (HinDIII) 103,216 30,704 0.05 BAC DB 1 (BamH1) 108,177 36,388 0.04 BAC BD_CBa 2 (EcoR1) 124,935 25

Green, Pamela

438

Rosaceaous Genome Sequencing: Perspectives and Progress  

Microsoft Academic Search

The long-term goal of plant genomics is to identify, isolate and determine the function of plant genes that are associated with both vegetative and reproductive phenotypes. Most phenotypes require the coordinated activity and regulatory control of suites of genes over time and in precise positions within the plant. Until recently, the idea of establishing a comprehensive approach to isolate and

Bryon Sosinski; Vladimir Shulaev; Amit Dhingra; Ananth Kalyanaraman; Roger Bumgarner; Daniel Rokhsar; Ignazio Verde; Riccardo Velasco; Albert G. Abbott

439

Complete genome sequence of Pyrolobus fumarii type strain (1AT)  

SciTech Connect

Pyrolobus fumarii Blöchl et al. 1997 is the type species of the genus Pyrolobus, which belongs to the crenarchaeal family Pyrodictiaceae. The species is a facultatively microaerophilic non-motile crenarchaeon. It is of interest because of its isolated phylogenetic location in the tree of life and because it is a hyperthermophilic chemolithoautotroph known as the primary producer of organic matter at deep-sea hydrothermal vents. P. fumarii exhibits currently the highest optimal growth temperature of all life forms on earth (106 °C). This is the first completed genome sequence of a member of the genus Pyrolobus to be published and only the second genome sequence from a member of the family Pyrodictiaceae. Although Diversa Corporation announced the completion of sequencing of the P. fumarii genome on September 25, 2001, this sequence was never released to the public. The 1,843,267 bp long genome with its 1,986 protein-coding and 52 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. 
Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Huber, Harald [Universitat Regensburg, Regensburg, Germany; Yasawong, Montri [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Spring, Stefan [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Abt, Birte [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Sikorski, Johannes [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Wirth, Reinhard [Universitat Regensburg, Regensburg, Germany; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute

2011-01-01

440

Complete genome sequence of Riemerella anatipestifer type strain (ATCC 11845).  

PubMed

Riemerella anatipestifer (Hendrickson and Hilbert 1932) Segers et al. 1993 is the type species of the genus Riemerella, which belongs to the family Flavobacteriaceae. The species is of interest because of the position of the genus in the phylogenetic tree and because of its role as a pathogen of commercially important avian species worldwide. This is the first completed genome sequence of a member of the genus Riemerella. The 2,155,121 bp long genome with its 2,001 protein-coding and 51 RNA genes consists of one circular chromosome and is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:21677851

Mavromatis, Konstantinos; Lu, Megan; Misra, Monica; Lapidus, Alla; Nolan, Matt; Lucas, Susan; Hammon, Nancy; Deshpande, Shweta; Cheng, Jan-Fang; Tapia, Roxane; Han, Cliff; Goodwin, Lynne; Pitluck, Sam; Liolios, Konstantinos; Pagani, Ioanna; Ivanova, Natalia; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D; Detter, John C; Brambilla, Evelyne-Marie; Rohde, Manfred; Göker, Markus; Gronow, Sabine; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Klenk, Hans-Peter; Kyrpides, Nikos C

2011-04-29

441

Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing  

PubMed Central

Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. 
This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models. PMID:21542930

2011-01-01
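
One way to see why a 0.5× data set can still yield a nearly complete chloroplast genome is a simple Poisson coverage model. This is a rough sketch and not the authors' method: it assumes reads fall uniformly at random, and the chloroplast copy number used below is a hypothetical value chosen only for illustration.

```python
import math


def expected_fraction_covered(depth: float) -> float:
    """Poisson estimate of the fraction of bases covered by at least one read."""
    return 1.0 - math.exp(-depth)


def effective_depth(nuclear_depth: float, copies_per_nuclear_genome: float) -> float:
    """Depth seen by a sequence present in many copies per nuclear genome copy."""
    return nuclear_depth * copies_per_nuclear_genome


nuclear_depth = 0.5          # the 0.5x figure quoted in the abstract
chloroplast_copies = 100.0   # hypothetical copy number, for illustration only

nuclear_covered = expected_fraction_covered(nuclear_depth)
plastid_depth = effective_depth(nuclear_depth, chloroplast_copies)
plastid_covered = expected_fraction_covered(plastid_depth)

print(f"single-copy nuclear sequence covered: {nuclear_covered:.1%}")
print(f"chloroplast depth under the assumed copy number: {plastid_depth:.0f}x")
print(f"chloroplast sequence covered: {plastid_covered:.1%}")
```

Under these assumptions roughly two fifths of single-copy sequence is touched by a read, while high-copy sequences such as organelle genomes and rDNA are covered many times over. That fits the abstract's distinction between characterizing the high copy fraction and exploring the low copy fraction.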

442

Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome  

Microsoft Academic Search

Background It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. Results We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D.

Casey M Bergman; Barret D Pfeiffer; Diego E Rincón-Limas; Roger A Hoskins; Andreas Gnirke; Chris J Mungall; Adrienne M Wang; Brent Kronmiller; Joanne Pacleb; Soo Park; Mark Stapleton; Kenneth Wan; Reed A George; Pieter J de Jong; Juan Botas; Gerald M Rubin; Susan E Celniker

2002-01-01

443

The complete mitochondrial genome sequence of the budgerigar, Melopsittacus undulatus.  

PubMed

Here, we describe the budgie's mitochondrial genome sequence, a resource that can facilitate this parrot's use as a model organism as well as the determination of its phylogenetic relatedness to other parrots/Psittaciformes. The estimated total length of the sequence was 18,193 bp. In addition to the 13 protein and tRNA and rRNA coding regions, the sequence also includes a duplicated hypervariable region, a feature unique to only a few birds. The two hypervariable regions shared a sequence identity of about 86%. PMID:24660934

Guan, Xiaojing; Xu, Jun; Smith, Edward J

2014-03-24

444

LLNL Genomic Assessment: Viral and Bacterial Sequencing Needs for TMTI, Task 1.4.2 Report  

SciTech Connect

Good progress has been made on both bacterial and viral sequencing by the TMTI centers. While access to appropriate samples is a limiting factor to throughput, excellent progress has been made with respect to getting agreements in place with key sources of relevant materials. Sharing of sequenced genomes funded by TMTI has been extremely limited to date. The April 2010 exercise should force a resolution to this, but additional managerial pressures may be needed to ensure that rapid sharing of TMTI-funded sequencing occurs, regardless of collaborator constraints concerning ultimate publication(s). Policies to permit TMTI-internal rapid sharing of sequenced genomes should be written into all TMTI agreements with collaborators now being negotiated. TMTI needs to establish a Web-based system for tracking samples destined for sequencing. This includes metadata on sample origins and contributor, information on sample shipment/receipt, prioritization by TMTI, assignment to one or more sequencing centers (including possible TMTI-sponsored sequencing at a contributor site), and status history of the sample sequencing effort. While this system could be a component of the AFRL system, it is not part of any current development effort. Policy and standardized procedures are needed to ensure appropriate verification of all TMTI samples prior to the investment in sequencing. PCR, arrays, and classical biochemical tests are examples of potential verification methods. Verification is needed to detect mislabeled, degraded, mixed or contaminated samples. Regular QC exercises are needed to ensure that the TMTI-funded centers are meeting all standards for producing quality genomic sequence data.

Slezak, T; Borucki, M; Lam, M; Lenhoff, R; Vitalis, E

2010-01-26

445

The complete genome sequence of a European goose reovirus strain.  

PubMed

The complete genomic sequence of a Hungarian goose orthoreovirus strain (D20/99) is reported in this study. The genome of D20/99 is 22,969 bp in length (range, 3958 bp for L1 to 1124 bp for S4) and encodes 11 putative proteins. Pairwise sequence comparisons and phylogenetic analyses indicated that D20/99 shares genetic signatures with some contemporary Chinese duck and goose reovirus strains, except for the ?A, ?NS and ?A protein coding genes, which represented independent genetic lineages. This study implies a greater genetic diversity among waterfowl-origin orthoreoviruses than hitherto recognized. PMID:24573219

Dandár, Eszter; Farkas, Szilvia L; Marton, Szilvia; Oldal, Miklós; Jakab, Ferenc; Mató, Tamás; Palya, Vilmos; Bányai, Krisztián

2014-08-01

446

Deep Whole-Genome Sequencing of 100 Southeast Asian Malays  

PubMed Central

Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies. PMID:23290073

Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

2013-01-01

447

Complete Genome Sequence of Methanobacterium thermoautotrophicum ΔH: Functional . . .  

E-print Network

the ORF-encoded polypeptides are related to sequences with unknown functions, and 496 (27%) have little or no homology to sequences in public databases. Comparisons with Eucarya-, Bacteria-, and Archaea-specific databases reveal that 1,013 of the putative gene products (54%) are most similar to polypeptide sequences described previously for other organisms in the domain Archaea. Comparisons with the Methanococcus jannaschii genome data underline the extensive divergence that has occurred between these two methanogens; only 352 (19%) of M. thermoautotrophicum ORFs encode sequences that are >50% identical to M. jannaschii polypeptides, and there is little conservation in the relative locations of orthologous genes. When the M. thermoautotrophicum ORFs are compared to sequences from only the eucaryal and bacterial domains, 786 (42%) are more similar to bacterial sequences and 241 (13%) are more similar to eucaryal sequences. The bacterial domain-like gene products include the ma

Smith; Douglas R. Smith; Lynn A. Doucette-stamm; Craig Deloughery; Hongmei Lee; Joann Dubois; Tyler Aldredge; Romina Bashirzadeh; Derron Blakely; Robin Cook; Bryan Pothier; Dayong Qiu; Rob Spadafora; Rita Vicaire; Ying Wang; Jamey Wierzbowski; Rene Gibson; Nilofer Jiwani; Anthony Caruso; David Bush; Hershel Safer; Donivan Patwell; Shashi Prabhakar; Steve Mcdougall; George Shimer; Anil Goyal; Shmuel Pietrokovski; George M. Church; Charles J. Daniels; Jen-i Mao; Phil Rice; Jörk Nölling; John N. Reeve

1997-01-01

448

The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus).  

PubMed

We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the thylacine's basal position in Dasyuromorphia, aided by mitochondrial genome sequence that we generated from the extant numbat (Myrmecobius fasciatus). Surprisingly, both of our thylacine sequences differ by 11%-15% from putative thylacine mitochondrial genes in GenBank, with one of our samples originating from a direct offspring of the previously sequenced individual. Our data sample each mitochondrial nucleotide an average of 50 times, thereby providing the first high-fidelity reference sequence for thylacine population genetics. Our two sequences differ in only five nucleotides out of 15,452, hinting at a very low genetic diversity shortly before extinction. Despite the samples' heavy contamination with bacterial and human DNA and their temperate storage history, we estimate that as much as one-third of the total DNA in each sample is from the thylacine. The microbial content of the two thylacine samples was subjected to metagenomic analysis, and showed striking differences between a wild-captured individual and a born-in-captivity one. This study therefore adds to the growing evidence that extensive sequencing of museum collections is both feasible and desirable, and can yield complete genomes. PMID:19139089

Miller, Webb; Drautz, Daniela I; Janecka, Jan E; Lesk, Arthur M; Ratan, Aakrosh; Tomsho, Lynn P; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R; Qi, Ji; Zhao, Fangqing; Gilbert, M Thomas P; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G P; Huson, Daniel H; Helgen, Kristofer M; Murphy, William J; Götherström, Anders; Schuster, Stephan C

2009-02-01
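
The divergence figures quoted in the abstract above can be checked with simple arithmetic. The sketch below is illustrative only: `percent_difference` is a hypothetical helper that counts mismatches between two pre-aligned, equal-length sequences, and the short example strings are made up rather than taken from the thylacine data.

```python
def percent_difference(a: str, b: str) -> float:
    """Percentage of positions that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


# Figures quoted in the abstract: five differing nucleotides out of 15,452.
between_specimens = 100.0 * 5 / 15452
print(f"difference between the two thylacine sequences: {between_specimens:.3f}%")
# prints "difference between the two thylacine sequences: 0.032%"

# Toy aligned sequences (made up) to show the helper itself.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAC"))  # one mismatch in ten positions
```

Set against the 11%-15% difference from the earlier GenBank entries, a figure of about 0.03% between the two new specimens makes the abstract's contrast concrete.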

449

The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)  

PubMed Central

We report the first two complete mitochondrial genome sequences of the thylacine (Thylacinus cynocephalus), or so-called Tasmanian tiger, extinct since 1936. The thylacine's phylogenetic position within australidelphian marsupials has long been debated, and here we provide strong support for the thylacine's basal position in Dasyuromorphia, aided by mitochondrial genome sequence that we generated from the extant numbat (Myrmecobius fasciatus). Surprisingly, both of our thylacine sequences differ by 11%–15% from putative thylacine mitochondrial genes in GenBank, with one of our samples originating from a direct offspring of the previously sequenced individual. Our data sample each mitochondrial nucleotide an average of 50 times, thereby providing the first high-fidelity reference sequence for thylacine population genetics. Our two sequences differ in only five nucleotides out of 15,452, hinting at a very low genetic diversity shortly before extinction. Despite the samples’ heavy contamination with bacterial and human DNA and their temperate storage history, we estimate that as much as one-third of the total DNA in each sample is from the thylacine. The microbial content of the two thylacine samples was subjected to metagenomic analysis, and showed striking differences between a wild-captured individual and a born-in-captivity one. This study therefore adds to the growing evidence that extensive sequencing of museum collections is both feasible and desirable, and can yield complete genomes. PMID:19139089

Miller, Webb; Drautz, Daniela I.; Janecka, Jan E.; Lesk, Arthur M.; Ratan, Aakrosh; Tomsho, Lynn P.; Packard, Mike; Zhang, Yeting; McClellan, Lindsay R.; Qi, Ji; Zhao, Fangqing; Gilbert, M. Thomas P.; Dalén, Love; Arsuaga, Juan Luis; Ericson, Per G.P.; Huson, Daniel H.; Helgen, Kristofer M.; Murphy, William J.; Götherström, Anders; Schuster, Stephan C.

2009-01-01
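
The diversity figure quoted in the thylacine abstract (five differing nucleotides out of 15,452) can be put on a percentage scale with a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names and the toy sequences are hypothetical, and the comparison assumes two sequences that are already aligned to the same length.

```python
def count_mismatches(seq_a: str, seq_b: str) -> int:
    """Count positions at which two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_difference(mismatches: int, length: int) -> float:
    """Express a mismatch count as a percentage of compared sites."""
    return 100.0 * mismatches / length


# Toy example (made-up sequences, for illustration only).
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
toy_mismatches = count_mismatches(toy_a, toy_b)

# Figures taken from the abstract: 5 differences over 15,452 nucleotides.
reported_pct = percent_difference(5, 15452)
print(f"{reported_pct:.3f}% of sites differ")
```

On this scale the two thylacine mitochondrial genomes differ at roughly three sites per ten thousand, far below the 11%-15% divergence from the earlier GenBank entries that the abstract reports.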

450

Genome sequencing highlights the dynamic early history of dogs.  

PubMed

To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we generated high-quality genome sequences from three gray wolves, one from each of the three putative centers of dog domestication, two basal dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. Analysis of these sequences supports a demographic model in which dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow. In dogs, the domestication bottleneck involved at least a 16-fold reduction in population size, a much more severe bottleneck than estimated previously. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was substantially larger than represented by modern wolf populations. We narrow the plausible range for the date of initial dog domestication to an interval spanning 11-16 thousand years ago, predating the rise of agriculture. In light of this finding, we expand upon previous work regarding the increase in copy number of the amylase gene (AMY2B) in dogs, which is believed to have aided digestion of starch in agricultural refuse. We find standing variation for amylase copy number variation in wolves and little or no copy number increase in the Dingo and Husky lineages. In conjunction with the estimated timing of dog origins, these results provide additional support to archaeological finds, suggesting the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers is more closely related to dogs, and, instead, the sampled wolves form a sister monophyletic clade. This result, in combination with dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses regarding dog origins is necessary. PMID:24453982

Freedman, Adam H; Gronau, Ilan; Schweizer, Rena M; Ortega-Del Vecchyo, Diego; Han, Eunjung; Silva, Pedro M; Galaverni, Marco; Fan, Zhenxin; Marx, Peter; Lorente-Galdos, Belen; Beale, Holly; Ramirez, Oscar; Hormozdiari, Farhad; Alkan, Can; Vilà, Carles; Squire, Kevin; Geffen, Eli; Kusak, Josip; Boyko, Adam R; Parker, Heidi G; Lee, Clarence; Tadigotla, Vasisht; Siepel, Adam; Bustamante, Carlos D; Harkins, Timothy T; Nelson, Stanley F; Ostrander, Elaine A; Marques-Bonet, Tomas; Wayne, Robert K; Novembre, John

2014-01-01

451

The Genomic HyperBrowser: inferential genomics at the sequence level  

PubMed Central

The immense increase in the generation of genomic scale data poses an unmet analytical challenge, due to a lack of established methodology with the required flexibility and power. We propose a first principled approach to statistical analysis of sequence-level genomic information. We provide a growing collection of generic biological investigations that query pairwise relations between tracks, represented as mathematical objects, along the genome. The Genomic HyperBrowser implements the approach and is available at http://hyperbrowser.uio.no. PMID:21182759

2010-01-01
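
As a rough illustration of what a "pairwise relation between tracks" can look like, the sketch below treats each track as a sorted list of half-open intervals on one chromosome and counts the bases covered by both. It is not the Genomic HyperBrowser's API or its statistical method; the `overlap_bp` function, the interval representation, and the example coordinates are all assumptions made for this example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def overlap_bp(track_a: List[Interval], track_b: List[Interval]) -> int:
    """Total bases covered by both tracks.

    Assumes each track is sorted by start and has no internal overlaps.
    """
    i = j = total = 0
    while i < len(track_a) and j < len(track_b):
        start = max(track_a[i][0], track_b[j][0])
        end = min(track_a[i][1], track_b[j][1])
        if start < end:
            total += end - start
        # Advance whichever interval ends first.
        if track_a[i][1] < track_b[j][1]:
            i += 1
        else:
            j += 1
    return total


# Hypothetical coordinates, for illustration only.
genes = [(100, 200), (400, 500)]
peaks = [(150, 250), (450, 460), (600, 700)]
shared = overlap_bp(genes, peaks)
print(shared)
```

A descriptive count like this is only a starting point; an inferential analysis of the kind the abstract describes would additionally compare the observed value against a null model.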

452

Sequence modelling and an extensible data model for genomic database  

SciTech Connect

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanism for modelling sequences and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need of a modelling framework for incorporating the sequence data model with other types of data constructs and opera