Science.gov

Sample records for multi-platform whole-genome microarray

  1. Microarray-based whole-genome hybridization as a tool for determining procaryotic species relatedness

    SciTech Connect

    Wu, L.; Liu, X.; Fields, M.W.; Thompson, D.K.; Bagwell, C.E.; Tiedje, J. M.; Hazen, T.C.; Zhou, J.

    2008-01-15

    The definition and delineation of microbial species are of great importance and challenge due to the extent of evolution and diversity. Whole-genome DNA-DNA hybridization is the cornerstone for defining procaryotic species relatedness, but obtaining pairwise DNA-DNA reassociation values for a comprehensive phylogenetic analysis of procaryotes is tedious and time consuming. A previously described microarray format containing whole-genomic DNA (the community genome array or CGA) was rigorously evaluated as a high-throughput alternative to the traditional DNA-DNA reassociation approach for delineating procaryotic species relationships. DNA similarities for multiple bacterial strains obtained with the CGA-based hybridization were comparable to those obtained with various traditional whole-genome hybridization methods (r=0.87, P<0.01). Significant linear relationships were also observed between the CGA-based genome similarities and those derived from small subunit (SSU) rRNA gene sequences (r=0.79, P<0.0001), gyrB sequences (r=0.95, P<0.0001) or REP- and BOX-PCR fingerprinting profiles (r=0.82, P<0.0001). The CGA hybridization-revealed species relationships in several representative genera, including Pseudomonas, Azoarcus and Shewanella, were largely congruent with previous classifications based on various conventional whole-genome DNA-DNA reassociation, SSU rRNA and/or gyrB analyses. These results suggest that CGA-based DNA-DNA hybridization could serve as a powerful, high-throughput format for determining species relatedness among microorganisms.

  2. Construction and evaluation of a Clostridium thermocellum ATCC 27405 whole-genome oligonucleotide microarray

    SciTech Connect

    Brown, Steven David; Raman, Babu; McKeown, Catherine K; Kale, Shubhangi P; He, Zhili; Mielenz, Jonathan R

    2007-04-01

    Clostridium thermocellum is an anaerobic, thermophilic bacterium that can directly convert cellulosic substrates into ethanol. Microarray technology is a powerful tool to gain insights into cellular processes by examining gene expression under various physiological states. Oligonucleotide microarray probes were designed for 96.7% of the 3163 C. thermocellum ATCC 27405 candidate protein-encoding genes and then a partial-genome microarray containing 70 C. thermocellum specific probes was constructed and evaluated. We detected a signal-to-noise ratio of three with as little as 1.0 ng of genomic DNA and only low signals from negative control probes (nonclostridial DNA), indicating the probes were sensitive and specific. In order to further test the specificity of the array we amplified and hybridized 10 C. thermocellum polymerase chain reaction products that represented different genes and found gene specific hybridization in each case. We also constructed a whole-genome microarray and prepared total cellular RNA from the same point in early-logarithmic growth phase from two technical replicates during cellobiose fermentation. The reliability of the microarray data was assessed by cohybridization of labeled complementary DNA from the cellobiose fermentation samples and the pattern of hybridization revealed a linear correlation. These results taken together suggest that our oligonucleotide probe set can be used for sensitive and specific C. thermocellum transcriptomic studies in the future.

  3. Construction of Whole Genome Microarrays, and Expression Analysis of Desulfovibrio vulgaris cells in Metal-Reducing Conditions (Uranium and Chromium)

    SciTech Connect

    Fields, Matthew W.

    2005-06-01

    One of the major goals of the project is to construct whole-genome microarrays for Desulfovibrio vulgaris. Previous whole-genome microarrays constructed at ORNL have been PCR-amplimer based, and we wanted to re-evaluate the type of microarrays being built because oligonucleotide probes have several advantages. Microarrays have been generally constructed with two types of probes, PCR-generated probes that typically range in size between 200 and 2000 bp, and oligonucleotide probes with typical size of 20-70 nt. Producing PCR product-based DNA arrays can be a time-consuming procedure that includes PCR primer design, amplification, size verification, product purification, and product quantification. Also, some ORFs are difficult to amplify and thus the construction of comprehensive arrays can be a challenge. Recently, to alleviate some of the problems associated with PCR product-based microarrays, oligonucleotide microarrays that contain probes longer than 40 nt have been evaluated and used for whole genome expression studies. These microarrays should have higher specificity and are easy to construct, and can thus provide an important alternative approach to monitor gene expression. However, due to the smaller probe size, it is expected that the detection sensitivity of oligonucleotide arrays will be lower than PCR product-based probes.

  4. Effects of a Strong Static Magnetic Field on Bacterium Shewanella oneidensis: An Assessment by Using Whole Genome Microarray.

    SciTech Connect

    Gao, W.; Liu, Y.; Zhou, J.-Z.; Hongjun, P.

    2007-04-02

    The effect of a strong static 14.1 T magnetic field on log-phase cells of bacterial strain Shewanella oneidensis MR-1 was evaluated by using a whole genome microarray of this bacterium. Although differences were not observed between the treatment and control by measuring the optical density (OD), colony forming unit (CFU), as well as post-exposure growth of cells, transcriptional expression levels of 65 genes were altered according to our microarray data. Among these genes, 21 were upregulated while the other 44 were downregulated, compared with control.

  5. Construction and Evaluation of Desulfovibrio vulgaris Whole-Genome Oligonucleotide Microarrays

    SciTech Connect

    Z. He; Q. He; L. Wu; M.E. Clark; J.D. Wall; Jizhong Zhou; Matthew W. Fields

    2004-03-17

    Desulfovibrio vulgaris Hildenborough has been the focus of biochemical and physiological studies in the laboratory, and the metabolic versatility of this organism has been largely recognized, particularly the reduction of sulfate, fumarate, iron, uranium and chromium. In addition, a Desulfovibrio sp. has been shown to utilize uranium as the sole electron acceptor. D. vulgaris is a δ-Proteobacterium with a genome size of 3.6 Mb and 3584 ORFs. The whole-genome microarrays of D. vulgaris have been constructed using 70mer oligonucleotides. All ORFs in the genome were represented with 3471 (97.1%) unique probes and 103 (2.9%) non-specific probes that may have cross-hybridization with other ORFs. In preparation for use of the experimental microarrays, artificial probes and targets were designed to assess specificity and sensitivity and identify optimal hybridization conditions for oligonucleotide microarrays. The results indicated that for 50mer and 70mer oligonucleotide arrays, hybridization at 45 °C to 50 °C, washing at 37 °C and a wash time of 2.5 to 5 minutes obtained specific and strong hybridization signals. In order to evaluate the performance of the experimental microarrays, growth conditions were selected that were expected to give significant hybridization differences for different sets of genes. The initial evaluations were performed using D. vulgaris cells grown at logarithmic and stationary phases. Transcriptional analysis of D. vulgaris cells sampled during logarithmic phase growth indicated that 25% of annotated ORFs were up-regulated and 3% of annotated ORFs were downregulated compared to stationary phase cells. The up-regulated genes included ORFs predicted to be involved with acyl chain biosynthesis, amino acid ABC transporter, translational initiation factors, and ribosomal proteins. In the stationary phase growth cells, the two most up-regulated ORFs (70-fold) were annotated as a carboxynorspermidine decarboxylase and a 2C-methyl-D-erythritol-2

  6. Whole genome protein microarrays for serum profiling of immunodominant antigens of Bacillus anthracis

    PubMed Central

    Kempsell, Karen E.; Kidd, Stephen P.; Lewandowski, Kuiama; Elmore, Michael J.; Charlton, Sue; Yeates, Annemarie; Cuthbertson, Hannah; Hallis, Bassam; Altmann, Daniel M.; Rogers, Mitch; Wattiau, Pierre; Ingram, Rebecca J.; Brooks, Tim; Vipond, Richard

    2015-01-01

    A commercial Bacillus anthracis (Anthrax) whole genome protein microarray has been used to identify immunogenic Anthrax proteins (IAP) using sera from groups of donors with (a) confirmed B. anthracis naturally acquired cutaneous infection, (b) confirmed B. anthracis intravenous drug use-acquired infection, (c) occupational exposure in a wool-sorters factory, (d) humans and rabbits vaccinated with the UK Anthrax protein vaccine and compared to naïve unexposed controls. Anti-IAP responses were observed for both IgG and IgA in the challenged groups; however the anti-IAP IgG response was more evident in the vaccinated group and the anti-IAP IgA response more evident in the B. anthracis-infected groups. Infected individuals appeared somewhat suppressed for their general IgG response, compared with other challenged groups. Immunogenic protein antigens were identified in all groups, some of which were shared between groups whilst others were specific for individual groups. The toxin proteins were immunodominant in all vaccinated, infected or other challenged groups. However, a number of other chromosomally-located and plasmid encoded open reading frame proteins were also recognized by infected or exposed groups in comparison to controls. Some of these antigens e.g., BA4182 are not recognized by vaccinated individuals, suggesting that there are proteins more specifically expressed by live Anthrax spores in vivo that are not currently found in the UK licensed Anthrax Vaccine (AVP). These may perhaps be preferentially expressed during infection and represent expression of alternative pathways in the B. anthracis “infectome.” These may make highly attractive candidates for diagnostic and vaccine biomarker development as they may be more specifically associated with the infectious phase of the pathogen. A number of B. anthracis small hypothetical protein targets have been synthesized, tested in mouse immunogenicity studies and validated in parallel using human sera from

  7. Whole genome microarray analysis in non-small cell lung cancer

    PubMed Central

    AL Zeyadi, Mohammad; Dimova, Ivanka; Ranchich, Vladislav; Rukova, Blaga; Nesheva, Desislava; Hamude, Zora; Georgiev, Sevdalin; Petrov, Danail; Toncheva, Draga

    2015-01-01

    Lung cancer is a serious health problem, since it is one of the leading causes of death worldwide. Molecular–cytogenetic studies could provide reliable data about genetic alterations which could be related to disease pathogenesis and be used for better prognosis and treatment strategies. We performed whole genome oligonucleotide microarray-based comparative genomic hybridization in 10 samples of non-small cell lung cancer. Trisomies were discovered for chromosomes 1, 13, 18 and 20. Chromosome arms 5p, 7p, 11q, 20q and Xq were affected by genetic gains, and 1p, 5q, 10q and 15q, by genetic losses. Microstructural (<5 Mbp) genomic aberrations were revealed: gains in regions 7p (containing the epidermal growth factor receptor gene) and 12p (containing KRAS) and losses in 3p26 and 4q34. Based on high amplitude of alterations and small overlapping regions, new potential oncogenes may be suggested: NBPF4 (1p13.3); ETV1, AGR3 and TSPAN13 (7p21.3-7p21.1); SOX5 and FGFR1OP2 (12p12.1-12p11.22); GPC6 (13q32.1). Significant genetic losses were assumed to contain potential tumour-suppressor genes: DPYD (1p21.3); CLDN22, CLDN24, ING2, CASP3, SORBS2 (4q34.2-q35.1); DEFB (8p23.1). Our results complement the picture of genomic characterization of non-small cell lung cancer. PMID:26019623

  8. Whole genome protein microarrays for serum profiling of immunodominant antigens of Bacillus anthracis.

    PubMed

    Kempsell, Karen E; Kidd, Stephen P; Lewandowski, Kuiama; Elmore, Michael J; Charlton, Sue; Yeates, Annemarie; Cuthbertson, Hannah; Hallis, Bassam; Altmann, Daniel M; Rogers, Mitch; Wattiau, Pierre; Ingram, Rebecca J; Brooks, Tim; Vipond, Richard

    2015-01-01

    A commercial Bacillus anthracis (Anthrax) whole genome protein microarray has been used to identify immunogenic Anthrax proteins (IAP) using sera from groups of donors with (a) confirmed B. anthracis naturally acquired cutaneous infection, (b) confirmed B. anthracis intravenous drug use-acquired infection, (c) occupational exposure in a wool-sorters factory, (d) humans and rabbits vaccinated with the UK Anthrax protein vaccine and compared to naïve unexposed controls. Anti-IAP responses were observed for both IgG and IgA in the challenged groups; however the anti-IAP IgG response was more evident in the vaccinated group and the anti-IAP IgA response more evident in the B. anthracis-infected groups. Infected individuals appeared somewhat suppressed for their general IgG response, compared with other challenged groups. Immunogenic protein antigens were identified in all groups, some of which were shared between groups whilst others were specific for individual groups. The toxin proteins were immunodominant in all vaccinated, infected or other challenged groups. However, a number of other chromosomally-located and plasmid encoded open reading frame proteins were also recognized by infected or exposed groups in comparison to controls. Some of these antigens e.g., BA4182 are not recognized by vaccinated individuals, suggesting that there are proteins more specifically expressed by live Anthrax spores in vivo that are not currently found in the UK licensed Anthrax Vaccine (AVP). These may perhaps be preferentially expressed during infection and represent expression of alternative pathways in the B. anthracis "infectome." These may make highly attractive candidates for diagnostic and vaccine biomarker development as they may be more specifically associated with the infectious phase of the pathogen. A number of B. anthracis small hypothetical protein targets have been synthesized, tested in mouse immunogenicity studies and validated in parallel using human sera from the

  9. Detecting Staphylococcus aureus Virulence and Resistance Genes: a Comparison of Whole-Genome Sequencing and DNA Microarray Technology.

    PubMed

    Strauß, Lena; Ruffing, Ulla; Abdulla, Salim; Alabi, Abraham; Akulenko, Ruslan; Garrine, Marcelino; Germann, Anja; Grobusch, Martin Peter; Helms, Volkhard; Herrmann, Mathias; Kazimoto, Theckla; Kern, Winfried; Mandomando, Inácio; Peters, Georg; Schaumburg, Frieder; von Müller, Lutz; Mellmann, Alexander

    2016-04-01

    Staphylococcus aureus is a major bacterial pathogen causing a variety of diseases ranging from wound infections to severe bacteremia or intoxications. Besides host factors, the course and severity of disease is also widely dependent on the genotype of the bacterium. Whole-genome sequencing (WGS), followed by bioinformatic sequence analysis, is currently the most extensive genotyping method available. To identify clinically relevant staphylococcal virulence and resistance genes in WGS data, we developed an in silico typing scheme for the software SeqSphere+ (Ridom GmbH, Münster, Germany). The implemented target genes (n = 182) correspond to those queried by the Identibac S. aureus Genotyping DNA microarray (Alere Technologies, Jena, Germany). The in silico scheme was evaluated by comparing the typing results of microarray and of WGS for 154 human S. aureus isolates. A total of 96.8% (n = 27,119) of all typing results were equally identified with microarray and WGS (40.6% present and 56.2% absent). Discrepancies (3.2% in total) were caused by WGS errors (1.7%), microarray hybridization failures (1.3%), wrong prediction of ambiguous microarray results (0.1%), or unknown causes (0.1%). Superior to the microarray, WGS enabled the distinction of allelic variants, which may be essential for the prediction of bacterial virulence and resistance phenotypes. Multilocus sequence typing clonal complexes and staphylococcal cassette chromosome mec element types inferred from microarray hybridization patterns were equally determined by WGS. In conclusion, WGS may substitute array-based methods due to its universal methodology, open and expandable nature, and rapid parallel analysis capacity for different characteristics in once-generated sequences.

  10. Detecting Staphylococcus aureus Virulence and Resistance Genes: a Comparison of Whole-Genome Sequencing and DNA Microarray Technology

    PubMed Central

    Strauß, Lena; Ruffing, Ulla; Abdulla, Salim; Alabi, Abraham; Akulenko, Ruslan; Garrine, Marcelino; Germann, Anja; Grobusch, Martin Peter; Helms, Volkhard; Herrmann, Mathias; Kazimoto, Theckla; Kern, Winfried; Mandomando, Inácio; Peters, Georg; Schaumburg, Frieder; von Müller, Lutz

    2016-01-01

    Staphylococcus aureus is a major bacterial pathogen causing a variety of diseases ranging from wound infections to severe bacteremia or intoxications. Besides host factors, the course and severity of disease is also widely dependent on the genotype of the bacterium. Whole-genome sequencing (WGS), followed by bioinformatic sequence analysis, is currently the most extensive genotyping method available. To identify clinically relevant staphylococcal virulence and resistance genes in WGS data, we developed an in silico typing scheme for the software SeqSphere+ (Ridom GmbH, Münster, Germany). The implemented target genes (n = 182) correspond to those queried by the Identibac S. aureus Genotyping DNA microarray (Alere Technologies, Jena, Germany). The in silico scheme was evaluated by comparing the typing results of microarray and of WGS for 154 human S. aureus isolates. A total of 96.8% (n = 27,119) of all typing results were equally identified with microarray and WGS (40.6% present and 56.2% absent). Discrepancies (3.2% in total) were caused by WGS errors (1.7%), microarray hybridization failures (1.3%), wrong prediction of ambiguous microarray results (0.1%), or unknown causes (0.1%). Superior to the microarray, WGS enabled the distinction of allelic variants, which may be essential for the prediction of bacterial virulence and resistance phenotypes. Multilocus sequence typing clonal complexes and staphylococcal cassette chromosome mec element types inferred from microarray hybridization patterns were equally determined by WGS. In conclusion, WGS may substitute array-based methods due to its universal methodology, open and expandable nature, and rapid parallel analysis capacity for different characteristics in once-generated sequences. PMID:26818676
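
    The percentages quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that each of the 154 isolates yields one typing result for each of the 182 target genes; the abstract does not state this explicitly, and the variable names are illustrative only.

```python
# Cross-check of the concordance figures quoted in the abstract.
# Assumption (not stated in the abstract): one typing result per
# isolate-target pair, so the total is isolates * targets.
isolates = 154
targets = 182
total_results = isolates * targets

concordant = 27_119  # results identified equally by microarray and WGS
concordance_pct = round(100 * concordant / total_results, 1)

# Concordant results split into "present" and "absent" calls.
present_pct, absent_pct = 40.6, 56.2
split_pct = round(present_pct + absent_pct, 1)

# Discrepancy causes as listed in the abstract (percent of all results).
discrepancy_causes = {
    "WGS errors": 1.7,
    "microarray hybridization failures": 1.3,
    "wrong prediction of ambiguous microarray results": 0.1,
    "unknown causes": 0.1,
}
discrepancy_pct = round(sum(discrepancy_causes.values()), 1)

print(total_results, concordance_pct, split_pct, discrepancy_pct)
```

    Under that assumption the total is 28,028 typing results, and the reported concordance, the present/absent split, and the itemized discrepancies are mutually consistent.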

  11. Insights into the fluoride-resistant regulation mechanism of Acidithiobacillus ferrooxidans ATCC 23270 based on whole genome microarrays.

    PubMed

    Ma, Liyuan; Li, Qian; Shen, Li; Feng, Xue; Xiao, Yunhua; Tao, Jiemeng; Liang, Yili; Yin, Huaqun; Liu, Xueduan

    2016-10-01

    Acidophilic microorganisms involved in uranium bioleaching are usually suppressed by dissolved fluoride ions, eventually leading to reduced leaching efficiency. However, little is known about the regulation mechanisms of microbial resistance to fluoride. In this study, the resistance of Acidithiobacillus ferrooxidans ATCC 23270 to fluoride was investigated by detecting bacterial growth fluctuations and ferrous or sulfur oxidation. To explore the regulation mechanism, a whole genome microarray was used to profile the genome-wide expression. The fluoride tolerance of A. ferrooxidans cultured in the presence of FeSO4 was better than that cultured with the S(0) substrate. The differentially expressed gene categories closely related to fluoride tolerance included those involved in energy metabolism, cellular processes, protein synthesis, transport, the cell envelope, and binding proteins. This study highlights that the cellular ferrous oxidation ability was enhanced at the lower fluoride concentrations. An overview of the cellular regulation mechanisms of extremophiles to fluoride resistance is discussed. PMID:27519020

  12. Whole Genome Comparison of Campylobacter jejuni Human Isolates Using a Low-Cost Microarray Reveals Extensive Genetic Diversity

    PubMed Central

    Dorrell, Nick; Mangan, Joseph A.; Laing, Kenneth G.; Hinds, Jason; Linton, Dennis; Al-Ghusein, Hasan; Barrell, Bart G.; Parkhill, Julian; Stoker, Neil G.; Karlyshev, Andrey V.; Butcher, Philip D.; Wren, Brendan W.

    2001-01-01

    Campylobacter jejuni is the leading cause of bacterial food-borne diarrhoeal disease throughout the world, and yet is still a poorly understood pathogen. Whole genome microarray comparisons of 11 C. jejuni strains of diverse origin identified genes in up to 30 NCTC 11168 loci ranging from 0.7 to 18.7 kb that are either absent or highly divergent in these isolates. Many of these regions are associated with the biosynthesis of surface structures including flagella, lipo-oligosaccharide, and the newly identified capsule. Other strain-variable genes of known function include those responsible for iron acquisition, DNA restriction/modification, and sialylation. In fact, at least 21% of genes in the sequenced strain appear dispensable as they are absent or highly divergent in one or more of the isolates tested, thus defining 1300 C. jejuni core genes. Such core genes contribute mainly to metabolic, biosynthetic, cellular, and regulatory processes, but many virulence determinants are also conserved. Comparison of the capsule biosynthesis locus revealed conservation of all the genes in this region in strains with the same Penner serotype as strain NCTC 11168. By contrast, between 5 and 17 NCTC 11168 genes in this region are either absent or highly divergent in strains of a different serotype from the sequenced strain, providing further evidence that the capsule accounts for Penner serotype specificity. These studies reveal extensive genetic diversity among C. jejuni strains and pave the way toward identifying correlates of pathogenicity and developing improved epidemiological tools for this problematic pathogen. PMID:11591647

  13. Multi-Platform, Multi-Site, Microarray-Based Human Tumor Classification

    PubMed Central

    Bloom, Greg; Yang, Ivana V.; Boulware, David; Kwong, Ka Yin; Coppola, Domenico; Eschrich, Steven; Quackenbush, John; Yeatman, Timothy J.

    2004-01-01

    The introduction of gene expression profiling has resulted in the production of rich human data sets with potential for deciphering tumor diagnosis, prognosis, and therapy. Here we demonstrate how artificial neural networks (ANNs) can be applied to two completely different microarray platforms (cDNA and oligonucleotide), or a combination of both, to build tumor classifiers capable of deciphering the identity of most human cancers. First, 78 tumors representing eight different types of histologically similar adenocarcinoma, were evaluated with a 32k cDNA microarray and correctly classified by a cDNA-based ANN, using independent training and test sets, with a mean accuracy of 83%. To expand our approach, oligonucleotide data derived from six independent performance sites, representing 463 tumors and 21 tumor types, were assembled, normalized, and scaled. An oligonucleotide-based ANN, trained on a random fraction of the tumors (n = 343), was 88% accurate in predicting known pathological origin of the remaining fraction of tumors (n = 120) not exposed to the training algorithm. Finally, a mixed-platform classifier using a combination of both cDNA and oligonucleotide microarray data from seven performance sites, normalized and scaled from a large and diverse tumor set (n = 539), produced similar results (85% accuracy) on independent test sets. Further validation of our classifiers was achieved by accurately (84%) predicting the known primary site of origin for an independent set of metastatic lesions (n = 50), resected from brain, lung, and liver, potentially addressing the vexing classification problems imposed by unknown primary cancers. These cDNA- and oligonucleotide-based classifiers provide a first proof of principle that data derived from multiple platforms and performance sites can be exploited to build multi-tissue tumor classifiers. PMID:14695313
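
    The independent training and test sets described for the oligonucleotide data (343 training tumors, 120 held-out tumors, 463 in total) amount to a random holdout split. The sketch below is only an illustration of that idea: the abstract does not describe how the random fraction was drawn, and the function name, the seed, and the tumor identifiers are hypothetical.

```python
import random


def holdout_split(sample_ids, n_train, seed=0):
    """Shuffle the sample IDs and split them into training and test lists.

    Hypothetical helper: the abstract does not specify the sampling
    procedure, so a plain seeded shuffle is assumed here.
    """
    if not 0 <= n_train <= len(sample_ids):
        raise ValueError("n_train must be between 0 and the number of samples")
    rng = random.Random(seed)   # seeded so the split is reproducible
    shuffled = list(sample_ids)  # copy; the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 463 oligonucleotide-profiled tumors.
tumor_ids = [f"tumor_{i:03d}" for i in range(463)]
train_ids, test_ids = holdout_split(tumor_ids, n_train=343)

print(len(train_ids), len(test_ids))
```

    Keeping the test tumors entirely out of training is what lets the reported accuracy be read as performance on samples the classifier has not seen.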

  14. Final Report Construction of Whole Genome Microarrays, and Expression Analysis of Desulfovibrio vulgaris cells in Metal-Reducing Conditions

    SciTech Connect

    M.W. Fields; J.D. Wall; J. Keasling; J. Zhou

    2008-05-15

    We continue to utilize the oligonucleotide microarrays that were constructed through funding with this project to characterize growth responses of Desulfovibrio vulgaris relevant to metal-reducing conditions. To effectively immobilize heavy metals and radionuclides via sulfate-reduction, it is important to understand the cellular responses to adverse factors observed at contaminated subsurface environments (e.g., nutrients, pH, contaminants, growth requirements and products). One of the major goals of the project is to construct whole-genome microarrays for Desulfovibrio vulgaris. First, in order to experimentally establish the criteria for designing gene-specific oligonucleotide probes, an oligonucleotide array was constructed that contained perfect match (PM) and mismatch (MM) probes (50mers and 70mers) based upon 4 genes. The effects of probe-target identity, continuous stretch, mismatch position, and hybridization free energy on specificity were examined. Little hybridization was observed at a probe-target identity of <85% for both 50mer and 70mer probes. 33 to 48% of the PM signal intensities were detected at a probe-target identity of 94% for 50mer oligonucleotides, and 43 to 55% for 70mer probes at a probe-target identity of 96%. When the effects of sequence identity and continuous stretch were considered independently, a stretch probe (>15 bases) contributed an additional 9% of the PM signal intensity compared to a non-stretch probe (< 15 bases) at the same identity level. Cross-hybridization increased as the length of continuous stretch increased. A 35-base stretch for 50mer probes or a 50-base stretch for 70mer probes had approximately 55% of the PM signal. Mismatches should be as close to the middle position of an oligonucleotide probe as possible to minimize cross-hybridization. Little cross-hybridization was observed for probes with a minimal binding free energy greater than -30 kcal/mol for 50mer probes or -40 kcal/mol for 70mer probes. 
Based on the

  15. Single-cell whole-genome amplification technique impacts the accuracy of SNP microarray-based genotyping and copy number analyses

    PubMed Central

    Treff, Nathan R.; Su, Jing; Tao, Xin; Northrop, Lesley E.; Scott, Richard T.

    2011-01-01

    Methods of comprehensive microarray-based aneuploidy screening in single cells are rapidly emerging. Whole-genome amplification (WGA) remains a critical component for these methods to be successful. A number of commercially available WGA kits have been independently utilized in previous single-cell microarray studies. However, direct comparison of their performance on single cells has not been conducted. The present study demonstrates that among previously published methods, a single-cell GenomePlex WGA protocol provides the best combination of speed and accuracy for single nucleotide polymorphism microarray-based copy number (CN) analysis when compared with a REPLI-g- or GenomiPhi-based protocol. Alternatively, for applications that do not have constraints on turnaround time and that are directed at accurate genotyping rather than CN assignments, a REPLI-g-based protocol may provide the best solution. PMID:21177337

  16. Development and Assessment of Whole-Genome Oligonucleotide Microarrays to Analyze an Anaerobic Microbial Community and its Responses to Oxidative Stress

    SciTech Connect

    Scholten, Johannes C.; Culley, David E.; Nie, Lei; Munn, Kyle J.; Chow, Lely; Brockman, Fred J.; Zhang, Weiwen

    2007-06-29

    The application of DNA microarray technology to investigate multiple-species microbial community presents great challenges. In this study, we reported the design and quality assessment of four whole genome oligonucleotide microarrays for two syntroph bacteria, Desulfovibrio vulgaris and Syntrophobacter fumaroxidans, and two archaeal methanogens, Methanosarcina barkeri and Methanospirillum hungatei, and their application to analyze global gene expression of this four-species microbial community in response to oxidative stress. In order to minimize the possible cross-hybridization, cross-genome comparison was performed to assure all probes unique to each genome so that the microarrays could provide species-level resolution. Microarray quality was validated by the good reproducibility of experimental measurements of multiple biological and analytical replicates. Microarray analysis showed that S. fumaroxidans and M. hungatei responded to the stress with up-regulation of several genes known to be involved in ROS detoxification, such as catalase and rubrerythrin in S. fumaroxidans and thioredoxin and heat shock protein Hsp20 in M. hungatei. Consistent with previous study in pure culture, the microarray analysis showed that genes involved in methane production and energy metabolism were down-regulated by oxidative stress in M. barkeri. However, D. vulgaris seemed less sensitive to the oxidative stress when grown in a community, with almost no gene up-regulated. The study demonstrated the successful application of microarray technology to multiple-species microbial community, and our preliminary results indicated that the approach can provide novel insights on the metabolic and regulatory networks within microbial communities.

  17. Whole Genome Sequencing

    MedlinePlus

    What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  18. Shared clonal cytogenetic abnormalities in aberrant mast cells and leukemic myeloid blasts detected by single nucleotide polymorphism microarray-based whole-genome scanning.

    PubMed

    Frederiksen, John K; Shao, Lina; Bixby, Dale L; Ross, Charles W

    2016-04-01

    Systemic mastocytosis (SM) is characterized by a clonal proliferation of aberrant mast cells within extracutaneous sites. In a subset of SM cases, a second associated hematologic non-mast cell disease (AHNMD) is also present, usually of myeloid origin. Polymerase chain reaction and targeted fluorescence in situ hybridization studies have provided evidence that, in at least some cases, the aberrant mast cells are related clonally to the neoplastic cells of the AHNMD. In this work, a single nucleotide polymorphism microarray (SNP-A) was used to characterize the cytogenetics of the aberrant mast cells from a patient with acute myeloid leukemia and concomitant mast cell leukemia associated with a KIT D816A mutation. The results demonstrate the presence of shared cytogenetic abnormalities between the mast cells and myeloid blasts, as well as additional abnormalities within mast cells (copy-neutral loss of heterozygosity) not detectable by routine karyotypic analysis. To our knowledge, this work represents the first application of SNP-A whole-genome scanning to the detection of shared cytogenetic abnormalities between the two components of a case of SM-AHNMD. The findings provide additional evidence of a frequent clonal link between aberrant mast cells and cells of myeloid AHNMDs, and also highlight the importance of direct sequencing for identifying uncommon activating KIT mutations.

  19. A functional genomics tool for the Pacific bluefin tuna: Development of a 44K oligonucleotide microarray from whole-genome sequencing data for global transcriptome analysis.

    PubMed

    Yasuike, Motoshige; Fujiwara, Atushi; Nakamura, Yoji; Iwasaki, Yuki; Nishiki, Issei; Sugaya, Takuma; Shimizu, Akio; Sano, Motohiko; Kobayashi, Takanori; Ototake, Mitsuru

    2016-02-01

    Bluefin tunas are among the most important fishery resources worldwide. Because of high market values, bluefin tuna farming has been growing rapidly in recent years. At present, the most common form of tuna farming is based on the stocking of wild-caught fish. Therefore, concerns have been raised about the negative impact of tuna farming on wild stocks. Recently, the reproduction cycle of the Pacific bluefin tuna (PBT), Thunnus orientalis, has been completed under aquaculture conditions, but production bottlenecks remain to be solved because very little biological information on bluefin tunas is available. Functional genomics approaches promise to rapidly increase our knowledge of biological processes in the bluefin tuna. Here, we describe the development of the first 44K PBT oligonucleotide microarray (oligo-array), based on whole-genome shotgun (WGS) sequencing and large-scale expressed sequence tag (EST) data. In addition, we introduce an initial 44K PBT oligo-array experiment using in vitro grown peripheral blood leukocytes (PBLs) stimulated with immunostimulants such as lipopolysaccharide (LPS: a cell wall component of Gram-negative bacteria) or polyinosinic:polycytidylic acid (poly I:C: a synthetic mimic of viral infection). This pilot 44K PBT oligo-array analysis successfully addressed distinct immune processes between LPS- and poly I:C-stimulated PBLs. Thus, we expect that this oligo-array will provide an excellent opportunity to analyze global gene expression profiles for a better understanding of diseases and stress, as well as of reproduction, development and the influence of nutrition on tuna aquaculture production.

  20. Whole genome sequence typing and microarray profiling of nasal and blood stream methicillin-resistant Staphylococcus aureus isolates: Clues to phylogeny and invasiveness.

    PubMed

    Hamed, Mohamed; Nitsche-Schmitz, Daniel Patric; Ruffing, Ulla; Steglich, Matthias; Dordel, Janina; Nguyen, Duy; Brink, Jan-Hendrik; Chhatwal, Gursharan Singh; Herrmann, Mathias; Nübel, Ulrich; Helms, Volkhard; von Müller, Lutz

    2015-12-01

    Hospital-associated methicillin-resistant Staphylococcus aureus (MRSA) infections are frequently caused by predominant clusters of closely related isolates that cannot be discriminated by conventional diagnostic typing methods. Whole genome sequencing (WGS) and DNA microarray (MA) now allow for better discrimination within a prevalent clonal complex (CC). This single center exploratory study aims to distinguish invasive (blood stream infection) and non-invasive (nasal colonization) MRSA isolates of the same CC5 into phylogenetic- and virulence-associated genotypic subgroups by WGS and MA. A cohort of twelve blood stream and fifteen nasal MRSA isolates of CC5 (spa-types t003 and t504) was selected. Isolates were propagated at the same period of time from unrelated patients treated at the University of Saarland Medical Center, Germany. Rooted phylotyping based on WGS with core-genome single nucleotide polymorphism (SNP) analysis revealed two local clusters of closely related CC5 subgroups (t504 and Clade1 t003) which were separated from other local t003 isolates and from unrelated CC5 MRSA reference isolates of German origin. Phylogenetic subtyping was not associated with invasiveness when comparing blood stream and nasal isolates. Clustering based on MA profiles was not concordant with WGS phylotyping, but MA profiles may identify subgroups of isolates with nasal and blood stream origin. Among the new putative virulence associated genes identified by WGS, the strongest association with blood stream infections was shown for ebhB mutants. Analysis of the core-genome together with the accessory genome enables subtyping of closely related MRSA isolates according to phylogeny and presumably also to the potential virulence capacity of isolates.

  1. A Whole-Genome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 Days on Board of Shenzhou 8

    PubMed Central

    Neef, Maren; Ecke, Margret; Hampp, Rüdiger

    2015-01-01

    The Simbox mission was the first joint space project between Germany and China in November 2011. Eleven-day-old Arabidopsis thaliana wild type semisolid callus cultures were integrated into fully automated plant cultivation containers and exposed to spaceflight conditions within the Simbox hardware on board of the spacecraft Shenzhou 8. The related ground experiment was conducted under similar conditions. The use of an in-flight centrifuge provided a 1 g gravitational field in space. The cells were metabolically quenched after 5 days via RNAlater injection. The impact on the Arabidopsis transcriptome was investigated by means of whole-genome gene expression analysis. The results show a major impact of nonmicrogravity related spaceflight conditions. Genes that were significantly altered in transcript abundance are mainly involved in protein phosphorylation and MAPK cascade-related signaling processes, as well as in the cellular defense and stress responses. In contrast to short-term effects of microgravity (seconds, minutes), this mission identified only minor changes after 5 days of microgravity. These concerned genes coding for proteins involved in the plastid-associated translation machinery, mitochondrial electron transport, and energy production. PMID:25654111

  2. A whole-genome microarray study of Arabidopsis thaliana semisolid callus cultures exposed to microgravity and nonmicrogravity related spaceflight conditions for 5 days on board of Shenzhou 8.

    PubMed

    Fengler, Svenja; Spirer, Ina; Neef, Maren; Ecke, Margret; Nieselt, Kay; Hampp, Rüdiger

    2015-01-01

    The Simbox mission was the first joint space project between Germany and China in November 2011. Eleven-day-old Arabidopsis thaliana wild type semisolid callus cultures were integrated into fully automated plant cultivation containers and exposed to spaceflight conditions within the Simbox hardware on board of the spacecraft Shenzhou 8. The related ground experiment was conducted under similar conditions. The use of an in-flight centrifuge provided a 1 g gravitational field in space. The cells were metabolically quenched after 5 days via RNAlater injection. The impact on the Arabidopsis transcriptome was investigated by means of whole-genome gene expression analysis. The results show a major impact of nonmicrogravity related spaceflight conditions. Genes that were significantly altered in transcript abundance are mainly involved in protein phosphorylation and MAPK cascade-related signaling processes, as well as in the cellular defense and stress responses. In contrast to short-term effects of microgravity (seconds, minutes), this mission identified only minor changes after 5 days of microgravity. These concerned genes coding for proteins involved in the plastid-associated translation machinery, mitochondrial electron transport, and energy production.

  3. Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

    SciTech Connect

    Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy; Zhou, Jizhong

    2010-05-17

    Phylogenetic analyses were done for Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and other sources (16 strains), with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514. The analyses used DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA and gyrB genes, and sequence similarities of 6 loci of the Shewanella genome selected from a shared gene list of the Shewanella strains with sequenced whole genomes, based on their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences and on DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share the same sub-clusters with very few exceptions, with the strains grouped essentially by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain levels within the genus Shewanella. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.

  4. High resolution copy number variation data in the NCI-60 cancer cell lines from whole genome microarrays accessible through CellMiner.

    PubMed

    Varma, Sudhir; Pommier, Yves; Sunshine, Margot; Weinstein, John N; Reinhold, William C

    2014-01-01

    Array-based comparative genomic hybridization (aCGH) is a powerful technique for detecting gene copy number variation. It is generally considered to be robust and convenient since it measures DNA rather than RNA. In the current study, we combine copy number estimates from four different platforms (Agilent 44 K, NimbleGen 385 K, Affymetrix 500 K and Illumina Human1Mv1_C) to compute a reliable, high-resolution, easy to understand output for the measure of copy number changes in the 60 cancer cells of the NCI-DTP (the NCI-60). We then relate the results to gene expression. We explain how to access that database using our CellMiner web-tool and provide an example of the ease of comparison with transcript expression, whole exome sequencing, microRNA expression and response to 20,000 drugs and other chemical compounds. We then demonstrate how the data can be analyzed integratively with transcript expression data for the whole genome (26,065 genes). Comparison of copy number and expression levels shows an overall medium high correlation (median r = 0.247), with significantly higher correlations (median r = 0.408) for the known tumor suppressor genes. That observation is consistent with the hypothesis that gene loss is an important mechanism for tumor suppressor inactivation. An integrated analysis of concurrent DNA copy number and gene expression change is presented. Limiting attention to focal DNA gains or losses, we identify and reveal novel candidate tumor suppressors with matching alterations in transcript level.
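    The comparison of copy number with expression described in this abstract can be illustrated with a minimal sketch. It assumes per-gene Pearson correlations computed across cell lines and then summarized by their median; the gene names, values, and function names below are hypothetical toy data, not CellMiner output or the CellMiner API.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_gene_correlation(copy_number, expression):
    """Median of per-gene correlations between copy number and expression.

    Both arguments map a gene name to values measured across the same
    cell lines, in the same order.
    """
    shared = [g for g in copy_number if g in expression]
    return median(pearson(copy_number[g], expression[g]) for g in shared)


# Hypothetical toy data: three genes measured in four cell lines.
toy_copy_number = {
    "GENE_A": [1, 2, 3, 4],
    "GENE_B": [1, 2, 3, 4],
    "GENE_C": [1, 2, 3, 4],
}
toy_expression = {
    "GENE_A": [2, 4, 6, 8],   # rises with copy number
    "GENE_B": [4, 3, 2, 1],   # falls with copy number
    "GENE_C": [1, 3, 2, 4],   # partly follows copy number
}

toy_median_r = median_gene_correlation(toy_copy_number, toy_expression)
```

    Restricting the gene list to a subset, such as known tumor suppressors, before taking the median gives the kind of subgroup comparison the abstract reports.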

  5. A Whole-Genome Microarray Study of Arabidopsis thaliana Cell Cultures Exposed to Real and Simulated Partial-G Forces: A Comparison of Parabolic Flight and Clinostat Data

    NASA Astrophysics Data System (ADS)

    Fengler, S.; Spirer, I.; Neef, M.; Ecke, M.; Hauslage, J.; Hampp, R.

    2015-09-01

    Cell cultures of the plant model organism Arabidopsis thaliana were exposed to partial-g forces during parabolic flight and clinostat experiments (0.38 g, 0.16 g and 0.5 g). To investigate gravity-dependent alterations in gene expression, samples were metabolically quenched and used for microarray analysis. An attempt to identify the potential threshold acceleration showed that the smaller the experienced g-force, the greater was the susceptibility of the cell cultures. Compared to short-term µg during a regular parabolic flight, the number of differentially expressed genes under partial-g was lower. In addition, the effect on the alteration of amounts of transcripts decreased during partial-g parabolic flight due to the sequence of the different parabolas (0.38 g, 0.16 g and µg). A time-dependent analysis under simulated 0.5 g indicates that adaptation occurs within minutes. Differentially expressed genes (at least 2-fold altered in expression) under real flight conditions were to some extent identical with those affected by clinorotation. The highest number of identical genes was detected within seconds of exposure to 0.38 g.

  6. Multi-Platform Avionics Simulator

    NASA Technical Reports Server (NTRS)

    Clark, Micah; Steinke, Robert; McMahon, Elihu

    2006-01-01

    Multi-Platform Avionics Simulator (MPAvSim) is a software library for development of simulations of avionic hardware. MPAvSim facilitates simulation of interactions between flight software and such avionic peripheral equipment as telecommunication devices, thrusters, pyrotechnic devices, motor controllers, and scientific instruments. MPAvSim focuses on the behavior of avionics as seen by flight software, rather than on performing high-fidelity simulations of dynamics. However, MPAvSim is easily integrable with other programs that do perform such simulations. MPAvSim makes it possible to do real-time partial hardware-in-the-loop simulations. An MPAvSim simulation consists of execution chains (see figure) represented by flow graphs of models, defined here as stateless procedures that do some work. During a simulation, MPAvSim walks the execution chain, running each model in turn. Using MPAvSim, flight software can be run against a spacecraft that is all simulation, all hardware, or part hardware and part simulation. With respect to a specific piece of hardware, either the hardware itself or its simulation can be plugged in without affecting the rest of the system. Thus, flight software can be tested before hardware is available, and as items of hardware become available, they can be substituted for their simulations, with minimal disruption.
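    The execution-chain idea in this abstract can be sketched as follows. This is not MPAvSim's actual API, which the abstract does not show. It assumes a simplified linear chain rather than a general flow graph, and every name in it (walk_chain, simulated_thruster, hardware_thruster_stub, telemetry_packer, the signal keys) is hypothetical.

```python
from typing import Callable, Dict, List

Signals = Dict[str, float]
Model = Callable[[Signals], Signals]  # a stateless procedure that does some work


def simulated_thruster(signals: Signals) -> Signals:
    """Simulated device: thrust is proportional to the commanded throttle."""
    out = dict(signals)
    out["thrust_n"] = 10.0 * signals.get("throttle", 0.0)
    return out


def hardware_thruster_stub(signals: Signals) -> Signals:
    """Stand-in for a real device driver; returns a fixed, made-up reading."""
    out = dict(signals)
    out["thrust_n"] = 4.0
    return out


def telemetry_packer(signals: Signals) -> Signals:
    """Downstream model that only sees the thruster's output signal."""
    out = dict(signals)
    out["telemetry"] = out.get("thrust_n", 0.0)
    return out


def walk_chain(chain: List[Model], signals: Signals) -> Signals:
    """Run each model in turn, passing the signals from one to the next."""
    for model in chain:
        signals = model(signals)
    return signals


# The same downstream model runs unchanged whichever thruster is plugged in.
all_simulated = walk_chain([simulated_thruster, telemetry_packer], {"throttle": 0.5})
with_hardware = walk_chain([hardware_thruster_stub, telemetry_packer], {"throttle": 0.5})
```

    Because each model takes and returns plain signals and keeps no state, replacing one entry in the chain does not affect the others, which is the substitution property the abstract describes.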

  7. Microarrays

    ERIC Educational Resources Information Center

    Plomin, Robert; Schalkwyk, Leonard C.

    2007-01-01

    Microarrays are revolutionizing genetics by making it possible to genotype hundreds of thousands of DNA markers and to assess the expression (RNA transcripts) of all of the genes in the genome. Microarrays are slides the size of a postage stamp that contain millions of DNA sequences to which single-stranded DNA or RNA can hybridize. This…

  8. Whole Genome Amplification from Blood Spot Samples.

    PubMed

    Sørensen, Karina Meden

    2015-01-01

    Whole genome amplification is an invaluable technique when working with DNA extracted from blood spots, as the DNA obtained from this source often is too limited for extensive genetic analysis. Two techniques that amplify the entire genome are common. Here, both are described with focus on the benefits and drawbacks of each system. However, in order to obtain the best possible WGA result the quality of input DNA extracted from the blood spot is essential, but also time consumption, flexibility in format and elution volume and price of the technology are factors influencing system choice. Here, three DNA extraction techniques are described and the above aspects are compared between the systems.

  9. Microbial species delineation using whole genome sequences

    SciTech Connect

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatis, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.

  10. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  11. Whole genome amplification in preimplantation genetic diagnosis.

    PubMed

    Zheng, Ying-ming; Wang, Ning; Li, Lei; Jin, Fan

    2011-01-01

    Preimplantation genetic diagnosis (PGD) refers to a procedure for genetically analyzing embryos prior to implantation, improving the chance of conception for patients at high risk of transmitting specific inherited disorders. This method has been widely used for a large number of genetic disorders since the first successful application in the early 1990s. Polymerase chain reaction (PCR) and fluorescent in situ hybridization (FISH) are the two main methods in PGD, but there are some inevitable shortcomings limiting the scope of genetic diagnosis. Fortunately, different whole genome amplification (WGA) techniques have been developed to overcome these problems. Sufficient DNA can be amplified and multiple tasks which need abundant DNA can be performed. Moreover, WGA products can be analyzed as a template for multi-loci and multi-gene during the subsequent DNA analysis. In this review, we will focus on the currently available WGA techniques and their applications, as well as the new technical trends from WGA products.

  12. Principles of Whole-Genome Amplification.

    PubMed

    Czyz, Zbigniew Tadeusz; Kirsch, Stefan; Polzer, Bernhard

    2015-01-01

    Modern molecular biology relies on large amounts of high-quality genomic DNA. However, in a number of clinical or biological applications this requirement cannot be met, as starting material is either limited (e.g., preimplantation genetic diagnosis (PGD) or analysis of minimal residual cancer) or of insufficient quality (e.g., formalin-fixed paraffin-embedded tissue samples or forensics). As a consequence, in order to obtain sufficient amounts of material to analyze these demanding samples by state-of-the-art modern molecular assays, genomic DNA has to be amplified. This chapter summarizes available technologies for whole-genome amplification (WGA), bridging the last 25 years from the first developments to currently applied methods. We will especially elaborate on research application, as well as inherent advantages and limitations of various WGA technologies.

  13. Strategies and tools for whole genome alignments

    SciTech Connect

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas; Ishkhanov, Tigran; Ryaboy, Dmitriy; Rubin, Edward; Pachter, Lior; Dubchak, Inna

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  14. Whole genome sequence analysis of Mycobacterium suricattae.

    PubMed

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; van der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah Musa; Siame, Kabengele Keith; Gey van Pittius, Nicolaas Claudius; van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-12-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi. PMID:26542221

  15. Minimum Contradiction Matrices in Whole Genome Phylogenies

    PubMed Central

    Thuillard, Marc

    2008-01-01

    Minimum contradiction matrices are a useful complement to distance-based phylogenies. A minimum contradiction matrix represents phylogenetic information in the form of an ordered distance matrix $Y^n_{i,j}$. A matrix element corresponds to the distance from a reference vertex $n$ to the path $(i, j)$. For an X-tree or a split network, the minimum contradiction matrix is a Robinson matrix. It therefore fulfills all the inequalities defining perfect order: $Y^n_{i,j} \ge Y^n_{i,k}$, $Y^n_{k,j} \ge Y^n_{k,i}$, $i \le j \le k < n$. In real phylogenetic data, some taxa may contradict the inequalities for perfect order. Contradictions to perfect order correspond to deviations from a tree or from a split network topology. Efficient algorithms that search for the best order are presented and tested on whole genome phylogenies with 184 taxa including many Bacteria, Archaea and Eukaryota. After optimization, taxa are classified in their correct domain and phyla. Several significant deviations from perfect order correspond to well-documented evolutionary events. PMID:19204821
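    The perfect-order inequalities stated in this abstract can be checked directly on a small matrix. The sketch below is only an illustration under stated assumptions: the matrix is given as a symmetric list of lists for a fixed reference vertex, the function name and toy values are hypothetical, and it tests a given ordering rather than searching for the best one as the paper's algorithms do.

```python
def perfect_order_violations(Y):
    """Return the index triples (i, j, k), i <= j <= k, that break perfect order.

    Y is a symmetric square matrix (list of lists) for one reference vertex.
    A triple is reported when Y[i][j] >= Y[i][k] or Y[k][j] >= Y[k][i] fails.
    """
    size = len(Y)
    violations = []
    for i in range(size):
        for j in range(i, size):
            for k in range(j, size):
                if Y[i][j] < Y[i][k] or Y[k][j] < Y[k][i]:
                    violations.append((i, j, k))
    return violations


# Hypothetical toy matrices. In the first, values decrease away from the
# diagonal along every row, so all inequalities hold (a Robinson matrix).
robinson_like = [
    [3, 2, 1],
    [2, 3, 2],
    [1, 2, 3],
]

# In the second, Y[0][1] < Y[0][2], which contradicts perfect order.
contradicting = [
    [3, 1, 2],
    [1, 3, 2],
    [2, 2, 3],
]

no_violations = perfect_order_violations(robinson_like)
one_violation = perfect_order_violations(contradicting)
```

    An empty result means the chosen ordering of taxa is consistent with perfect order. Each reported triple points at taxa that deviate from it.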

  16. Whole-Genome Sequencing in Outbreak Analysis

    PubMed Central

    Turner, Stephen D.; Riley, Margaret F.; Petri, William A.; Hewlett, Erik L.

    2015-01-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  17. Small Sample Whole-Genome Amplification

    SciTech Connect

    Hara, C A; Nguyen, C P; Wheeler, E K; Sorensen, K J; Arroyo, E S; Vrankovich, G P; Christian, A T

    2005-09-20

    Many challenges arise when trying to amplify and analyze human samples collected in the field due to limitations in sample quantity, and contamination of the starting material. Tests such as DNA fingerprinting and mitochondrial typing require a certain sample size and are carried out in large volume reactions; in cases where insufficient sample is present whole genome amplification (WGA) can be used. WGA allows very small quantities of DNA to be amplified in a way that enables subsequent DNA-based tests to be performed. A limiting step to WGA is sample preparation. To minimize the necessary sample size, we have developed two modifications of WGA: the first allows for an increase in amplified product from small, nanoscale, purified samples with the use of carrier DNA while the second is a single-step method for cleaning and amplifying samples all in one column. Conventional DNA cleanup involves binding the DNA to silica, washing away impurities, and then releasing the DNA for subsequent testing. We have eliminated losses associated with incomplete sample release, thereby decreasing the required amount of starting template for DNA testing. Both techniques address the limitations of sample size by providing ample copies of genomic samples. Carrier DNA, included in our WGA reactions, can be used when amplifying samples with the standard purification method, or can be used in conjunction with our single-step DNA purification technique to potentially further decrease the amount of starting sample necessary for future forensic DNA-based assays.

  18. Microbial species delineation using whole genome sequences

    PubMed Central

    Varghese, Neha J.; Mukherjee, Supratim; Ivanova, Natalia; Konstantinidis, Konstantinos T.; Mavrommatis, Kostas; Kyrpides, Nikos C.; Pati, Amrita

    2015-01-01

    Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of AF,gANI and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach, as required. PMID:26150420

  19. Whole-genome sequencing in outbreak analysis.

    PubMed

    Gilchrist, Carol A; Turner, Stephen D; Riley, Margaret F; Petri, William A; Hewlett, Erik L

    2015-07-01

    In addition to the ever-present concern of medical professionals about epidemics of infectious diseases, the relative ease of access and low cost of obtaining, producing, and disseminating pathogenic organisms or biological toxins mean that bioterrorism activity should also be considered when facing a disease outbreak. Utilization of whole-genome sequencing (WGS) in outbreak analysis facilitates the rapid and accurate identification of virulence factors of the pathogen and can be used to identify the path of disease transmission within a population and provide information on the probable source. Molecular tools such as WGS are being refined and advanced at a rapid pace to provide robust and higher-resolution methods for identifying, comparing, and classifying pathogenic organisms. If these methods of pathogen characterization are properly applied, they will enable an improved public health response whether a disease outbreak was initiated by natural events or by accidental or deliberate human activity. The current application of next-generation sequencing (NGS) technology to microbial WGS and microbial forensics is reviewed. PMID:25876885

  20. Small sample whole-genome amplification

    NASA Astrophysics Data System (ADS)

    Hara, Christine; Nguyen, Christine; Wheeler, Elizabeth; Sorensen, Karen; Arroyo, Erin; Vrankovich, Greg; Christian, Allen

    2005-11-01

    Many challenges arise when trying to amplify and analyze human samples collected in the field due to limitations in sample quantity, and contamination of the starting material. Tests such as DNA fingerprinting and mitochondrial typing require a certain sample size and are carried out in large volume reactions; in cases where insufficient sample is present whole genome amplification (WGA) can be used. WGA allows very small quantities of DNA to be amplified in a way that enables subsequent DNA-based tests to be performed. A limiting step to WGA is sample preparation. To minimize the necessary sample size, we have developed two modifications of WGA: the first allows for an increase in amplified product from small, nanoscale, purified samples with the use of carrier DNA while the second is a single-step method for cleaning and amplifying samples all in one column. Conventional DNA cleanup involves binding the DNA to silica, washing away impurities, and then releasing the DNA for subsequent testing. We have eliminated losses associated with incomplete sample release, thereby decreasing the required amount of starting template for DNA testing. Both techniques address the limitations of sample size by providing ample copies of genomic samples. Carrier DNA, included in our WGA reactions, can be used when amplifying samples with the standard purification method, or can be used in conjunction with our single-step DNA purification technique to potentially further decrease the amount of starting sample necessary for future forensic DNA-based assays.

  2. Rapid whole genome sequencing and precision neonatology.

    PubMed

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow, or perceived as impractical, for the initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care. PMID:26521050

  3. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    SciTech Connect

    Fröhlich, Eleonore; Meindl, Claudia; Wagner, Karin; Leitinger, Gerd; Roblegg, Eva

    2014-10-15

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay.

  4. Post-Fragmentation Whole Genome Amplification-Based Method

    NASA Technical Reports Server (NTRS)

    Benardini, James; LaDuc, Myron T.; Langmore, John

    2011-01-01

    This innovation is derived from a proprietary amplification scheme that is based upon random fragmentation of the genome into a series of short, overlapping templates. The resulting shorter DNA strands (<400 bp) constitute a library of DNA fragments with defined 3′ and 5′ termini. Specific primers to these termini are then used to isothermally amplify this library into potentially unlimited quantities that can be used immediately for multiple downstream applications including gel electrophoresis, quantitative polymerase chain reaction (QPCR), comparative genomic hybridization microarray, SNP analysis, and sequencing. The standard reaction can be performed with minimal hands-on time, and can produce amplified DNA in as little as three hours. Post-fragmentation whole genome amplification-based technology provides a robust and accurate method of amplifying femtogram levels of starting material into microgram yields with no detectable allele bias. The amplified DNA also facilitates the preservation of samples (spacecraft samples) by amplifying scarce amounts of template DNA into microgram concentrations in just a few hours. Based on further optimization of this technology, this could be a feasible technology to use in sample preservation for potential future sample return missions. The research and technology development described here can be pivotal in dealing with backward/forward biological contamination from planetary missions. Such efforts rely heavily on an increasing understanding of the burden and diversity of microorganisms present on spacecraft surfaces throughout assembly and testing. The development and implementation of these technologies could significantly improve the comprehensiveness and resolving power of spacecraft-associated microbial population censuses, and are important to the continued evolution and advancement of planetary protection capabilities.
Current molecular procedures for assaying spacecraft-associated microbial burden and diversity have

  5. Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness

    MedlinePlus

    ... Bacteria that cause disease have millions of different genomes, or sequences of genetic code, each as unique ...

  6. Multiple Whole Genome Alignments Without a Reference Organism

    SciTech Connect

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  7. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    PubMed Central

    Fröhlich, Eleonore; Meindl, Claudia; Wagner, Karin; Leitinger, Gerd; Roblegg, Eva

    2014-01-01

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. PMID:25102311

  8. Prospects and pitfalls in whole genome association studies

    PubMed Central

    Lawrence, Robert W; Evans, David M; Cardon, Lon R

    2005-01-01

    Recent large-scale studies of common genetic variation throughout the human genome are making it feasible to conduct whole genome studies of genotype–phenotype associations. Such studies have the potential to uncover novel contributors to common complex traits and thus lead to insights into the aetiology of multifactorial phenotypes. Despite this promise, it is important to recognize that the availability of genetic markers and the ability to assay them at realistic cost does not guarantee success of this approach. There are a number of practical issues that require close attention, some forms of allelic architecture are not readily amenable to the association approach with even the most rigorous design, and doubtless new hurdles will emerge as the studies begin. Here we discuss the promise and current challenges of the whole genome approach, and raise some issues to consider in interpreting the results of the first whole genome studies. PMID:16096108

  9. ISPRS Benchmark for Multi-Platform Photogrammetry

    NASA Astrophysics Data System (ADS)

    Nex, F.; Gerke, M.; Remondino, F.; Przybilla, H.-J.; Bäumker, M.; Zurhorst, A.

    2015-03-01

    Airborne high-resolution oblique imagery systems and RPAS/UAVs are very promising technologies that will keep influencing the development of geomatics in the coming years, closing the gap between terrestrial and classical aerial acquisitions. These two platforms are also a promising solution for National Mapping and Cartographic Agencies (NMCA) as they allow complementary mapping information to be derived. Although interest in the registration and integration of aerial and terrestrial data is constantly increasing, only limited work has been performed on this topic. Several investigations still need to be undertaken concerning the ability of algorithms to perform automatic co-registration, accurate point cloud generation and feature extraction from multi-platform image data. One of the biggest obstacles is the lack of reliable and free datasets to test and compare new algorithms and procedures. The Scientific Initiative "ISPRS benchmark for multi-platform photogrammetry", run in collaboration with EuroSDR, aims at collecting and sharing state-of-the-art multi-sensor data (oblique airborne, UAV-based and terrestrial images) over an urban area. These datasets are used to assess different algorithms and methodologies for image orientation and dense matching. As ground truth, Terrestrial Laser Scanning (TLS), Aerial Laser Scanning (ALS) as well as topographic networks and GNSS points were acquired to compare 3D coordinates on check points (CPs) and evaluate cross sections and residuals on generated point cloud surfaces. In this paper, the acquired data, the pre-processing steps, the evaluation procedures as well as some preliminary results achieved with commercial software will be presented.

  10. Whole genome and transcriptome sequencing of a B3 thymoma.

    PubMed

    Petrini, Iacopo; Rajan, Arun; Pham, Trung; Voeller, Donna; Davis, Sean; Gao, James; Wang, Yisong; Giaccone, Giuseppe

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  11. Supercomputing for the parallelization of whole genome analysis

    PubMed Central

    Puckelwartz, Megan J.; Pesce, Lorenzo L.; Nelakuditi, Viswateja; Dellefave-Castillo, Lisa; Golbus, Jessica R.; Day, Sharlene M.; Cappola, Thomas P.; Dorn, Gerald W.; Foster, Ian T.; McNally, Elizabeth M.

    2014-01-01

    Motivation: The declining cost of generating DNA sequence is promoting an increase in whole genome sequencing, especially as applied to the human genome. Whole genome analysis requires the alignment and comparison of raw sequence data, and results in a computational bottleneck because of limited ability to analyze multiple genomes simultaneously. Results: We adapted a Cray XE6 supercomputer to achieve the parallelization required for concurrent multiple genome analysis. This approach not only markedly speeds computational time but also results in increased usable sequence per genome. Relying on publicly available software, the Cray XE6 has the capacity to align and call variants on 240 whole genomes in ∼50 h. Multisample variant calling is also accelerated. Availability and implementation: The MegaSeq workflow is designed to harness the size and memory of the Cray XE6, housed at Argonne National Laboratory, for whole genome analysis in a platform designed to better match current and emerging sequencing volume. Contact: emcnally@uchicago.edu PMID:24526712

  12. Whole-Genome Sequencing of Two Bartonella bacilliformis Strains

    PubMed Central

    Guillen, Yolanda; Casadellà, Maria; García-de-la-Guarda, Ruth; Espinoza-Culupú, Abraham; Paredes, Roger; Ruiz, Joaquim

    2016-01-01

    Bartonella bacilliformis is the causative agent of Carrion’s disease, a highly endemic human bartonellosis in Peru. We performed a whole-genome assembly of two B. bacilliformis strains isolated from the blood of infected patients in the acute phase of Carrion’s disease from the Cusco and Piura regions in Peru. PMID:27389274

  13. Whole-Genome Sequencing of Two Bartonella bacilliformis Strains.

    PubMed

    Guillen, Yolanda; Casadellà, Maria; García-de-la-Guarda, Ruth; Espinoza-Culupú, Abraham; Paredes, Roger; Ruiz, Joaquim; Noguera-Julian, Marc

    2016-01-01

    Bartonella bacilliformis is the causative agent of Carrion's disease, a highly endemic human bartonellosis in Peru. We performed a whole-genome assembly of two B. bacilliformis strains isolated from the blood of infected patients in the acute phase of Carrion's disease from the Cusco and Piura regions in Peru. PMID:27389274

  14. Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms.

    PubMed

    Diskin, Sharon J; Li, Mingyao; Hou, Cuiping; Yang, Shuzhang; Glessner, Joseph; Hakonarson, Hakon; Bucan, Maja; Maris, John M; Wang, Kai

    2008-11-01

    Whole-genome microarrays with large-insert clones designed to determine DNA copy number often show variation in hybridization intensity that is related to the genomic position of the clones. We found these 'genomic waves' to be present in Illumina and Affymetrix SNP genotyping arrays, confirming that they are not platform-specific. The causes of genomic waves are not well-understood, and they may prevent accurate inference of copy number variations (CNVs). By measuring DNA concentration for 1444 samples and by genotyping the same sample multiple times with varying DNA quantity, we demonstrated that DNA quantity correlates with the magnitude of waves. We further showed that wavy signal patterns correlate best with GC content, among multiple genomic features considered. To measure the magnitude of waves, we proposed a GC-wave factor (GCWF) measure, which is a reliable predictor of DNA quantity (correlation coefficient = 0.994 based on samples with serial dilution). Finally, we developed a computational approach by fitting regression models with GC content included as a predictor variable, and we show that this approach improves the accuracy of CNV detection. With the wide application of whole-genome SNP genotyping techniques, our wave adjustment method will be important for taking full advantage of genotyped samples for CNV analysis.
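
    The regression-based adjustment described in this record can be illustrated with a minimal sketch. It assumes the simplest possible model, a one-predictor least-squares fit of probe signal on local GC content, with the fitted trend subtracted from each probe. The function names and the toy numbers are hypothetical, and the published method may use a different model form.

```python
from statistics import mean

def fit_gc_trend(gc, signal):
    """Least-squares fit of signal = a + b * gc; returns (a, b)."""
    gc_bar, s_bar = mean(gc), mean(signal)
    sxx = sum((g - gc_bar) ** 2 for g in gc)
    sxy = sum((g - gc_bar) * (s - s_bar) for g, s in zip(gc, signal))
    b = sxy / sxx
    a = s_bar - b * gc_bar
    return a, b

def adjust_for_gc(gc, signal):
    """Return the residual signal after removing the fitted GC trend."""
    a, b = fit_gc_trend(gc, signal)
    return [s - (a + b * g) for g, s in zip(gc, signal)]

# Toy data (hypothetical): signal that is a pure linear function of GC content.
gc_content = [0.3, 0.4, 0.5, 0.6]
wavy_signal = [0.5 * g - 0.2 for g in gc_content]
adjusted = adjust_for_gc(gc_content, wavy_signal)
```

    In this toy case the signal is explained entirely by GC content, so the adjusted values are all approximately zero; on real data, what remains after the adjustment is the part of the signal available for CNV calling.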

  15. Whole genome amplification - Review of applications and advances

    SciTech Connect

    Hawkins, Trevor L.; Detter, J.C.; Richardson, Paul

    2001-11-15

    The concept of whole genome amplification has arisen in the past few years as modifications to the polymerase chain reaction (PCR) have been adapted to replicate regions of genomes that are of biological interest. The applications are many: forensics, embryonic disease diagnosis, bioterrorism genome detection, "immortalization" of clinical samples, microbial diversity, and genotyping. The key question is whether DNA can be replicated a genome at a time without bias or nonrandom distribution of the target. Several papers published in the last year and currently in preparation may lead to the conclusion that whole genome amplification may indeed be possible and therefore open up a new avenue to molecular biology.

  16. Comparative genomic hybridization with single cells after whole genome amplification

    SciTech Connect

    Haddad, B.R.; Baldini, A.; Hughes, M.R.

    1994-09-01

    Conventional karyotype analysis is the ideal way to diagnose chromosomal imbalances. However, it requires cell culture and chromosome preparation. There are instances where a very small number of cells are available for cytogenetic evaluation and chromosomes cannot be obtained. Comparative genomic hybridization (CGH) is a novel molecular cytogenetic technique that provides information about genetic imbalances affecting the genome. The power of this technique lies in its ability to detect genetic imbalances using total genomic DNA. We have previously demonstrated the feasibility of whole genome amplification from single cells for subsequent analysis of multiple genetic loci by PCR. In the present work, we combine whole genome amplification with CGH to detect chromosomal imbalances from small numbers of cells. Both cytogenetically normal and abnormal cells were individually picked by micromanipulation and subjected to whole genome amplification using random oligonucleotide primers. Amplified test and control DNA were differentially labeled by incorporation of digoxigenin or biotin, mixed together and hybridized to normal male metaphase spreads. Hybridization was detected with two fluorochromes, rhodamine-anti-digoxigenin and FITC-avidin. The ratio of intensities of the two fluorochromes along the target chromosomes was analyzed using locally developed computer imaging software. Using the combination of whole genome amplification and CGH, we were able to detect different chromosomal aneuploidies from 30, 20, and 10 cells. It can also be applied to the analysis of fetal cells sorted from maternal circulation, or to tumor cells obtained from needle biopsies or from different body fluids and effusions. Finally, its successful application to single cells will have a great impact on preimplantation diagnosis.
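
    The intensity-ratio analysis mentioned in this record can be sketched as follows. This is an illustration only: the function name is hypothetical, and the gain and loss cutoffs (1.2 and 0.8) are assumed values that do not come from the abstract, which does not describe how its imaging software calls imbalances.

```python
def classify_cgh(test, control, loss=0.8, gain=1.2):
    """Call each chromosomal position from the test/control intensity ratio.

    The cutoffs are illustrative assumptions, not values from the source.
    """
    calls = []
    for t, c in zip(test, control):
        if c <= 0:
            raise ValueError("control intensity must be positive")
        ratio = t / c
        if ratio >= gain:
            calls.append("gain")
        elif ratio <= loss:
            calls.append("loss")
        else:
            calls.append("balanced")
    return calls

# Hypothetical intensities at three positions along a target chromosome.
calls = classify_cgh([150, 100, 50], [100, 100, 100])
```

    A ratio near 1 indicates equal representation of test and control DNA at that position, while a sustained departure from 1 along a chromosome suggests a gain or loss in the test sample.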

  17. Mapping Challenging Mutations by Whole-Genome Sequencing

    PubMed Central

    Smith, Harold E.; Fabritius, Amy S.; Jaramillo-Lambert, Aimee; Golden, Andy

    2016-01-01

    Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homozygous variants). However, such approaches are not entirely suitable for the characterization of a variety of more challenging mutations, such as dominant and semidominant alleles or multigenic traits. Therefore, we have developed strategies for the identification of those classes of mutations, using polymorphism mapping in Caenorhabditis elegans as our model for validation. We also report an alternative approach for mutation identification from traditional recombinant crosses, and a solution to the technical challenge of sequencing sterile or terminally arrested strains where population size is limiting. The methods described herein extend the applicability of whole-genome sequencing to a broader spectrum of mutations, including classes that are difficult to map by traditional means. PMID:26945029

  18. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10^-9) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10^-14). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10^-9) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10^-11). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function. PMID:25743335
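
    The minor allele frequency (MAF) threshold that this record uses to separate common variants (MAF≥1%) from rare ones (MAF<1%) can be made concrete with a short sketch. It assumes diploid genotypes coded as alternate-allele counts (0, 1 or 2); the function names and the example genotypes are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1 or 2) per diploid individual."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1 - alt_freq)

def frequency_class(maf, threshold=0.01):
    """Label a variant using the 1% cutoff quoted in the abstract."""
    return "common" if maf >= threshold else "rare"

# Hypothetical genotypes for four individuals at one variant site.
maf = minor_allele_frequency([0, 1, 2, 0])
label = frequency_class(maf)
```

    With these four individuals, three of the eight alleles are the alternate allele, so the variant is classed as common.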

  19. Priors in Whole-Genome Regression: The Bayesian Alphabet Returns

    PubMed Central

    Gianola, Daniel

    2013-01-01

    Whole-genome enabled prediction of complex traits has received enormous attention in animal and plant breeding and is making inroads into human and even Drosophila genetics. The term “Bayesian alphabet” denotes a growing number of letters of the alphabet used to denote various Bayesian linear regressions that differ in the priors adopted, while sharing the same sampling model. We explore the role of the prior distribution in whole-genome regression models for dissecting complex traits in what is now a standard situation with genomic data where the number of unknown parameters (p) typically exceeds sample size (n). Members of the alphabet aim to confront this overparameterization in various manners, but it is shown here that the prior is always influential, unless n ≫ p. This happens because parameters are not likelihood identified, so Bayesian learning is imperfect. Since inferences are not devoid of the influence of the prior, claims about genetic architecture from these methods should be taken with caution. However, all such procedures may deliver reasonable predictions of complex traits, provided that some parameters (“tuning knobs”) are assessed via a properly conducted cross-validation. It is concluded that members of the alphabet have a place in whole-genome prediction of phenotypes, but have somewhat doubtful inferential value, at least when sample size is such that n ≪ p. PMID:23636739
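
    The point that the prior remains influential when p exceeds n can be seen in a toy case. The sketch below assumes the simplest Gaussian-prior regression (ridge-type shrinkage), in which the posterior mean of the marker effects is (X'X + λI)⁻¹X'y with λ equal to the ratio of residual variance to prior variance. It is restricted to two markers so that it needs no linear algebra library; the function name and data are hypothetical, and it does not reproduce any specific model from the paper.

```python
def ridge_posterior_mean(X, y, lam):
    """Posterior mean of two marker effects under independent N(0, s2_b) priors.

    lam is the ratio of residual variance to prior variance.
    Solves the 2 x 2 system (X'X + lam * I) b = X'y in closed form.
    """
    a11 = sum(row[0] * row[0] for row in X) + lam
    a12 = sum(row[0] * row[1] for row in X)
    a22 = sum(row[1] * row[1] for row in X) + lam
    r1 = sum(row[0] * yi for row, yi in zip(X, y))
    r2 = sum(row[1] * yi for row, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)

# One observation (n = 1) and two markers (p = 2): the data only inform b1 + b2.
X, y = [[1.0, 1.0]], [2.0]
tight_prior = ridge_posterior_mean(X, y, lam=1.0)
loose_prior = ridge_posterior_mean(X, y, lam=0.1)
```

    Here the likelihood identifies only the sum of the two effects, so how that sum is split between the markers, and how strongly it is shrunk toward zero, is determined by the prior rather than by the data. Changing `lam` changes the estimates even though the data are unchanged.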

  20. Mapping Challenging Mutations by Whole-Genome Sequencing.

    PubMed

    Smith, Harold E; Fabritius, Amy S; Jaramillo-Lambert, Aimee; Golden, Andy

    2016-01-01

    Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homozygous variants). However, such approaches are not entirely suitable for the characterization of a variety of more challenging mutations, such as dominant and semidominant alleles or multigenic traits. Therefore, we have developed strategies for the identification of those classes of mutations, using polymorphism mapping in Caenorhabditis elegans as our model for validation. We also report an alternative approach for mutation identification from traditional recombinant crosses, and a solution to the technical challenge of sequencing sterile or terminally arrested strains where population size is limiting. The methods described herein extend the applicability of whole-genome sequencing to a broader spectrum of mutations, including classes that are difficult to map by traditional means.

  1. Whole genome/exome sequencing in mood and psychotic disorders.

    PubMed

    Kato, Tadafumi

    2015-02-01

    Recent developments in DNA sequencing technologies have allowed for genetic studies using whole genome or exome analysis, and these have been applied in the study of mood and psychotic disorders, including bipolar disorder, depression, schizophrenia, and schizoaffective disorder. In this review, the current situation, recent findings, methodological problems, and future directions of whole genome/exome analysis studies of these disorders are summarized. Whole genome/exome studies of bipolar disorder have included pedigree analysis and case-control studies, demonstrating the role of previously implicated pathways, such as calcium signaling, cyclic adenosine monophosphate response element binding protein (CREB) signaling, and potassium channels. Extensive analysis of trio families and case-control studies showed that de novo mutations play a role in the genetic architecture of schizophrenia and indicated that mutations in several molecular pathways, including chromatin regulation, activity-regulated cytoskeleton, post-synaptic density, N-methyl-D-aspartate receptor, and targets of fragile X mental retardation protein, are associated with this disorder. Depression is a heterogeneous group of diseases and studies using exome analysis have been conducted to identify rare mutations causing Mendelian diseases that accompany depression. In the near future, clarification of the genetic architecture of bipolar disorder and schizophrenia is expected. Identification of causative mutations using these new technologies will facilitate neurobiological studies of these disorders. PMID:25319632

  2. Whole-genome sequence-based analysis of thyroid function.

    PubMed

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G

    2015-03-06

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10(-9)) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10(-14)). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10(-9)) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10(-11)). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.

  3. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    SciTech Connect

    Reslewic, S.; Zhou, S.; Place, M.; Zhang, Y.; Briska, A.; Goldstein, S.; Churas, C.; Runnheim, R.; Forrest, D.; Lim, A.; Lapidus, A.; Han, C. S.; Roberts, G. P.; Schwartz, D. C.

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  4. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    SciTech Connect

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  5. Whole-genome transcriptional analysis of heavy metal stresses in Caulobacter crescentus

    SciTech Connect

    Hu, Ping; Brodie, Eoin L.; Suzuki, Yohey; McAdams, Harley H.; Andersen, Gary L.

    2005-09-21

    The bacterium Caulobacter crescentus and related stalk bacterial species are known for their distinctive ability to live in low nutrient environments, a characteristic of most heavy metal contaminated sites. Caulobacter crescentus is a model organism for studying cell cycle regulation with well developed genetics. We have identified the pathways responding to heavy metal toxicity in C. crescentus to provide insights for possible application of Caulobacter to environmental restoration. We exposed C. crescentus cells to four heavy metals (chromium, cadmium, selenium and uranium) and analyzed genome wide transcriptional activities post exposure using an Affymetrix GeneChip microarray. C. crescentus showed surprisingly high tolerance to uranium, a possible mechanism for which may be formation of extracellular calcium-uranium-phosphate precipitates. The principal response to these metals was protection against oxidative stress (up-regulation of manganese-dependent superoxide dismutase, sodA). Glutathione S-transferase, thioredoxin, glutaredoxins and DNA repair enzymes responded most strongly to cadmium and chromate. The cadmium and chromium stress response also focused on reducing the intracellular metal concentration, with multiple efflux pumps employed to remove cadmium while a sulfate transporter was down-regulated to reduce non-specific uptake of chromium. Membrane proteins were also up-regulated in response to most of the metals tested. A two-component signal transduction system involved in the uranium response was identified. Several differentially regulated transcripts from regions previously not known to encode proteins were identified, demonstrating the advantage of evaluating the transcriptome using whole genome microarrays.

  6. Whole genome assessment of the retinal response to diabetes reveals a progressive neurovascular inflammatory response

    PubMed Central

    Brucklacher, Robert M; Patel, Kruti M; VanGuilder, Heather D; Bixler, Georgina V; Barber, Alistair J; Antonetti, David A; Lin, Cheng-Mao; LaNoue, Kathryn F; Gardner, Thomas W; Bronson, Sarah K; Freeman, Willard M

    2008-01-01

    Background Despite advances in the understanding of diabetic retinopathy, the nature and time course of molecular changes in the retina with diabetes are incompletely described. This study characterized the functional and molecular phenotype of the retina with increasing durations of diabetes. Results Using the streptozotocin-induced rat model of diabetes, levels of retinal permeability, caspase activity, and gene expression were examined after 1 and 3 months of diabetes. Gene expression changes were identified by whole genome microarray and confirmed by qPCR in the same set of animals as used in the microarray analyses and subsequently validated in independent sets of animals. Increased levels of vascular permeability and caspase-3 activity were observed at 3 months of diabetes, but not 1 month. Significantly more and larger magnitude gene expression changes were observed after 3 months than after 1 month of diabetes. Quantitative PCR validation of selected genes related to inflammation, microvasculature and neuronal function confirmed gene expression changes in multiple independent sets of animals. Conclusion These changes in permeability, apoptosis, and gene expression provide further evidence of progressive retinal malfunction with increasing duration of diabetes. The specific gene expression changes confirmed in multiple sets of animals indicate that pro-inflammatory, anti-vascular barrier, and neurodegenerative changes occur in tandem with functional increases in apoptosis and vascular permeability. These responses are shared with the clinically documented inflammatory response in diabetic retinopathy suggesting that this model may be used to test anti-inflammatory therapeutics. PMID:18554398

  7. Whole-genome amplification of single-cell genomes for next-generation sequencing.

    PubMed

    Korfhage, Christian; Fisch, Evelyn; Fricke, Evelyn; Baedker, Silke; Loeffert, Dirk

    2013-10-11

    DNA sequence analysis and genotyping of biological samples using next-generation sequencing (NGS), microarrays, or real-time PCR is often limited by the small amount of sample available. A single cell contains only one to four copies of the genomic DNA, depending on the organism (haploid or diploid organism) and the cell-cycle phase. The DNA content of a single cell ranges from a few femtograms in bacteria to picograms in mammalia. In contrast, a deep analysis of the genome currently requires a few hundred nanograms up to micrograms of genomic DNA for library formation necessary for NGS sequencing or labeling protocols (e.g., microarrays). Consequently, accurate whole-genome amplification (WGA) of single-cell DNA is required for reliable genetic analysis (e.g., NGS) and is particularly important when genomic DNA is limited. The use of single-cell WGA has enabled the analysis of genomic heterogeneity of individual cells (e.g., somatic genomic variation in tumor cells). This unit describes how the genome of single cells can be used for WGA for further genomic studies, such as NGS. Recommendations for isolation of single cells are given and common sources of errors are discussed.

  8. Whole genome comparison of donor and cloned dogs

    PubMed Central

    Kim, Hak-Min; Cho, Yun Sung; Kim, Hyunmin; Jho, Sungwoong; Son, Bongjun; Choi, Joung Yoon; Kim, Sangsoo; Lee, Byeong Chun; Bhak, Jong; Jang, Goo

    2013-01-01

    Cloning is a process that produces genetically identical organisms. However, the genomic degree of genetic resemblance in clones needs to be determined. In this report, the genomes of a cloned dog and its donor were compared. Compared with a human monozygotic twin, the genome of the cloned dog showed little difference from the genome of the nuclear donor dog in terms of single nucleotide variations, chromosomal instability, and telomere lengths. These findings suggest that cloning by somatic cell nuclear transfer produced an almost identical genome. The whole genome sequence data of donor and cloned dogs can provide a resource for further investigations on epigenetic contributions in phenotypic differences. PMID:24141358

  9. The potential of whole genome NGS for infectious disease diagnosis.

    PubMed

    Lecuit, Marc; Eloit, Marc

    2015-01-01

    Non-targeted identification of microbes is now possible directly in biological samples, based on whole-genome-NGS (WG-NGS) techniques that allow deep sequencing of nucleic acids, data mining and sorting out of sequences of pathogens without any a priori hypothesis. WG-NGS was first only used as a research tool due to its cost, complexity and lack of standardization. Recent improvements in sample preparation and bioinformatics pipelines and decrease in cost now allow actionable diagnostics in patients. The potency and limits of WG-NGS and possible future indications are discussed here. WG-NGS will likely soon become a standard procedure in microbiological diagnosis.

  10. Whole genome comparison of donor and cloned dogs.

    PubMed

    Kim, Hak-Min; Cho, Yun Sung; Kim, Hyunmin; Jho, Sungwoong; Son, Bongjun; Choi, Joung Yoon; Kim, Sangsoo; Lee, Byeong Chun; Bhak, Jong; Jang, Goo

    2013-01-01

    Cloning is a process that produces genetically identical organisms. However, the genomic degree of genetic resemblance in clones needs to be determined. In this report, the genomes of a cloned dog and its donor were compared. Compared with a human monozygotic twin, the genome of the cloned dog showed little difference from the genome of the nuclear donor dog in terms of single nucleotide variations, chromosomal instability, and telomere lengths. These findings suggest that cloning by somatic cell nuclear transfer produced an almost identical genome. The whole genome sequence data of donor and cloned dogs can provide a resource for further investigations on epigenetic contributions in phenotypic differences. PMID:24141358

  11. Whole genome amplification using single-primer PCR.

    PubMed

    Lao, Kaiqin; Xu, Nan Lan; Straus, Neil A

    2008-03-01

    Comprehensive genomic molecular analyses require relatively large DNA amounts that are often not available from forensic, clinical and other crucial biological samples. Numerous methods to amplify the whole genome have been proposed for cancer, forensic and taxonomic research. Unfortunately, when using truly random primers for the initial priming step, all of these procedures suffer from high background problems for sub-nanogram quantities of input DNA. Here we report an approach to eliminate this problem for PCR-based methods even at levels of DNA approaching that of a single cell.

  12. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    SciTech Connect

    Schutzer S. E.; Dunn J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  13. Performance Evaluation of NIPT in Detection of Chromosomal Copy Number Variants Using Low-Coverage Whole-Genome Sequencing of Plasma DNA

    PubMed Central

    Lin, Linhua; Yin, Xuyang; Wang, Jun; Chen, Dayang; Chen, Fang; Jiang, Hui; Ren, Jinghui; Wang, Wei

    2016-01-01

    Objectives The aim of this study was to assess the performance of noninvasive prenatal testing (NIPT) for fetal copy number variants (CNVs) in clinical samples, using a whole-genome sequencing method. Method A total of 919 archived maternal plasma samples with karyotyping/microarray results, including 33 CNV samples and 886 normal samples from September 1, 2011 to May 31, 2013, were enrolled in this study. The samples were randomly rearranged and blindly sequenced by low-coverage (about 7M reads) whole-genome sequencing of plasma DNA. Fetal CNVs were detected by Fetal Copy-number Analysis through Maternal Plasma Sequencing (FCAPS) and compared to the karyotyping/microarray results. Sensitivity and specificity were evaluated. Results 33 samples with deletions/duplications ranging from 1 to 129 Mb were detected with CNV size and location consistent with the karyotyping/microarray results in the study. Ten false positive results and two false negative results were obtained. The sensitivity and specificity of detecting deletions/duplications were 84.21% and 98.42%, respectively. Conclusion Whole-genome sequencing-based NIPT has high performance in detecting genome-wide CNVs, in particular >10Mb CNVs using the current FCAPS algorithm. It is possible to implement the current method in NIPT to prenatally screen for fetal CNVs. PMID:27415003

  14. Whole genome scanning as a cytogenetic tool in hematologic malignancies

    PubMed Central

    Mufti, Ghulam J.

    2008-01-01

    Over the years, methods of cytogenetic analysis evolved and became part of routine laboratory testing, providing valuable diagnostic and prognostic information in hematologic disorders. Karyotypic aberrations contribute to the understanding of the molecular pathogenesis of disease and thereby to rational application of therapeutic modalities. Most of the progress in this field stems from the application of metaphase cytogenetics (MC), but recently, novel molecular technologies have been introduced that complement MC and overcome many of the limitations of traditional cytogenetics, including a need for cell culture. Whole genome scanning using comparative genomic hybridization and single nucleotide polymorphism arrays (CGH-A; SNP-A) can be used for analysis of somatic or clonal unbalanced chromosomal defects. In SNP-A, the combination of copy number detection and genotyping enables diagnosis of copy-neutral loss of heterozygosity, a lesion that cannot be detected using MC but may have important pathogenetic implications. Overall, whole genome scanning arrays, despite the drawback of an inability to detect balanced translocations, allow for discovery of chromosomal defects in a higher proportion of patients with hematologic malignancies. Newly detected chromosomal aberrations, including somatic uniparental disomy, may lead to more precise prognostic schemes in many diseases. PMID:18505780

  15. Whole-genome haplotyping approaches and genomic medicine.

    PubMed

    Glusman, Gustavo; Cox, Hannah C; Roach, Jared C

    2014-01-01

    Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that is rarely resolved into its constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are enabling longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are enabling the sequencing of whole-chromosome genetic haplotypes. Hybrid approaches combining high-throughput short-read assembly with strategic approaches that enable physical or virtual binning of reads into haplotypes are enabling multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease. PMID:25473435

  16. Whole genome sequence-based serogrouping of Listeria monocytogenes isolates.

    PubMed

    Hyden, Patrick; Pietzka, Ariane; Lennkh, Anna; Murer, Andrea; Springer, Burkhard; Blaschitz, Marion; Indra, Alexander; Huhulescu, Steliana; Allerberger, Franz; Ruppitsch, Werner; Sensen, Christoph W

    2016-10-10

    Whole genome sequencing (WGS) is currently becoming the method of choice for characterization of Listeria monocytogenes isolates in national reference laboratories (NRLs). WGS is superior with regards to accuracy, resolution and analysis speed in comparison to several other methods including serotyping, PCR, pulsed field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable number tandem repeat analysis (MLVA), and multivirulence-locus sequence typing (MVLST), which have been used thus far for the characterization of bacterial isolates (and are still important tools in reference laboratories today) to control and prevent listeriosis, one of the major sources of foodborne diseases for humans. Backward compatibility of WGS to former methods can be maintained by extraction of the respective information from WGS data. Serotyping was the first subtyping method for L. monocytogenes capable of differentiating 12 serovars and national reference laboratories still perform serotyping and PCR-based serogrouping as a first level classification method for Listeria monocytogenes surveillance. Whole genome sequence based core genome MLST analysis of a L. monocytogenes collection comprising 172 isolates spanning all 12 serotypes was performed for serogroup determination. These isolates clustered according to their serotypes and it was possible to group them either into the IIa, IIc, IVb or IIb clusters, respectively, which were generated by minimum spanning tree (MST) and neighbor joining (NJ) tree data analysis, demonstrating the power of the new approach.

  17. Allelic imbalance analysis by high-density single-nucleotide polymorphic allele (SNP) array with whole genome amplified DNA

    PubMed Central

    Wong, Kwong-Kwok; Tsang, Yvonne T. M.; Shen, Jianhe; Cheng, Rita S.; Chang, Yi-Mieng; Man, Tsz-Kwong; Lau, Ching C.

    2004-01-01

    Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosarcoma tissues and patient-matched blood. SNP calls were then generated by Affymetrix® GeneChip® DNA Analysis Software. In two osteosarcoma cases, using unamplified DNA, we identified 793 and 1070 SNP loci with allelic imbalance, respectively. In a parallel experiment with amplified DNA, 78% and 83% of these SNP loci with allelic imbalance were detected. The average false-positive rate is 13.8%. Furthermore, using the Affymetrix® GeneChip® Chromosome Copy Number Tool to analyze the SNP array data, we were able to detect identical chromosomal regions with gain or loss in both amplified and unamplified DNA at cytoband resolution. PMID:15148342

  18. Polyploidy in fungi: evolution after whole-genome duplication

    PubMed Central

    Albertin, Warren; Marullo, Philippe

    2012-01-01

    Polyploidy is a major evolutionary process in eukaryotes, particularly in plants and, to a lesser extent, in animals, wherein several past and recent whole-genome duplication events have been described. Surprisingly, the incidence of polyploidy in other eukaryote kingdoms, particularly within fungi, remained largely disregarded by the scientific community working on the evolutionary consequences of polyploidy. Recent studies have significantly increased our knowledge of the occurrence and evolutionary significance of fungal polyploidy. The ecological, structural and functional consequences of polyploidy in fungi are reviewed here and compared with the knowledge acquired with conventional plant and animal models. In particular, the genus Saccharomyces emerges as a relevant model for polyploid studies, in addition to plant and animal models. PMID:22492065

  19. Origin of the Yeast Whole-Genome Duplication

    PubMed Central

    Wolfe, Kenneth H.

    2015-01-01

    Whole-genome duplications (WGDs) are rare evolutionary events with profound consequences. They double an organism’s genetic content, immediately creating a reproductive barrier between it and its ancestors and providing raw material for the divergence of gene functions between paralogs. Almost all eukaryotic genome sequences bear evidence of ancient WGDs, but the causes of these events and the timing of intermediate steps have been difficult to discern. One of the best-characterized WGDs occurred in the lineage leading to the baker’s yeast Saccharomyces cerevisiae. Marcet-Houben and Gabaldón now show that, rather than simply doubling the DNA of a single ancestor, the yeast WGD likely involved mating between two different ancestral species followed by a doubling of the genome to restore fertility. PMID:26252643

  20. Whole genome SNP scanning of snow sheep (Ovis nivicola).

    PubMed

    Deniskova, T E; Okhlopkov, I M; Sermyagin, A A; Gladyr', E A; Bagirov, V A; Sölkner, J; Mamaev, N V; Brem, G; Zinov'eva, N A

    2016-07-01

    This is the first report of whole genome SNP scanning of snow sheep (Ovis nivicola). Samples of snow sheep (n = 18) were collected in six different regions of the Republic of Sakha (Yakutia) from 64° to 71° N. For SNP genotyping, we applied the Ovine 50K SNP BeadChip (Illumina, United States), designed for domestic sheep. The total number of genotyped SNPs (call rate 90%) was 47796 (88.1% of total SNPs), of which 1006 SNPs were polymorphic (2.1%). Principal component analysis (PCA) showed clear differentiation within the species O. nivicola: the studied individuals were distributed among five distinct arrays corresponding to the geographical locations of sampling points. Our results demonstrate that the DNA chip designed for domestic sheep can be successfully used to study the allele pool and the genetic structure of snow sheep populations. PMID:27599514
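The percentages quoted in the snow sheep abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract (47796 genotyped SNPs, 1006 polymorphic, 88.1% of total SNPs); the variable and function names are illustrative, and the implied chip size is an estimate derived from the rounded 88.1% figure, not a number taken from the record.

```python
# Figures quoted in the abstract.
genotyped = 47796        # SNPs passing the call-rate filter
polymorphic = 1006       # SNPs that were polymorphic in snow sheep
genotyped_share = 0.881  # genotyped SNPs as a fraction of all SNPs on the chip


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of genotyped SNPs that were polymorphic (abstract reports 2.1%).
polymorphic_pct = percent(polymorphic, genotyped)

# Total number of SNPs implied by the rounded 88.1% figure (an estimate only).
implied_total = genotyped / genotyped_share

print(f"polymorphic share: {polymorphic_pct:.1f}%")
print(f"implied SNPs on chip: about {implied_total:.0f}")
```

The polymorphic share reproduces the 2.1% stated in the abstract, and the implied total is a little over 54,000 SNPs.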

  1. Whole genome sequencing in clinical and public health microbiology.

    PubMed

    Kwong, J C; McCallum, N; Sintchenko, V; Howden, B P

    2015-04-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure.

  2. Whole genome duplications in plants: an overview from Arabidopsis.

    PubMed

    del Pozo, Juan Carlos; Ramirez-Parra, Elena

    2015-12-01

    Polyploidy is a common event in plants that involves the acquisition of more than two complete sets of chromosomes. Allopolyploidy originates from interspecies hybrids while autopolyploidy originates from intraspecies whole genome duplication (WGD) events. In spite of inconveniences derived from chromosomal rearrangement during polyploidization, natural polyploid plant species often exhibit improved growth vigour and adaptation to adverse environments, conferring evolutionary advantages. These advantages have also been incorporated into crop breeding programmes. Many tetraploid crops show increased stress tolerance, although the molecular mechanisms underlying these different adaptation abilities are poorly known. Understanding the physiological, cellular, and molecular mechanisms coupled to WGD, in both allo- and autopolyploidy, is a major challenge. Over the last few years, several studies, many of them in Arabidopsis, have shed light on the basis of genetic, genomic, and epigenomic changes linked to WGD. In this review we summarize and discuss the latest advances made in Arabidopsis polyploidy, but also in other agronomic plant species.

  3. Whole-genome sequencing reveals oncogenic mutations in mycosis fungoides.

    PubMed

    McGirt, Laura Y; Jia, Peilin; Baerenwald, Devin A; Duszynski, Robert J; Dahlman, Kimberly B; Zic, John A; Zwerner, Jeffrey P; Hucks, Donald; Dave, Utpal; Zhao, Zhongming; Eischen, Christine M

    2015-07-23

    The pathogenesis of mycosis fungoides (MF), the most common cutaneous T-cell lymphoma (CTCL), is unknown. Although genetic alterations have been identified, none are considered consistently causative in MF. To identify potential drivers of MF, we performed whole-genome sequencing of MF tumors and matched normal skin. Targeted ultra-deep sequencing of MF samples and exome sequencing of CTCL cell lines were also performed. Multiple mutations were identified that affected the same pathways, including epigenetic, cell-fate regulation, and cytokine signaling, in MF tumors and CTCL cell lines. Specifically, interleukin-2 signaling pathway mutations, including activating Janus kinase 3 (JAK3) mutations, were detected. Treatment with a JAK3 inhibitor significantly reduced CTCL cell survival. Additionally, the mutation data identified 2 other potential contributing factors to MF, ultraviolet light, and a polymorphism in the tumor suppressor p53 (TP53). Therefore, genetic alterations in specific pathways in MF were identified that may be viable, effective new targets for treatment.

  4. Whole-genome CNV analysis: advances in computational approaches

    PubMed Central

    Pirooznia, Mehdi; Goes, Fernando S.; Zandi, Peter P.

    2015-01-01

    Accumulating evidence indicates that DNA copy number variation (CNV) is likely to make a significant contribution to human diversity and also play an important role in disease susceptibility. Recent advances in genome sequencing technologies have enabled the characterization of a variety of genomic features, including CNVs. This has led to the development of several bioinformatics approaches to detect CNVs from next-generation sequencing data. Here, we review recent advances in CNV detection from whole genome sequencing. We discuss the informatics approaches and current computational tools that have been developed as well as their strengths and limitations. This review will assist researchers and analysts in choosing the most suitable tools for CNV analysis as well as provide suggestions for new directions in future development. PMID:25918519

  5. Origin of the Yeast Whole-Genome Duplication.

    PubMed

    Wolfe, Kenneth H

    2015-08-01

    Whole-genome duplications (WGDs) are rare evolutionary events with profound consequences. They double an organism's genetic content, immediately creating a reproductive barrier between it and its ancestors and providing raw material for the divergence of gene functions between paralogs. Almost all eukaryotic genome sequences bear evidence of ancient WGDs, but the causes of these events and the timing of intermediate steps have been difficult to discern. One of the best-characterized WGDs occurred in the lineage leading to the baker's yeast Saccharomyces cerevisiae. Marcet-Houben and Gabaldón now show that, rather than simply doubling the DNA of a single ancestor, the yeast WGD likely involved mating between two different ancestral species followed by a doubling of the genome to restore fertility. PMID:26252643

  6. Whole genome sequencing in clinical and public health microbiology

    PubMed Central

    Kwong, J. C.; McCallum, N.; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers have made genomics feasible in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laboratories. As WGS represents the pinnacle for strain characterisation and epidemiological analyses, it is likely to replace traditional typing methods, resistance gene detection and other sequence-based investigations (e.g., 16S rDNA PCR) in the near future. Although genomic technologies are rapidly evolving, widespread implementation in clinical and public health microbiology laboratories is limited by the need for effective semi-automated pipelines, standardised quality control and data interpretation, bioinformatics expertise, and infrastructure. PMID:25730631

  7. Whole Genome Phylogeny of Bacillus by Feature Frequency Profiles (FFP)

    PubMed Central

    Wang, Aisuo; Ash, Gavin J.

    2015-01-01

    Fifty complete Bacillus genome sequences and associated plasmids were compared using the “feature frequency profile” (FFP) method. The resulting whole-genome phylogeny supports the placement of three Bacillus species (B. thuringiensis, B. anthracis and B. cereus) as a single clade. The monophyletic status of B. anthracis was strongly supported by the analysis. FFP proved to be more effective in inferring the phylogeny of Bacillus than methods based on analyses of single gene sequences [16S rRNA gene, GyrB (gyrase subunit B) and AroE (shikimate-5-dehydrogenase)]. The findings of FFP analysis were verified using kSNP v2 (alignment-free sequence analysis method) and Harvest suite (core genome sequence alignment method).
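    The abstract does not describe how an FFP comparison is computed. As a rough sketch of the general alignment-free idea, each genome can be reduced to a normalised k-mer ("feature") frequency profile, and two profiles can be compared with a divergence measure such as the Jensen-Shannon divergence. The function names and the choice of k below are assumptions for illustration; the authors' pipeline may differ, for example in feature length selection or in how reverse complements are handled.

```python
from collections import Counter
from math import log2


def ffp(sequence, k):
    """Normalised k-mer frequency profile of a sequence (a dict kmer -> frequency)."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles with no
    shared features.
    """
    divergence = 0.0
    for kmer in set(p) | set(q):
        pi = p.get(kmer, 0.0)
        qi = q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)  # mixture frequency, positive for every kmer visited
        if pi > 0:
            divergence += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            divergence += 0.5 * qi * log2(qi / mi)
    return divergence


if __name__ == "__main__":
    # Toy sequences only; real input would be whole genomes.
    a = ffp("ACGTACGTACGT", 3)
    b = ffp("ACGTTCGTACGA", 3)
    print(js_divergence(a, b))
```

    A matrix of pairwise divergences computed this way can then be passed to a distance-based tree-building method such as neighbour joining.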

  8. Concurrent Whole-Genome Haplotyping and Copy-Number Profiling of Single Cells

    PubMed Central

    Zamani Esteki, Masoud; Dimitriadou, Eftychia; Mateiu, Ligia; Melotte, Cindy; Van der Aa, Niels; Kumar, Parveen; Das, Rakhi; Theunis, Koen; Cheng, Jiqiu; Legius, Eric; Moreau, Yves; Debrock, Sophie; D’Hooghe, Thomas; Verdyck, Pieter; De Rycke, Martine; Sermon, Karen; Vermeesch, Joris R.; Voet, Thierry

    2015-01-01

    Methods for haplotyping and DNA copy-number typing of single cells are paramount for studying genomic heterogeneity and enabling genetic diagnosis. Before analyzing the DNA of a single cell by microarray or next-generation sequencing, a whole-genome amplification (WGA) process is required, but it substantially distorts the frequency and composition of the cell’s alleles. As a consequence, haplotyping methods suffer from error-prone discrete SNP genotypes (AA, AB, BB) and DNA copy-number profiling remains difficult because true DNA copy-number aberrations have to be discriminated from WGA artifacts. Here, we developed a single-cell genome analysis method that reconstructs genome-wide haplotype architectures as well as the copy-number and segregational origin of those haplotypes by employing phased parental genotypes and deciphering WGA-distorted SNP B-allele fractions via a process we coin haplarithmisis. We demonstrate that the method can be applied as a generic method for preimplantation genetic diagnosis on single cells biopsied from human embryos, enabling diagnosis of disease alleles genome wide as well as numerical and structural chromosomal anomalies. Moreover, meiotic segregation errors can be distinguished from mitotic ones. PMID:25983246

  9. Whole genome response in guinea pigs infected with the high virulence strain Mycobacterium tuberculosis TT372

    PubMed Central

    Aiyaz, Mohamed; Bipin, Chand; Pantulwar, Vinay; Mugasimangalam, Raja; Shanley, Crystal A.; Ordway, Diane J; Orme, Ian M.

    2014-01-01

    In this study we conducted a microarray-based whole genomic analysis of gene expression in the lungs after exposure of guinea pigs to a low dose aerosol of the Atypical Beijing Western Cape TT372 strain of Mycobacterium tuberculosis, after harvesting lung tissues three weeks after infection at a time that effector immunity is starting to peak. The infection resulted in a very large up-regulation of multiple genes at this time, particularly in the context of a “chemokine storm” in the lungs. Overall gene expression was considerably reduced in animals that had been vaccinated with BCG two months earlier, but in both cases strong signatures featuring gamma interferon [IFNγ] and tumor necrosis factor [TNFα] were observed indicating the potent TH1 response in these animals. Even though their effects are not seen until later in the infection, even at this early time point gene expression patterns associated with the potential emergence of regulatory T cells were observed. Genes involving lung repair, response to oxidative stress, and cell trafficking were strongly expressed, but, interestingly, these gene patterns differed substantially between the infected and vaccinated/infected groups of animals. Given the importance of this species as a relevant and cost-effective small animal model of tuberculosis, this approach has the potential to provide new information regarding the effects of vaccination on control of the disease process. PMID:25621360

  10. Whole genome response in guinea pigs infected with the high virulence strain Mycobacterium tuberculosis TT372.

    PubMed

    Aiyaz, Mohamed; Bipin, Chand; Pantulwar, Vinay; Mugasimangalam, Raja; Shanley, Crystal A; Ordway, Diane J; Orme, Ian M

    2014-12-01

    In this study we conducted a microarray-based whole genomic analysis of gene expression in the lungs after exposure of guinea pigs to a low dose aerosol of the Atypical Beijing Western Cape TT372 strain of Mycobacterium tuberculosis, after harvesting lung tissues three weeks after infection at a time that effector immunity is starting to peak. The infection resulted in a very large up-regulation of multiple genes at this time, particularly in the context of a "chemokine storm" in the lungs. Overall gene expression was considerably reduced in animals that had been vaccinated with BCG two months earlier, but in both cases strong signatures featuring gamma interferon [IFNγ] and tumor necrosis factor [TNFα] were observed indicating the potent TH1 response in these animals. Even though their effects are not seen until later in the infection, even at this early time point gene expression patterns associated with the potential emergence of regulatory T cells were observed. Genes involving lung repair, response to oxidative stress, and cell trafficking were strongly expressed, but, interestingly, these gene patterns differed substantially between the infected and vaccinated/infected groups of animals. Given the importance of this species as a relevant and cost-effective small animal model of tuberculosis, this approach has the potential to provide new information regarding the effects of vaccination on control of the disease process.

  11. DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Nguyen, C.; Gidrol, X.

    Genomics has revolutionised biological and biomedical research. This revolution was predictable on the basis of its two driving forces: the ever increasing availability of genome sequences and the development of new technology able to exploit them. Up until now, technical limitations meant that molecular biology could only analyse one or two parameters per experiment, providing relatively little information compared with the great complexity of the systems under investigation. This gene by gene approach is inadequate to understand biological systems containing several thousand genes. It is essential to have an overall view of the DNA, RNA, and relevant proteins. A simple inventory of the genome is not sufficient to understand the functions of the genes, or indeed the way that cells and organisms work. For this purpose, functional studies based on whole genomes are needed. Among these new large-scale methods of molecular analysis, DNA microarrays provide a way of studying the genome and the transcriptome. The idea of integrating a large amount of data derived from a support with very small area has led biologists to call these chips, borrowing the term from the microelectronics industry. At the beginning of the 1990s, the development of DNA chips on nylon membranes [1, 2], then on glass [3] and silicon [4] supports, made it possible for the first time to carry out simultaneous measurements of the equilibrium concentration of all the messenger RNA (mRNA) or transcribed RNA in a cell. These microarrays offer a wide range of applications, in both fundamental and clinical research, providing a method for genome-wide characterisation of changes occurring within a cell or tissue, as for example in polymorphism studies, detection of mutations, and quantitative assays of gene copies. With regard to the transcriptome, it provides a way of characterising differentially expressed genes, profiling given biological states, and identifying regulatory channels.

  12. Physical map-assisted whole-genome shotgun sequence assemblies

    PubMed Central

    Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C.; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.; Wilson, Richard K.; Jones, Steven J.M.

    2006-01-01

    We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the layout phase of WGS assemblies. This process is facilitated by FASSI, a stand-alone application that calculates BAC end and BAC overlap length constraints from clone fingerprint map contigs created by the FPC package. FASSI is designed to work with the assembly tool PCAP, but its output can be formatted to work with other WGS assembly algorithms able to use length constraints for individual clones. The FASSI method is simple to implement, potentially cost-effective, and has resulted in the increase of scaffold contiguity for both the Drosophila melanogaster and Cryptococcus gattii genomes when compared to a control assembly without map-derived constraints. A 6.5-fold coverage draft DNA sequence of the Pan troglodytes (chimpanzee) genome was assembled using map-derived constraints and resulted in a 26.1% increase in scaffold contiguity. PMID:16741162

  13. Current Developments in Prokaryotic Single Cell Whole Genome Amplification

    SciTech Connect

    Goudeau, Danielle; Nath, Nandita; Ciobanu, Doina; Cheng, Jan-Fang; Malmstrom, Rex

    2014-03-14

    Our approach to prokaryotic single-cell Whole Genome Amplification at the JGI continues to evolve. To increase both the quality and number of single-cell genomes produced, we explore all aspects of the process from cell sorting to sequencing. For example, we now utilize specialized reagents, acoustic liquid handling, and reduced reaction volumes to eliminate non-target DNA contamination in WGA reactions. More specifically, we use a cleaner commercial WGA kit from Qiagen that employs a UV decontamination procedure initially developed at the JGI, and we use the Labcyte Echo for tip-less liquid transfer to set up 2 µL reactions. Acoustic liquid handling also dramatically reduces reagent costs. In addition, we are exploring new cell lysis methods including treatment with Proteinase K, lysozyme, and other detergents, in order to complement standard alkaline lysis and allow for more efficient disruption of a wider range of cells. Incomplete lysis represents a major hurdle for WGA on some environmental samples, especially rhizosphere, peatland, and other soils. Finding effective lysis strategies that are also compatible with WGA is challenging, and we are currently assessing the impact of various strategies on genome recovery.

  14. Whole-genome sequencing of nine esophageal adenocarcinoma cell lines.

    PubMed

    Contino, Gianmarco; Eldridge, Matthew D; Secrier, Maria; Bower, Lawrence; Fels Elliott, Rachael; Weaver, Jamie; Lynch, Andy G; Edwards, Paul A W; Fitzgerald, Rebecca C

    2016-01-01

    Esophageal adenocarcinoma (EAC) is highly mutated and molecularly heterogeneous. The number of cell lines available for study is limited and their genome has been only partially characterized. The availability of an accurate annotation of their mutational landscape is crucial for accurate experimental design and correct interpretation of genotype-phenotype findings. We performed high coverage, paired end whole genome sequencing on eight EAC cell lines (ESO26, ESO51, FLO-1, JH-EsoAd1, OACM5.1 C, OACP4 C, OE33, SK-GT-4), all verified against original patient material, and one esophageal high grade dysplasia cell line, CP-D. We have made available the aligned sequence data and report single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number alterations, identified by comparison with the human reference genome and known single nucleotide polymorphisms (SNPs). We compare these putative mutations to mutations found in primary tissue EAC samples, to inform the use of these cell lines as a model of EAC. PMID:27594985

  15. Signatures of selection in tilapia revealed by whole genome resequencing.

    PubMed

    Xia, Jun Hong; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Wan, Zi Yi; Li, Jiale; Lin, Haoran; Yue, Gen Hua

    2015-09-16

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can help accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10 to 100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia.

  16. Whole genomes redefine the mutational landscape of pancreatic cancer.

    PubMed

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K; Kassahn, Karin S; Bailey, Peter; Johns, Amber L; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C J; Robertson, Alan J; Fadlullah, Muhammad Z H; Bruxner, Tim J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J; Fink, J Lynn; Holmes, Oliver; Kazakoff, Stephen H; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Lee, Hong C; Jones, Marc D; Nagrial, Adnan M; Humphris, Jeremy; Chantrill, Lorraine A; Chin, Venessa; Steinmann, Angela M; Mawson, Amanda; Humphrey, Emily S; Colvin, Emily K; Chou, Angela; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Pettitt, Jessica A; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B; Graham, Janet S; Niclou, Simone P; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A; Gill, Anthony J; Eshleman, James R; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A; Pearson, John V; Biankin, Andrew V; Grimmond, Sean M

    2015-02-26

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded. PMID:25719666

  17. Information recovery from low coverage whole-genome bisulfite sequencing.

    PubMed

    Libertini, Emanuele; Heath, Simon C; Hamoudi, Rifat A; Gut, Marta; Ziller, Michael J; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G; Frontini, Mattia; Ouwehand, Willem H; Meissner, Alexander; Gut, Ivo G; Beck, Stephan

    2016-01-01

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future. PMID:27346250

  18. Information recovery from low coverage whole-genome bisulfite sequencing

    PubMed Central

    Libertini, Emanuele; Heath, Simon C.; Hamoudi, Rifat A.; Gut, Marta; Ziller, Michael J.; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G.; Frontini, Mattia; Ouwehand, Willem H.; Meissner, Alexander; Gut, Ivo G.; Beck, Stephan

    2016-01-01

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future. PMID:27346250

  19. Cryptococcus gattii in the Age of Whole-Genome Sequencing.

    PubMed

    Meyer, Wieland

    2015-11-17

    Cryptococcus gattii, the sister species of Cryptococcus neoformans, is an emerging pathogen which gained importance in connection with the ongoing cryptococcosis outbreak on Vancouver Island. Many molecular studies have divided this species into four major lineages: VGI, VGII, VGIII, and VGIV. This commentary summarizes the whole-genome sequencing (WGS) studies that have been carried out with this species, re-emphasizing the phylogenetic relationships, showing chromosomal rearrangements between those four groups, and identifying VGII as the ancestral population within C. gattii. In addition, WGS specific to VGII, containing the Vancouver Island outbreak genotypes and those from the Pacific Northwest region of the United States, has placed the origin of this lineage within South America and identified specific genes responsible for either brain or lung infection. It also showed that many genotypes are spread across a number of different continents, as has been previously shown by multilocus sequence typing (MLST). In addition, it showed that recombination occurs more frequently between mitochondrial than nuclear genomes.

  20. Next-generation diagnostics: gene panel, exome, or whole genome?

    PubMed

    Sun, Yu; Ruivenkamp, Claudia A L; Hoffer, Mariëtte J V; Vrijenhoek, Terry; Kriek, Marjolein; van Asperen, Christi J; den Dunnen, Johan T; Santen, Gijs W E

    2015-06-01

    Although the benefits of next-generation sequencing (NGS) for the diagnosis of heterogeneous diseases such as intellectual disability (ID) are undisputed, there is little consensus on the relative merits of targeted enrichment, whole-exome sequencing (WES) or whole-genome sequencing (WGS). To answer this question, WES and WGS data from the same nine samples were compared, and WES was shown not to miss any variants identified by WGS in a gene panel including ∼500 genes linked to ID (500GP). Additionally, deeply sequenced WES data were shown to adequately cover ∼99% of the 500GP; thus, little additional benefit was to be expected from a targeted enrichment approach. To reduce costs, minimal sequencing criteria were determined by investigating the relation between sequenced reads and outcome parameters such as coverage and variant yield. Our analysis indicated that 60 million reads yielded a mean coverage of ∼60×: ∼97% of the 500GP sequences were sufficiently covered to exclude variants, whereas variant yield was ∼99.5% and false-positive and false-negative rates were controlled. Our findings indicate that WES is currently the optimal approach to ID diagnostics. This result depends on the capture kit and sequencing strategy used. The developed framework however is amenable to other sequencing approaches.

  1. Are physicians prepared for whole genome sequencing? a qualitative analysis.

    PubMed

    Christensen, K D; Vassy, J L; Jamal, L; Lehmann, L S; Slashinski, M J; Perry, D L; Robinson, J O; Blumenthal-Barby, J; Feuerman, L Z; Murray, M F; Green, R C; McGuire, A L

    2016-02-01

    Although the integration of whole genome sequencing (WGS) into standard medical practice is rapidly becoming feasible, physicians may be unprepared to use it. Primary care physicians (PCPs) and cardiologists enrolled in a randomized clinical trial of WGS received genomics education before completing semi-structured interviews. Themes about preparedness were identified in transcripts through team-based consensus-coding. Data from 11 PCPs and 9 cardiologists suggested that physicians enrolled in the trial primarily to prepare themselves for widespread use of WGS in the future. PCPs were concerned about their general genomic knowledge, while cardiologists were concerned about how to interpret specific types of results and secondary findings. Both cohorts anticipated preparing extensively before disclosing results to patients by using educational resources with which they were already familiar, and both cohorts anticipated making referrals to genetics specialists as needed. A lack of laboratory guidance, time pressures, and a lack of standards contributed to feeling unprepared. Physicians had specialty-specific concerns about their preparedness to use WGS. Findings identify specific policy changes that could help physicians feel more prepared, and highlight how providers of all types will need to become familiar with interpreting WGS results.

  2. Whole genomes redefine the mutational landscape of pancreatic cancer

    PubMed Central

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K.; Kassahn, Karin S.; Bailey, Peter; Johns, Amber L.; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C. J.; Robertson, Alan J.; Fadlullah, Muhammad Z. H.; Bruxner, Tim J. C.; Christ, Angelika N.; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J.; Fink, J. Lynn; Holmes, Oliver; Kazakoff, Stephen H.; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J.; Lee, Hong C.; Jones, Marc D.; Nagrial, Adnan M.; Humphris, Jeremy; Chantrill, Lorraine A.; Chin, Venessa; Steinmann, Angela M.; Mawson, Amanda; Humphrey, Emily S.; Colvin, Emily K.; Chou, Angela; Scarlett, Christopher J.; Pinho, Andreia V.; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S.; Kench, James G.; Pettitt, Jessica A.; Merrett, Neil D.; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q.; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B.; Graham, Janet S.; Niclou, Simone P.; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H.; Maitra, Anirban; Iacobuzio-Donahue, Christine A.; Wolfgang, Christopher L.; Morgan, Richard A.; Lawlor, Rita T.; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A.; Gill, Anthony J.; Eshleman, James R.; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A.; Pearson, John V.; Biankin, Andrew V.; Grimmond, Sean M.

    2015-01-01

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded. PMID:25719666

  3. Whole-genome sequencing of nine esophageal adenocarcinoma cell lines

    PubMed Central

    Contino, Gianmarco; Eldridge, Matthew D.; Secrier, Maria; Bower, Lawrence; Fels Elliott, Rachael; Weaver, Jamie; Lynch, Andy G.; Edwards, Paul A.W.; Fitzgerald, Rebecca C.

    2016-01-01

    Esophageal adenocarcinoma (EAC) is highly mutated and molecularly heterogeneous. The number of cell lines available for study is limited and their genome has been only partially characterized. The availability of an accurate annotation of their mutational landscape is crucial for accurate experimental design and correct interpretation of genotype-phenotype findings. We performed high coverage, paired end whole genome sequencing on eight EAC cell lines—ESO26, ESO51, FLO-1, JH-EsoAd1, OACM5.1 C, OACP4 C, OE33, SK-GT-4—all verified against original patient material, and one esophageal high grade dysplasia cell line, CP-D. We have made available the aligned sequence data and report single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number alterations, identified by comparison with the human reference genome and known single nucleotide polymorphisms (SNPs). We compare these putative mutations to mutations found in primary tissue EAC samples, to inform the use of these cell lines as a model of EAC. PMID:27594985

  4. Genomic V exons from whole genome shotgun data in reptiles.

    PubMed

    Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

    2014-08-01

    Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).

  5. Whole-genome sequencing of nine esophageal adenocarcinoma cell lines

    PubMed Central

    Contino, Gianmarco; Eldridge, Matthew D.; Secrier, Maria; Bower, Lawrence; Fels Elliott, Rachael; Weaver, Jamie; Lynch, Andy G.; Edwards, Paul A.W.; Fitzgerald, Rebecca C.

    2016-01-01

    Esophageal adenocarcinoma (EAC) is highly mutated and molecularly heterogeneous. The number of cell lines available for study is limited and their genome has been only partially characterized. The availability of an accurate annotation of their mutational landscape is crucial for accurate experimental design and correct interpretation of genotype-phenotype findings. We performed high coverage, paired end whole genome sequencing on eight EAC cell lines—ESO26, ESO51, FLO-1, JH-EsoAd1, OACM5.1 C, OACP4 C, OE33, SK-GT-4—all verified against original patient material, and one esophageal high grade dysplasia cell line, CP-D. We have made available the aligned sequence data and report single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number alterations, identified by comparison with the human reference genome and known single nucleotide polymorphisms (SNPs). We compare these putative mutations to mutations found in primary tissue EAC samples, to inform the use of these cell lines as a model of EAC.

  6. Alignathon: a competitive assessment of whole-genome alignment methods

    PubMed Central

    Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Harris, Robert S.; Fitzgerald, Stephen; Beal, Kathryn; Seledtsov, Igor; Molodtsov, Vladimir; Raney, Brian J.; Clawson, Hiram; Kim, Jaebum; Kemena, Carsten; Chang, Jia-Ming; Erb, Ionas; Poliakov, Alexander; Hou, Minmei; Herrero, Javier; Kent, William James; Solovyev, Victor; Darling, Aaron E.; Ma, Jian; Notredame, Cedric; Brudno, Michael; Dubchak, Inna; Haussler, David; Paten, Benedict

    2014-01-01

    Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments. PMID:25273068

  7. Whole-genome sequencing reveals oncogenic mutations in mycosis fungoides

    PubMed Central

    McGirt, Laura Y.; Jia, Peilin; Baerenwald, Devin A.; Duszynski, Robert J.; Dahlman, Kimberly B.; Zic, John A.; Zwerner, Jeffrey P.; Hucks, Donald; Dave, Utpal; Zhao, Zhongming

    2015-01-01

    The pathogenesis of mycosis fungoides (MF), the most common cutaneous T-cell lymphoma (CTCL), is unknown. Although genetic alterations have been identified, none are considered consistently causative in MF. To identify potential drivers of MF, we performed whole-genome sequencing of MF tumors and matched normal skin. Targeted ultra-deep sequencing of MF samples and exome sequencing of CTCL cell lines were also performed. Multiple mutations were identified that affected the same pathways, including epigenetic, cell-fate regulation, and cytokine signaling, in MF tumors and CTCL cell lines. Specifically, interleukin-2 signaling pathway mutations, including activating Janus kinase 3 (JAK3) mutations, were detected. Treatment with a JAK3 inhibitor significantly reduced CTCL cell survival. Additionally, the mutation data identified 2 other potential contributing factors to MF: ultraviolet light and a polymorphism in the tumor suppressor p53 (TP53). Therefore, genetic alterations in specific pathways in MF were identified that may be viable, effective new targets for treatment. PMID:26082451

  8. MIPS: analysis and annotation of proteins from whole genomes.

    PubMed

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de). PMID:14681354

  9. Information recovery from low coverage whole-genome bisulfite sequencing.

    PubMed

    Libertini, Emanuele; Heath, Simon C; Hamoudi, Rifat A; Gut, Marta; Ziller, Michael J; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G; Frontini, Mattia; Ouwehand, Willem H; Meissner, Alexander; Gut, Ivo G; Beck, Stephan

    2016-06-27

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future.

  10. Signatures of selection in tilapia revealed by whole genome resequencing

    PubMed Central

    Hong Xia, Jun; Bai, Zhiyi; Meng, Zining; Zhang, Yong; Wang, Le; Liu, Feng; Jing, Wu; Yi Wan, Zi; Li, Jiale; Lin, Haoran; Hua Yue, Gen

    2015-01-01

    Natural selection and selective breeding for genetic improvement have left detectable signatures within the genome of a species. Identification of selection signatures is important in evolutionary biology and for detecting genes that can accelerate genetic improvement. However, selection signatures, including artificial selection and natural selection, have only been identified at the whole genome level in several genetically improved fish species. Tilapia is one of the most important genetically improved fish species in the world. Using next-generation sequencing, we sequenced the genomes of 47 tilapia individuals. We identified a total of 1.43 million high-quality SNPs and found that the LD block sizes ranged from 10–100 kb in tilapia. We detected over a hundred putative selective sweep regions in each line of tilapia. Most selection signatures were located in non-coding regions of the tilapia genome. The Wnt signaling, gonadotropin-releasing hormone receptor and integrin signaling pathways were under positive selection in all improved tilapia lines. Our study provides a genome-wide map of genetic variation and selection footprints in tilapia, which could be important for genetic studies and accelerating genetic improvement of tilapia. PMID:26373374

  11. Whole genomes redefine the mutational landscape of pancreatic cancer.

    PubMed

    Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K; Kassahn, Karin S; Bailey, Peter; Johns, Amber L; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C J; Robertson, Alan J; Fadlullah, Muhammad Z H; Bruxner, Tim J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J; Fink, J Lynn; Holmes, Oliver; Kazakoff, Stephen H; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Lee, Hong C; Jones, Marc D; Nagrial, Adnan M; Humphris, Jeremy; Chantrill, Lorraine A; Chin, Venessa; Steinmann, Angela M; Mawson, Amanda; Humphrey, Emily S; Colvin, Emily K; Chou, Angela; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Pettitt, Jessica A; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B; Graham, Janet S; Niclou, Simone P; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A; Gill, Anthony J; Eshleman, James R; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A; Pearson, John V; Biankin, Andrew V; Grimmond, Sean M

    2015-02-26

    Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded.

  12. Cryptococcus gattii in the Age of Whole-Genome Sequencing.

    PubMed

    Meyer, Wieland

    2015-01-01

    Cryptococcus gattii, the sister species of Cryptococcus neoformans, is an emerging pathogen which gained importance in connection with the ongoing cryptococcosis outbreak on Vancouver Island. Many molecular studies have divided this species into four major lineages: VGI, VGII, VGIII, and VGIV. This commentary summarizes the whole-genome sequencing (WGS) studies that have been carried out with this species, re-emphasizing the phylogenetic relationships, showing chromosomal rearrangements between those four groups, and identifying VGII as the ancestral population within C. gattii. In addition, WGS specific to VGII, containing the Vancouver Island outbreak genotypes and those from the Pacific Northwest region of the United States, has placed the origin of this lineage within South America and identified specific genes responsible for either brain or lung infection. It also showed that many genotypes are spread across a number of different continents, as has been previously shown by multilocus sequence typing (MLST). In addition, it showed that recombination occurs more frequently between mitochondrial than nuclear genomes. PMID:26578680

  13. Clinical use of whole genome sequencing for Mycobacterium tuberculosis.

    PubMed

    Witney, Adam A; Cosgrove, Catherine A; Arnold, Amber; Hinds, Jason; Stoker, Neil G; Butcher, Philip D

    2016-01-01

    Drug-resistant tuberculosis (TB) remains a major challenge to global health and to healthcare in the UK. In 2014, a total of 6,520 cases of TB were recorded in England, of which 1.4% were multidrug-resistant TB (MDR-TB). Extensively drug-resistant TB (XDR-TB) occurs at a much lower rate, but the impact on the patient and hospital is severe. Current diagnostic methods such as drug susceptibility testing and targeted molecular tests are slow to return or examine only a limited number of target regions, respectively. Faster, more comprehensive diagnostics will enable earlier use of the most appropriate drug regimen, thus improving patient outcomes and reducing overall healthcare costs. Whole genome sequencing (WGS) has been shown to provide a rapid and comprehensive view of the genotype of the organism, and thus enable reliable prediction of the drug susceptibility phenotype within a clinically relevant timeframe. In addition, it provides the highest resolution when investigating transmission events in possible outbreak scenarios. However, robust software and database tools need to be developed for the full potential to be realized in this specialized area of medicine. PMID:27004841

  14. Application of Whole-Genome Sequencing to an Unusual Outbreak of Invasive Group A Streptococcal Disease

    PubMed Central

    Galloway-Peña, Jessica; Clement, Meredith E.; Sharma Kuinkel, Batu K.; Ruffin, Felicia; Flores, Anthony R.; Levinson, Howard; Shelburne, Samuel A.; Moore, Zack; Fowler, Vance G.

    2016-01-01

    Whole-genome analysis was applied to investigate atypical point-source transmission of 2 invasive group A streptococcal (GAS) infections. Isolates were serotype M4, ST39, and genetically indistinguishable. Comparison with MGAS10750 revealed nonsynonymous polymorphisms in ropB and increased speB transcription. This study demonstrates the usefulness of whole-genome analyses for GAS outbreaks. PMID:27006966

  15. Selected Insights from Application of Whole Genome Sequencing for Outbreak Investigations

    PubMed Central

    Le, Vien Thi Minh; Diep, Binh An

    2014-01-01

    Purpose of review: The advent of high-throughput whole genome sequencing has the potential to revolutionize the conduct of outbreak investigation. Because of its ultimate pathogen strain resolution, whole genome sequencing could augment traditional epidemiologic investigations of infectious disease outbreaks. Recent findings: The combination of whole genome sequencing and intensive epidemiologic analysis provided new insights on the sources and transmission dynamics of large-scale epidemics caused by Escherichia coli and Vibrio cholerae, nosocomial outbreaks caused by methicillin-resistant Staphylococcus aureus, Klebsiella pneumoniae, and Mycobacterium abscessus, community-centered outbreaks caused by Mycobacterium tuberculosis, and a natural disaster-associated outbreak caused by environmentally acquired molds. Summary: When combined with traditional epidemiologic investigation, whole genome sequencing has proven useful for elucidating sources and transmission dynamics of disease outbreaks. Development of a fully automated bioinformatics pipeline for analysis of whole genome sequence data is much needed to make this powerful tool more widely accessible. PMID:23856896

  16. A whole-genome association study for pig reproductive traits.

    PubMed

    Onteru, S K; Fan, B; Du, Z-Q; Garrick, D J; Stalder, K J; Rothschild, M F

    2012-02-01

    A whole-genome association study was performed for reproductive traits in commercial sows using the PorcineSNP60 BeadChip and Bayesian statistical methods. The traits included total number born (TNB), number born alive (NBA), number of stillborn (SB), number of mummified foetuses at birth (MUM) and gestation length (GL) in each of the first three parities. We report the associations of informative QTL and the genes within the QTL for each reproductive trait in different parities. These results provide evidence of gene effects having temporal impacts on reproductive traits in different parities. Many QTL identified in this study are new for pig reproductive traits. Around 48% of total genes located in the identified QTL regions were predicted to be involved in placental functions. The genomic regions containing genes important for foetal developmental (e.g. MEF2C) and uterine functions (e.g. PLSCR4) were associated with TNB and NBA in the first two parities. Similarly, QTL in other foetal developmental (e.g. HNRNPD and AHR) and placental (e.g. RELL1 and CD96) genes were associated with SB and MUM in different parities. The QTL with genes related to utero-placental blood flow (e.g. VEGFA) and hematopoiesis (e.g. MAFB) were associated with GL differences among sows in this population. Pathway analyses using genes within QTL identified some modest underlying biological pathways, which are interesting candidates (e.g. the nucleotide metabolism pathway for SB) for pig reproductive traits in different parities. Further validation studies on large populations are warranted to improve our understanding of the complex genetic architecture for pig reproductive traits.

  17. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    SciTech Connect

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  18. Whole-genome sequencing targets drug-resistant bacterial infections.

    PubMed

    Punina, N V; Makridakis, N M; Remnev, M A; Topunov, A F

    2015-01-01

    During the past two decades, the technological progress of whole-genome sequencing (WGS) has changed the fields of Environmental Microbiology and Biotechnology, and, currently, is changing the underlying principles, approaches, and fundamentals of Public Health, Epidemiology, Health Economics, and national productivity. Today's WGS technologies are able to compete with conventional techniques in cost, speed, accuracy, and resolution for day-to-day control of infectious diseases and outbreaks in clinical laboratories and in long-term epidemiological investigations. WGS gives rise to an exciting future direction for personalized Genomic Epidemiology. One of the most vital and growing public health problems is the emergence and re-emergence of multidrug-resistant (MDR) bacterial infections in communities and healthcare settings, reinforced by a decline in antimicrobial drug discovery. In recent years, retrospective analysis provided by WGS has had a great impact on the identification and tracking of MDR microorganisms in hospitals and communities. The obtained genomic data are also important for developing novel easy-to-use diagnostic assays for clinics, as well as for antibiotic and therapeutic development at both the personal and population levels. At present, this technology has been successfully applied as an addendum to the real-time diagnostic methods currently used in clinical laboratories. However, the significance of WGS for public health may increase if: (a) unified and user-friendly bioinformatics toolsets for easy data interpretation and management are established, and (b) standards for data validation and verification are developed. Herein, we review the current and future impact of this technology on diagnosis, prevention, treatment, and control of MDR infectious bacteria in clinics and on the global scale. PMID:26243131

  19. INTEGRATE: gene fusion discovery using whole genome and transcriptome data

    PubMed Central

    Zhang, Jin; White, Nicole M.; Schmidt, Heather K.; Fulton, Robert S.; Tomlinson, Chad; Warren, Wesley C.; Wilson, Richard K.; Maher, Christopher A.

    2016-01-01

    While next-generation sequencing (NGS) has become the primary technology for discovering gene fusions, we are still faced with the challenge of ensuring that causative mutations are not missed while minimizing false positives. Currently, there are many computational tools that predict structural variations (SV) and gene fusions using whole genome (WGS) and transcriptome sequencing (RNA-seq) data separately. However, as both WGS and RNA-seq have their limitations when used independently, we hypothesize that the orthogonal validation from integrating both data could generate a sensitive and specific approach for detecting high-confidence gene fusion predictions. Fortunately, decreasing NGS costs have resulted in a growing number of patients with both data available. Therefore, we developed a gene fusion discovery tool, INTEGRATE, that leverages both RNA-seq and WGS data to reconstruct gene fusion junctions and genomic breakpoints by split-read mapping. To evaluate INTEGRATE, we compared it with eight additional gene fusion discovery tools using the well-characterized breast cell line HCC1395 and peripheral blood lymphocytes derived from the same patient (HCC1395BL). The predictions subsequently underwent a targeted validation leading to the discovery of 131 novel fusions in addition to the seven previously reported fusions. Overall, INTEGRATE only missed six out of the 138 validated fusions and had the highest accuracy of the nine tools evaluated. Additionally, we applied INTEGRATE to 62 breast cancer patients from The Cancer Genome Atlas (TCGA) and found multiple recurrent gene fusions including a subset involving estrogen receptor. Taken together, INTEGRATE is a highly sensitive and accurate tool that is freely available for academic use. PMID:26556708
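
    The counts reported in this abstract can be turned into a sensitivity figure with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and it measures sensitivity against the validated fusion set alone, since the abstract does not report false-positive counts from which precision could be derived.

```python
# Counts taken from the INTEGRATE abstract; variable names are illustrative.
novel_fusions = 131        # newly discovered, validated fusions
previously_reported = 7    # fusions already known for HCC1395
validated_total = novel_fusions + previously_reported  # 138 validated fusions
missed = 6                 # validated fusions INTEGRATE did not report

detected = validated_total - missed          # validated fusions it did report
sensitivity = detected / validated_total     # fraction of validated fusions found

print(f"validated: {validated_total}, detected: {detected}")
print(f"sensitivity: {sensitivity:.1%}")     # prints "sensitivity: 95.7%"
```

    Under these counts, INTEGRATE recovered 132 of the 138 validated fusions.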

  20. Parent and Public Interest in Whole Genome Sequencing

    PubMed Central

    Dodson, Daniel S.; Goldenberg, Aaron J.; Davis, Matthew M.; Singer, Dianne C.; Tarini, Beth A.

    2015-01-01

    Objective: To assess the baseline interest of the public in whole genome sequencing (WGS) for themselves, parents’ interest in WGS for their youngest children, and factors associated with such interest. Methods: A random sample of adults from a probability-based nationally representative online panel was surveyed. All participants were provided basic information about WGS and then asked their interest in WGS for themselves. Those participants who self-identified as parents were asked about their interest in WGS for their children. The order in which parents were asked about their interest in WGS for themselves and their child was randomized. The relationship between parent/child characteristics and interest in WGS was examined. Results: Overall response rate was 62% (55% among parents). 58.6% of the total population (parents and non-parents) was interested in WGS for themselves. Similarly, 61.8% of parents were interested in WGS for themselves and 57.8% were interested in WGS for their youngest children. Of note, 84.7% of parents showed an identical interest level in WGS for themselves and their youngest children. Mothers as a whole, and parents whose youngest children had ≥2 health conditions had significantly more interest in WGS for themselves and their youngest children, while those with conservative political ideologies had considerably less. Conclusions: While U.S. adults have varying interest levels in WGS, parents appear to have similar interests in genome testing for themselves and their youngest children. As WGS technology becomes available in the clinic and private market, clinicians should be prepared to discuss WGS risks and benefits with their patients. PMID:25765282

  1. A Whole Genome Association Study on Meat Palatability in Hanwoo

    PubMed Central

    Hyeong, K.-E.; Lee, Y.-M.; Kim, Y.-S.; Nam, K. C.; Jo, C.; Lee, K.-H.; Lee, J.-E.; Kim, J.-J.

    2014-01-01

    A whole genome association (WGA) study was carried out to find quantitative trait loci (QTL) for sensory evaluation traits in Hanwoo. Carcass samples of 250 Hanwoo steers were collected from the National Agricultural Cooperative Livestock Research Institute, Ansung, Gyeonggi province, Korea, between 2011 and 2012 and genotyped with the Affymetrix Bovine Axiom Array 640K single nucleotide polymorphism (SNP) chip. Among the SNPs in the chip, a total of 322,160 SNPs were chosen after quality control tests. After adjusting for the effects of age, slaughter-year-season, and polygenic effects using a genome relationship matrix, the corrected phenotypes for the sensory evaluation measurements were regressed on each SNP using a simple additive linear regression model. A total of 1,631 SNPs were detected for color, aroma, tenderness, juiciness and palatability at the 0.1% comparison-wise level. Among the significant SNPs, the best set of 52 SNP markers was chosen using a forward regression procedure at the 0.05 level, among which sets of 8, 14, 11, 10, and 9 SNPs were determined for the respective sensory evaluation traits. The sets of significant SNPs explained 18% to 31% of phenotypic variance. Three SNPs were pleiotropic: AX-26703353 and AX-26742891, located at 101 and 110 Mb of BTA6, respectively, influenced tenderness, juiciness and palatability, while AX-18624743 at 3 Mb of BTA10 affected tenderness and palatability. Our results suggest that some QTL for sensory measures are segregating in a Hanwoo steer population. Additional WGA studies on fatty acid and nutritional components as well as the sensory panels are in progress to characterize the genetic architecture of meat quality and palatability in Hanwoo. PMID:25178363
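
    The per-SNP step described in this abstract, regressing a corrected phenotype on one SNP under an additive model, can be sketched as an ordinary least-squares fit of the phenotype on allele dosage. This is a minimal illustration, not the authors' pipeline: the 0/1/2 dosage coding, the function name, and the toy data are all assumptions, and the prior adjustment for age, slaughter-year-season, and polygenic effects is not reproduced here.

```python
from statistics import mean


def additive_snp_regression(dosages, phenotypes):
    """Least-squares fit of phenotype on allele dosage (assumed coded 0, 1 or 2).

    Returns (slope, intercept, r_squared); the slope is the additive
    effect of one extra copy of the counted allele.
    """
    if len(dosages) != len(phenotypes):
        raise ValueError("dosages and phenotypes must have the same length")
    g_bar = mean(dosages)
    y_bar = mean(phenotypes)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample; no effect can be estimated")
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(dosages, phenotypes))
    syy = sum((y - y_bar) ** 2 for y in phenotypes)
    slope = sxy / sxx
    intercept = y_bar - slope * g_bar
    # Share of phenotypic variance explained by this single SNP.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical toy data: six animals, one SNP, one corrected sensory score.
dosages = [0, 0, 1, 1, 2, 2]
scores = [4.0, 4.2, 4.9, 5.1, 6.0, 5.8]
slope, intercept, r2 = additive_snp_regression(dosages, scores)
print(f"additive effect per allele: {slope:.2f}, R^2: {r2:.2f}")
```

    In a genome-wide scan this fit would be repeated for each SNP that passed quality control, with a significance test on the slope deciding which SNPs are reported.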

  2. Copy Number Variation Analysis by Array Analysis of Single Cells Following Whole Genome Amplification.

    PubMed

    Dimitriadou, Eftychia; Zamani Esteki, Masoud; Vermeesch, Joris Robert

    2015-01-01

    Whole genome amplification is required to ensure the availability of sufficient material for copy number variation analysis of a genome derived from an individual cell. Here, we describe the protocols we use for copy number variation analysis of non-fixed single cells by array-based approaches following single-cell isolation and whole genome amplification. We focus on two alternative protocols, an isothermal and a PCR-based whole genome amplification method, followed by either comparative genome hybridization (aCGH) or SNP array analysis, respectively.

  3. Assessment of Whole-Genome Regression for Type II Diabetes

    PubMed Central

    Vazquez, Ana I.; Klimentidis, Yann C.; Dhurandhar, Emily J.; Veturi, Yogasudha C.; Pérez-Rodríguez, Paulino

    2015-01-01

    Lifestyle and genetic factors play a large role in the development of Type 2 Diabetes (T2D). Despite the important role of genetic factors, genetic information is not incorporated into the clinical assessment of T2D risk. We assessed and compared Whole Genome Regression methods to predict the T2D status of 5,245 subjects from the Framingham Heart Study. For evaluating each method we constructed the following set of regression models: A clinical baseline model (CBM) which included non-genetic covariates only. CBM was extended by adding the first two marker-derived principal components and 65 SNPs identified by a recent GWAS consortium for T2D (M-65SNPs). Subsequently, it was further extended by adding 249,798 genome-wide SNPs from a high-density array. The Bayesian models used to incorporate genome-wide marker information as predictors were: Bayes A, Bayes Cπ, Bayesian LASSO (BL), and the Genomic Best Linear Unbiased Prediction (G-BLUP). Results included estimates of the genetic variance and heritability, genetic scores for T2D, and predictive ability evaluated in a 10-fold cross-validation. The predictive AUC estimates for CBM and M-65SNPs were: 0.668 and 0.684, respectively. We found evidence of contribution of genetic effects in T2D, as reflected in the genomic heritability estimates (0.492±0.066). The highest predictive AUC among the genome-wide marker Bayesian models was 0.681 for the Bayesian LASSO. Overall, the improvement in predictive ability was moderate and did not differ greatly among models that included genetic information. Approximately 58% of the total number of genetic variants was found to contribute to the overall genetic variation, indicating a complex genetic architecture for T2D. Our results suggest that the Bayes Cπ and the G-BLUP models with a large set of genome-wide markers could be used for predicting risk to T2D, as an alternative to using high-density arrays when selected markers from large consortiums for a given complex trait or

  4. Whole-genome expression analysis reveals genes associated with treatment response to escitalopram in major depression.

    PubMed

    Pettai, Kristi; Milani, Lili; Tammiste, Anu; Võsa, Urmo; Kolde, Raivo; Eller, Triin; Nutt, David; Metspalu, Andres; Maron, Eduard

    2016-09-01

    The reasons for variability in treatment response in major depressive disorder (MDD) are not fully understood, but there is accumulating evidence suggesting that therapeutic outcomes of antidepressants can be influenced by genetic factors. In the present study we applied the Illumina microarray platform for whole genome expression profiling in depressive patients treated with escitalopram in order to identify genes underlying response to antidepressant treatment. The initial study sample consisted of 135 outpatients with major depressive disorder (mean age 31.1±11.6 years, 68% females) treated with escitalopram 10-20 mg/day for 12 weeks, of whom 87 patients (55 females) were included in the gene expression analysis. The gene expression profiles were measured in peripheral blood cells at baseline, at week 4 and at the end of treatment (week 12) using Illumina BeadChips. The fold change was used to demonstrate the rate of change in average gene expression between the studied groups. Statistical analyses were performed using the false discovery rate (FDR). The most interesting gene, which showed a predictive effect on treatment outcome by delineating low dose responders and treatment-resistant patients at the beginning of medication, was NLGN2, which belongs to a family of neuronal cell surface proteins and is involved in synapse formation. In addition, several gene clusters, related to immune response, signal transduction and the neurotrophin pathway, distinguished responders from non-responders at week 4 of treatment. After 4 weeks of escitalopram treatment (10 mg/day), the YWHAZ gene showed the highest transcriptional change in responders as compared with non-responders. Finally, at the end of the treatment we noticed that at least three genes (NR2C2, ZNF641, FKBP1A) were strongly associated with resistance to escitalopram. Thus the results of this study support that exploration of peripheral gene expression is a useful tool in the further

  6. New perspectives on microbial community distortion after whole-genome amplification

    EPA Science Inventory

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass; however, potential selective biases during the amplification processes are poorly understood. Here, we describe the e...

  7. TCGA's Pan-Cancer Efforts and Expansion to Include Whole Genome Sequence - TCGA

    Cancer.gov

    Carolyn Hutter, Ph.D., Program Director of NHGRI's Division of Genomic Medicine, discusses the expansion of TCGA's Pan-Cancer efforts to include the Pan-Cancer Analysis of Whole Genomes (PAWG) project.

  8. Whole-Genome Sequence of the Nitrogen-Fixing Symbiotic Rhizobium Mesorhizobium loti Strain TONO

    PubMed Central

    Hirakawa, Hideki; Sato, Shusei; Saeki, Kazuhiko; Hayashi, Makoto

    2016-01-01

    Mesorhizobium loti is the nitrogen-fixing microsymbiont for legumes of the genus Lotus. Here, we report the whole-genome sequence of a Mesorhizobium loti strain, TONO, which is used as a symbiont for the model legume Lotus japonicus. The whole-genome sequence of the strain TONO will be a solid platform for comparative genomics analyses and for the identification of genes responsible for the symbiotic properties of Mesorhizobium species. PMID:27795235

  9. Whole Genome Sequencing as a Genetic Test for Autism Spectrum Disorder: From Bench to Bedside and then Back Again

    PubMed Central

    Szego, Michael J.; Zawati, Ma’n H.

    2016-01-01

    Autism spectrum disorder (ASD) is characterized by repetitive patterns of behaviour and impairments in social interactions and communication abilities. Although ASD is a heterogeneous disorder, it is a highly genetic condition for which genetic testing is routinely performed. Microarray analysis is currently the standard-of-care genetic test for ASD; however, whole genome sequencing (WGS) offers several key advantages and will likely replace microarrays as a frontline genetic test in the near future. The 2nd Consultation on Translation of Genomic Advances into Health Applications took place in the spring of 2014 to broadly explore the current and potential impacts of genomic advances in supporting personalized and family-centered care for autism and related developmental conditions. In anticipation of WGS becoming a standard-of-care test, we examine the policy landscape and highlight the lack of consistency among guidelines regarding what genomic information should be returned to patients and their families. We also discuss the need to create the infrastructure to share clinical WGS data with researchers in a systematic and ethically defensible manner. PMID:27274747

  10. Anaerobic, Nitrate-Dependent Oxidation of Uraninite by the Chemolithoautotroph Thiobacillus denitrificans: Cell Suspension and Whole-Genome Transcriptional Studies

    NASA Astrophysics Data System (ADS)

    Beller, H. R.; Chakicherla, A.; Legler, T. C.; Letain, T. E.; Coleman, M.; Kane, S. R.

    2005-12-01

    Background: In-situ, reductive immobilization of uranium in aquifers, whereby relatively soluble U(VI) species are reduced to poorly soluble uraninite (UO2) by aquifer bacteria, has been the subject of intensive research effort recently. This study explored the possibility that a widespread soil bacterium, Thiobacillus denitrificans, could catalyze anaerobic U re-oxidation in the presence of nitrate, a common co-contaminant with uranium at U.S. DOE sites. Whole-genome, cDNA microarray studies (representing all 2832 ORFs of the 2.9 Mb genome) were conducted to identify genes upregulated during nitrate-dependent U(IV) oxidation (relative to control conditions of nitrate-dependent thiosulfate oxidation). Methods: Washed cell suspension experiments were carried out under strictly anaerobic conditions and at circumneutral pH with UO2 and T. denitrificans cells grown under denitrifying conditions and harvested in late exponential phase. Experiments included both sterile controls and live, no-nitrate controls. For microarray analysis, RNA was isolated from cells exposed to either UO2 or thiosulfate under strictly anaerobic, denitrifying conditions. For all samples analyzed with microarrays, chemical analyses were used to confirm that the applicable metabolic activity [i.e., denitrification and either U(IV) or thiosulfate oxidation] was occurring. Reverse transcription, quantitative PCR was used to confirm selected microarray results. Results: In the cell suspension experiments, T. denitrificans cells oxidatively dissolved UO2 in a nitrate-dependent fashion: U(IV) oxidation required the presence of nitrate (P<0.01) and was strongly correlated with nitrate consumption (r² = 0.98). However, U(IV) oxidation and denitrification appeared to be dependent on H2. The microarrays identified 333 genes as upregulated under U(IV)-oxidizing conditions using RMA statistical analysis and a 2-fold (P<0.0001) cutoff. Notably, 16 of these genes, which were upregulated 5- to 22-fold, were

  11. Whole Genome Sequencing Demonstrates Limited Transmission within Identified Mycobacterium tuberculosis Clusters in New South Wales, Australia

    PubMed Central

    Gurjav, Ulziijargal; Outhred, Alexander C.; Jelfs, Peter; McCallum, Nadine; Wang, Qinning; Hill-Cawthorne, Grant A.; Marais, Ben J.; Sintchenko, Vitali

    2016-01-01

    Australia has a low tuberculosis incidence rate with most cases occurring among recent immigrants. Given suboptimal cluster resolution achieved with 24-locus mycobacterium interspersed repetitive unit (MIRU-24) genotyping, the added value of whole genome sequencing was explored. MIRU-24 profiles of all Mycobacterium tuberculosis culture-confirmed tuberculosis cases diagnosed between 2009 and 2013 in New South Wales (NSW), Australia, were examined and clusters identified. The relatedness of cases within the largest MIRU-24 clusters was assessed using whole genome sequencing and phylogenetic analyses. Of 1841 culture-confirmed TB cases, 91.9% (1692/1841) had complete demographic and genotyping data. East-African Indian (474; 28.0%) and Beijing (470; 27.8%) lineage strains predominated. The overall rate of MIRU-24 clustering was 20.1% (340/1692) and was highest among Beijing lineage strains (35.7%; 168/470). One Beijing and three East-African Indian (EAI) clonal complexes were responsible for the majority of observed clusters. Whole genome sequencing of the 4 largest clusters (30 isolates) demonstrated diverse single nucleotide polymorphisms (SNPs) within identified clusters. All sequenced EAI strains and 70% of Beijing lineage strains clustered by MIRU-24 typing demonstrated distinct SNP profiles. The superior resolution provided by whole genome sequencing demonstrated limited M. tuberculosis transmission within NSW, even within identified MIRU-24 clusters. Routine whole genome sequencing could provide valuable public health guidance in low burden settings. PMID:27737005

  12. Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture.

    PubMed

    Seth-Smith, Helena M B; Harris, Simon R; Skilton, Rachel J; Radebe, Frans M; Golparian, Daniel; Shipitsyna, Elena; Duy, Pham Thanh; Scott, Paul; Cutcliffe, Lesley T; O'Neill, Colette; Parmar, Surendra; Pitt, Rachel; Baker, Stephen; Ison, Catherine A; Marsh, Peter; Jalal, Hamid; Lewis, David A; Unemo, Magnus; Clarke, Ian N; Parkhill, Julian; Thomson, Nicholas R

    2013-05-01

    The use of whole-genome sequencing as a tool for the study of infectious bacteria is of growing clinical interest. Chlamydia trachomatis is responsible for sexually transmitted infections and the blinding disease trachoma, which affect hundreds of millions of people worldwide. Recombination is widespread within the genome of C. trachomatis, thus whole-genome sequencing is necessary to understand the evolution, diversity, and epidemiology of this pathogen. Culture of C. trachomatis has, until now, been a prerequisite to obtain DNA for whole-genome sequencing; however, as C. trachomatis is an obligate intracellular pathogen, this procedure is technically demanding and time consuming. Discarded clinical samples represent a large resource for sequencing the genomes of pathogens, yet clinical swabs frequently contain very low levels of C. trachomatis DNA and large amounts of contaminating microbial and human DNA. To determine whether it is possible to obtain whole-genome sequences from bacteria without the need for culture, we have devised an approach that combines immunomagnetic separation (IMS) for targeted bacterial enrichment with multiple displacement amplification (MDA) for whole-genome amplification. Using IMS-MDA in conjunction with high-throughput multiplexed Illumina sequencing, we have produced the first whole bacterial genome sequences direct from clinical samples. We also show that this method can be used to generate genome data from nonviable archived samples. This method will prove a useful tool in answering questions relating to the biology of many difficult-to-culture or fastidious bacteria of clinical concern. PMID:23525359

  13. Whole-genome sequences of Chlamydia trachomatis directly from clinical samples without culture

    PubMed Central

    Seth-Smith, Helena M.B.; Harris, Simon R.; Skilton, Rachel J.; Radebe, Frans M.; Golparian, Daniel; Shipitsyna, Elena; Duy, Pham Thanh; Scott, Paul; Cutcliffe, Lesley T.; O’Neill, Colette; Parmar, Surendra; Pitt, Rachel; Baker, Stephen; Ison, Catherine A.; Marsh, Peter; Jalal, Hamid; Lewis, David A.; Unemo, Magnus; Clarke, Ian N.; Parkhill, Julian; Thomson, Nicholas R.

    2013-01-01

    The use of whole-genome sequencing as a tool for the study of infectious bacteria is of growing clinical interest. Chlamydia trachomatis is responsible for sexually transmitted infections and the blinding disease trachoma, which affect hundreds of millions of people worldwide. Recombination is widespread within the genome of C. trachomatis, thus whole-genome sequencing is necessary to understand the evolution, diversity, and epidemiology of this pathogen. Culture of C. trachomatis has, until now, been a prerequisite to obtain DNA for whole-genome sequencing; however, as C. trachomatis is an obligate intracellular pathogen, this procedure is technically demanding and time consuming. Discarded clinical samples represent a large resource for sequencing the genomes of pathogens, yet clinical swabs frequently contain very low levels of C. trachomatis DNA and large amounts of contaminating microbial and human DNA. To determine whether it is possible to obtain whole-genome sequences from bacteria without the need for culture, we have devised an approach that combines immunomagnetic separation (IMS) for targeted bacterial enrichment with multiple displacement amplification (MDA) for whole-genome amplification. Using IMS-MDA in conjunction with high-throughput multiplexed Illumina sequencing, we have produced the first whole bacterial genome sequences direct from clinical samples. We also show that this method can be used to generate genome data from nonviable archived samples. This method will prove a useful tool in answering questions relating to the biology of many difficult-to-culture or fastidious bacteria of clinical concern. PMID:23525359

  14. Whole-genome shotgun optical mapping of Rhodobacter sphaeroides strain 2.4.1 and its use for whole-genome shotgun sequence assembly

    SciTech Connect

    Shou, S.; Kvikstad, E.; Kile, A.; Severin, J.; Forrest, D.; Runnheim, R.; Churas, C.; Hickman, J. W.; Mackenzie, C.; Choudhary, M.; Donohue, T.; Kaplan, S.; Schwartz, D. C.

    2003-09-01

    Rhodobacter sphaeroides 2.4.1 is a facultative photoheterotrophic bacterium with tremendous metabolic diversity, which has significantly contributed to our understanding of the molecular genetics of photosynthesis, photoheterotrophy, nitrogen fixation, hydrogen metabolism, carbon dioxide fixation, taxis, and tetrapyrrole biosynthesis. To further understand this remarkable bacterium, and to accelerate an ongoing sequencing project, two whole-genome restriction maps (EcoRI and HindIII) of R. sphaeroides strain 2.4.1 were constructed using shotgun optical mapping. The approach directly mapped genomic DNA by the random mapping of single molecules. The two maps were used to facilitate sequence assembly by providing an optical scaffold for high-resolution alignment and verification of sequence contigs. Our results show that such maps facilitated the closure of sequence gaps by the early detection of nascent sequence contigs during the course of the whole-genome shotgun sequencing process.

  15. Whole Genome Amplification of Labeled Viable Single Cells Suited for Array-Comparative Genomic Hybridization.

    PubMed

    Kroneis, Thomas; El-Heliebi, Amin

    2015-01-01

    Understanding details of a complex biological system makes it necessary to dismantle it down to its components. Immunostaining techniques allow identification of several distinct cell types thereby giving an inside view of intercellular heterogeneity. Often staining reveals that the most remarkable cells are the rarest. To further characterize the target cells on a molecular level, single cell techniques are necessary. Here, we describe the immunostaining, micromanipulation, and whole genome amplification of single cells for the purpose of genomic characterization. First, we exemplify the preparation of cell suspensions from cultured cells as well as the isolation of peripheral mononucleated cells from blood. The target cell population is then subjected to immunostaining. After cytocentrifugation target cells are isolated by micromanipulation and forwarded to whole genome amplification. For whole genome amplification, we use GenomePlex® technology allowing downstream genomic analysis such as array-comparative genomic hybridization.

  16. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    PubMed

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular, this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing, however, generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  17. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    PubMed

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities.

  18. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    PubMed

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. PMID:27006240

  19. Whole-Genome Sequences of Two Borrelia afzelii and Two Borrelia garinii Lyme Disease Agent Isolates

    PubMed Central

    Casjens, Sherwood R.; Mongodin, Emmanuel F.; Qiu, Wei-Gang; Dunn, John J.; Luft, Benjamin J.; Fraser-Liggett, Claire M.; Schutzer, Steve E.

    2011-01-01

    Human Lyme disease is commonly caused by several species of spirochetes in the Borrelia genus. In Eurasia these species are largely Borrelia afzelii, B. garinii, B. burgdorferi, and B. bavariensis sp. nov. Whole-genome sequencing is an excellent tool for investigating and understanding the influence of bacterial diversity on the pathogenesis and etiology of Lyme disease. We report here the whole-genome sequences of four isolates from two of the Borrelia species that cause human Lyme disease, B. afzelii isolates ACA-1 and PKo and B. garinii isolates PBr and Far04. PMID:22123755

  20. Whole-Genome Sequences of Two Borrelia afzelii and Two Borrelia garinii Lyme Disease Agent Isolates

    SciTech Connect

    Casjens, S.R.; Dunn, J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Fraser-Liggett, C. M.; Schutzer, S. E.

    2011-12-01

    Human Lyme disease is commonly caused by several species of spirochetes in the Borrelia genus. In Eurasia these species are largely Borrelia afzelii, B. garinii, B. burgdorferi, and B. bavariensis sp. nov. Whole-genome sequencing is an excellent tool for investigating and understanding the influence of bacterial diversity on the pathogenesis and etiology of Lyme disease. We report here the whole-genome sequences of four isolates from two of the Borrelia species that cause human Lyme disease, B. afzelii isolates ACA-1 and PKo and B. garinii isolates PBr and Far04.

  1. Whole-Genome Expression Analysis and Signal Pathway Screening of Synovium-Derived Mesenchymal Stromal Cells in Rheumatoid Arthritis

    PubMed Central

    Hou, Jingyi; Ouyang, Yi; Deng, Haiquan; Chen, Zhong; Song, Bin; Xie, Zhongyu; Wang, Peng; Li, Jinteng

    2016-01-01

    Synovium-derived mesenchymal stromal cells (SMSCs) may play an important role in the pathogenesis of rheumatoid arthritis (RA) and show promise for therapeutic applications in RA. In this study, a whole-genome microarray analysis was used to detect differential gene expression in SMSCs from RA patients and healthy donors (HDs). Our results showed that there were 4828 differentially expressed genes in the RA group compared to the HD group; 3117 genes were upregulated, and 1711 genes were downregulated. A Gene Ontology analysis showed significantly enriched terms of differentially expressed genes in the biological process, cellular component, and molecular function domains. A Kyoto Encyclopedia of Genes and Genomes analysis showed that the MAPK signaling and rheumatoid arthritis pathways were upregulated and that the p53 signaling pathway was downregulated in RA SMSCs. Quantitative real-time polymerase chain reaction was applied to verify the expression variations of the partial genes mentioned above, and a western blot analysis was used to determine the expression levels of p53, p-JNK, p-ERK, and p-p38. Our study found that differentially expressed genes in the MAPK signaling, rheumatoid arthritis, and p53 signaling pathways may help to explain the pathogenic mechanism of RA and lead to therapeutic RA SMSC applications.

  2. Whole-Genome Expression Analysis and Signal Pathway Screening of Synovium-Derived Mesenchymal Stromal Cells in Rheumatoid Arthritis.

    PubMed

    Hou, Jingyi; Ouyang, Yi; Deng, Haiquan; Chen, Zhong; Song, Bin; Xie, Zhongyu; Wang, Peng; Li, Jinteng; Li, Weiping; Yang, Rui

    2016-01-01

    Synovium-derived mesenchymal stromal cells (SMSCs) may play an important role in the pathogenesis of rheumatoid arthritis (RA) and show promise for therapeutic applications in RA. In this study, a whole-genome microarray analysis was used to detect differential gene expression in SMSCs from RA patients and healthy donors (HDs). Our results showed that there were 4828 differentially expressed genes in the RA group compared to the HD group; 3117 genes were upregulated, and 1711 genes were downregulated. A Gene Ontology analysis showed significantly enriched terms of differentially expressed genes in the biological process, cellular component, and molecular function domains. A Kyoto Encyclopedia of Genes and Genomes analysis showed that the MAPK signaling and rheumatoid arthritis pathways were upregulated and that the p53 signaling pathway was downregulated in RA SMSCs. Quantitative real-time polymerase chain reaction was applied to verify the expression variations of the partial genes mentioned above, and a western blot analysis was used to determine the expression levels of p53, p-JNK, p-ERK, and p-p38. Our study found that differentially expressed genes in the MAPK signaling, rheumatoid arthritis, and p53 signaling pathways may help to explain the pathogenic mechanism of RA and lead to therapeutic RA SMSC applications. PMID:27642302

  3. Natural genetic variation in whole-genome expression in Arabidopsis thaliana: the impact of physiological QTL introgression.

    PubMed

    Juenger, Thomas E; Wayne, Tierney; Boles, Sandra; Symonds, V Vaughan; McKay, John; Coughlan, Sean J

    2006-04-01

    A long-standing and fundamental question in biology is how genes influence complex phenotypes. Combining near-isogenic line mapping with genome expression profiling offers a unique opportunity for exploring the functional relationship between genotype and phenotype and for generating candidate genes for future study. We used a whole-genome microarray produced with ink-jet technology to measure the relative expression level of over 21,500 genes from an Arabidopsis thaliana near-isogenic line (NIL) and its recurrent parent. The NIL material contained two introgressions (bottom of chromosome II and top of chromosome III) of the Cvi-1 ecotype in a Ler-2 ecotype genome background. Each introgression 'captures' a Cvi allele of a physiological quantitative trait locus (QTL) that our previous studies have shown increases transpiration and reduces water-use efficiency at the whole-plant level. We used a mixed-model ANOVA framework for assessing sources of expression variability and for evaluating statistical significance in our array experiment. We discovered 25 differentially expressed genes in the introgressions at a false-discovery rate (FDR) cut-off of 0.20 and identified new candidate genes for both QTL regions. Several differentially expressed genes were confirmed with QRT-PCR (quantitative reverse transcription-polymerase chain reaction) assays. In contrast, we found no statistically significant differentially expressed genes outside of the QTL introgressions after controlling for multiple tests. We discuss these results in the context of candidate genes, cloning QTL, and phenotypic evolution.

  4. Whole-Genome Expression Analysis and Signal Pathway Screening of Synovium-Derived Mesenchymal Stromal Cells in Rheumatoid Arthritis

    PubMed Central

    Hou, Jingyi; Ouyang, Yi; Deng, Haiquan; Chen, Zhong; Song, Bin; Xie, Zhongyu; Wang, Peng; Li, Jinteng

    2016-01-01

    Synovium-derived mesenchymal stromal cells (SMSCs) may play an important role in the pathogenesis of rheumatoid arthritis (RA) and show promise for therapeutic applications in RA. In this study, a whole-genome microarray analysis was used to detect differential gene expression in SMSCs from RA patients and healthy donors (HDs). Our results showed that there were 4828 differentially expressed genes in the RA group compared to the HD group; 3117 genes were upregulated, and 1711 genes were downregulated. A Gene Ontology analysis showed significantly enriched terms of differentially expressed genes in the biological process, cellular component, and molecular function domains. A Kyoto Encyclopedia of Genes and Genomes analysis showed that the MAPK signaling and rheumatoid arthritis pathways were upregulated and that the p53 signaling pathway was downregulated in RA SMSCs. Quantitative real-time polymerase chain reaction was applied to verify the expression variations of the partial genes mentioned above, and a western blot analysis was used to determine the expression levels of p53, p-JNK, p-ERK, and p-p38. Our study found that differentially expressed genes in the MAPK signaling, rheumatoid arthritis, and p53 signaling pathways may help to explain the pathogenic mechanism of RA and lead to therapeutic RA SMSC applications. PMID:27642302

  5. Whole-Genome Transcriptional Analysis of Chemolithoautotrophic Thiosulfate Oxidation by Thiobacillus denitrificans Under Aerobic vs. Denitrifying Conditions

    SciTech Connect

    Beller, H R; Letain, T E; Chakicherla, A; Kane, S R; Legler, T C; Coleman, M A

    2006-04-22

    Thiobacillus denitrificans is one of the few known obligate chemolithoautotrophic bacteria capable of energetically coupling thiosulfate oxidation to denitrification as well as aerobic respiration. As very little is known about the differential expression of genes associated with key chemolithoautotrophic functions (such as sulfur-compound oxidation and CO2 fixation) under aerobic versus denitrifying conditions, we conducted whole-genome, cDNA microarray studies to explore this topic systematically. The microarrays identified 277 genes (approximately ten percent of the genome) as differentially expressed using Robust Multi-array Average statistical analysis and a 2-fold cutoff. Genes upregulated (ca. 6- to 150-fold) under aerobic conditions included a cluster of genes associated with iron acquisition (e.g., siderophore-related genes), a cluster of cytochrome cbb3 oxidase genes, cbbL and cbbS (encoding the large and small subunits of form I ribulose 1,5-bisphosphate carboxylase/oxygenase, or RubisCO), and multiple molecular chaperone genes. Genes upregulated (ca. 4- to 95-fold) under denitrifying conditions included nar, nir, and nor genes (associated respectively with nitrate reductase, nitrite reductase, and nitric oxide reductase, which catalyze successive steps of denitrification), cbbM (encoding form II RubisCO), and genes involved with sulfur-compound oxidation (including two physically separated but highly similar copies of sulfide:quinone oxidoreductase and of dsrC, associated with dissimilatory sulfite reductase). Among genes associated with denitrification, relative expression levels (i.e., degree of upregulation with nitrate) tended to decrease in the order nar > nir > nor > nos. Reverse transcription, quantitative PCR analysis was used to validate these trends.

  6. Animal selection for whole genome sequencing by quantifying the unique contribution of homozygous haplotypes sequenced

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...

  7. Software tool for the analysis and visualization of whole genome alignments

    2011-08-01

    GenomeVISTA is a tool that performs and displays pairwise and multiple whole genome DNA alignments. The tool provides a graphical user interface through which users can navigate alignments at multiple levels of resolution and get information about individual aligned regions. Users can load their own sequences into GenomeVISTA or view pre-computed alignments for genomes in the VISTA database.

  8. Whole-Genome Sequence of Aeromonas hydrophila Strain AH-1 (Serotype O11)

    PubMed Central

    Forn-Cuní, Gabriel; Tomás, Juan M.

    2016-01-01

    Aeromonas hydrophila is an emerging pathogen of aquatic and terrestrial animals, including humans. Here, we report the whole-genome sequence of the septicemic A. hydrophila AH-1 strain, belonging to the serotype O11, and the first mesophilic Aeromonas with surface layer (S-layer) to be sequenced. PMID:27587829

  9. Draft Whole-Genome Sequence of the Type Strain Bacillus aquimaris TF12T

    PubMed Central

    Hernández-González, Ismael L.

    2016-01-01

    Bacillus aquimaris TF12 is a Gram-positive bacterium isolated from a tidal flat of the Yellow Sea in South Korea. We report the draft whole-genome sequence of Bacillus aquimaris TF12, the type strain of a set of bacteria typically associated with marine habitats and with a potentially high biotechnology value. PMID:27417832

  10. Whole-Genome Analysis of Quorum-Sensing Burkholderia sp. Strain A9

    PubMed Central

    Chen, Jian Woon; Tee, Kok Keng; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Xin-Yue

    2015-01-01

    Burkholderia spp. rely on N-acyl homoserine lactone as quorum-sensing signal molecules which coordinate their phenotype at the population level. In this work, we present the whole genome of Burkholderia sp. strain A9, which enables the discovery of its N-acyl homoserine lactone synthase gene. PMID:25745000

  11. Clinical Application of Whole Genome Sequencing In Patients with Primary Immunodeficiency

    PubMed Central

    Mousallem, Talal; Urban, Thomas J.; McSweeney, K. Melodi; Kleinstein, Sarah E.; Zhu, Mingfu; Adeli, Mehdi; Parrott, Roberta E.; Roberts, Joseph L.; Krueger, Brian; Buckley, Rebecca H.; Goldstein, David B

    2016-01-01

    Summary This report illustrates the value of whole genome sequencing (WGS) in elucidating the genetic cause of disease in patients with primary immunodeficiency (PID). As sequencing costs decline, we predict that utilization of next generation sequencing (NGS) in the clinical setting will increase. PMID:25981738

  12. Whole-Genome Sequence of the Cheese Isolate Streptococcus macedonicus 679.

    PubMed

    Papadimitriou, Konstantinos; Mavrogonatou, Eleni; Bolotin, Alexander; Tsakalidou, Effie; Renault, Pierre

    2016-09-22

    It is well recognized that Streptococcus macedonicus can populate artisanal fermented foods, especially those of dairy origin. However, the safety of S. macedonicus remains to be established. Here, we present the whole-genome sequence of strain 679, which was isolated from a French uncooked semihard cheese made with cow milk.

  13. Whole-genome sequencing and cancer therapy: is too much ever enough?

    PubMed

    Garraway, Levi A; Baselga, José

    2012-09-01

    This issue of Cancer Discovery features an article that describes the use of whole-genome sequencing to discover an actionable genetic alteration that was not detected using a lower resolution diagnostic approach. This finding highlights the growing debate surrounding the optimal deployment of powerful new genomics technologies in the clinical oncology arena.

  14. Whole-Genome Sequence of Aeromonas hydrophila Strain AH-1 (Serotype O11).

    PubMed

    Forn-Cuní, Gabriel; Tomás, Juan M; Merino, Susana

    2016-09-01

    Aeromonas hydrophila is an emerging pathogen of aquatic and terrestrial animals, including humans. Here, we report the whole-genome sequence of the septicemic A. hydrophila AH-1 strain, belonging to the serotype O11, and the first mesophilic Aeromonas with surface layer (S-layer) to be sequenced.

  15. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    PubMed Central

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2015-01-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. PMID:26981378

  16. Whole-Genome Sequences of Nonencapsulated Haemophilus influenzae Strains Isolated in Italy

    PubMed Central

    Giufrè, Maria; De Chiara, Matteo; Censini, Stefano; Guidotti, Silvia; Torricelli, Giulia; De Angelis, Gabriella; Cardines, Rita; Pizza, Mariagrazia; Muzzi, Alessandro; Soriani, Marco

    2015-01-01

    Haemophilus influenzae is an important human pathogen involved in invasive disease. Here, we report the whole-genome sequences of 11 nonencapsulated H. influenzae (ncHi) strains isolated from both invasive disease and healthy carriers in Italy. This genomic information will enrich our understanding of the molecular basis of ncHi pathogenesis. PMID:25814593

  17. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity.

    PubMed

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  18. Whole-genome sequencing of Borrelia garinii BgVir, isolated from Taiga ticks (Ixodes persulcatus).

    PubMed

    Brenner, Evgeniy V; Kurilshikov, Alexander M; Stronin, Oleg V; Fomenko, Nataliya V

    2012-10-01

    Most Lyme borreliosis cases in Russia result from Borrelia garinii NT29 group infection. Borrelias of this group circulate exclusively in Ixodes persulcatus ticks, which are seldom found beyond Russia and the far east. Here we report the whole-genome sequence of Borrelia garinii BgVir isolated from an I. persulcatus female.

  19. Analysis of genetic systems using experimental evolution and whole-genome sequencing

    PubMed Central

    Hegreness, Matthew; Kishony, Roy

    2007-01-01

    The application of whole-genome sequencing to the study of microbial evolution promises to reveal the complex functional networks of mutations that underlie adaptation. A recent study of parallel evolution in populations of Escherichia coli shows how adaptation involves both functional changes to specific proteins as well as global changes in regulation. PMID:17274841

  20. Draft Whole-Genome Sequences of 10 Enterotoxigenic Escherichia coli Serogroup O6 Strains.

    PubMed

    Pattabiraman, Vaishnavi; Bopp, Cheryl A

    2015-06-04

    Enterotoxigenic Escherichia coli (ETEC) is an important cause of diarrhea in children under the age of 5 years and in adults living in developing countries, as well as in travelers to these countries. In this announcement, we release the draft whole-genome sequences of 10 ETEC serogroup O6 strains.

  1. Draft Whole-Genome Sequences of 10 Enterotoxigenic Escherichia coli Serogroup O6 Strains

    PubMed Central

    Bopp, Cheryl A.

    2015-01-01

    Enterotoxigenic Escherichia coli (ETEC) is an important cause of diarrhea in children under the age of 5 years and in adults living in developing countries, as well as in travelers to these countries. In this announcement, we release the draft whole-genome sequences of 10 ETEC serogroup O6 strains. PMID:26044422

  2. Whole-genome resequencing: changing the paradigms of SNP detection, molecular mapping and gene discovery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The next generation sequencing (NGS) technologies have opened a wealth of opportunities for plant breeding and genomics research, and changed the paradigms of marker detection, genotyping, and gene discovery. Abundant genomic resources have been generated using a whole genome resequencing (WGR) str...

  3. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014.

    PubMed

    Barkley, Jonathan S; Gosciminski, Michael; Miller, Adam

    2016-08-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  4. Whole-Genome Shotgun Sequencing of Lactobacillus rhamnosus MTCC 5462, a Strain with Probiotic Potential

    PubMed Central

    Prajapati, J. B.; Khedkar, C. D.; Chitra, J.; Suja, Senan; Mishra, V.; Sreeja, V.; Patel, R. K.; Ahir, V. B.; Bhatt, V. D.; Sajnani, M. R.; Koringa, P. G.; Joshi, C. G.

    2012-01-01

    Lactobacillus rhamnosus MTCC 5462 was isolated from infant gastrointestinal flora. The strain exhibited an ability to reduce cholesterol and stimulate immunity. The strain has exhibited positive results in alleviating gastrointestinal discomfort and good potential as a probiotic. We sequenced the whole genome of the strain and compared it to the published genome sequence of Lactobacillus rhamnosus GG (ATCC 53103). PMID:22328760

  5. Whole-genome shotgun sequencing of Lactobacillus rhamnosus MTCC 5462, a strain with probiotic potential.

    PubMed

    Prajapati, J B; Khedkar, C D; Chitra, J; Suja, Senan; Mishra, V; Sreeja, V; Patel, R K; Ahir, V B; Bhatt, V D; Sajnani, M R; Jakhesara, S J; Koringa, P G; Joshi, C G

    2012-03-01

    Lactobacillus rhamnosus MTCC 5462 was isolated from infant gastrointestinal flora. The strain exhibited an ability to reduce cholesterol and stimulate immunity. The strain has exhibited positive results in alleviating gastrointestinal discomfort and good potential as a probiotic. We sequenced the whole genome of the strain and compared it to the published genome sequence of Lactobacillus rhamnosus GG (ATCC 53103). PMID:22328760

  6. Whole-Genome Sequence of the Cheese Isolate Streptococcus macedonicus 679

    PubMed Central

    Mavrogonatou, Eleni; Bolotin, Alexander; Tsakalidou, Effie

    2016-01-01

    It is well recognized that Streptococcus macedonicus can populate artisanal fermented foods, especially those of dairy origin. However, the safety of S. macedonicus remains to be established. Here, we present the whole-genome sequence of strain 679, which was isolated from a French uncooked semihard cheese made with cow milk. PMID:27660795

  7. Whole-Genome Sequence of Aeromonas hydrophila Strain AH-1 (Serotype O11).

    PubMed

    Forn-Cuní, Gabriel; Tomás, Juan M; Merino, Susana

    2016-01-01

    Aeromonas hydrophila is an emerging pathogen of aquatic and terrestrial animals, including humans. Here, we report the whole-genome sequence of the septicemic A. hydrophila AH-1 strain, belonging to the serotype O11, and the first mesophilic Aeromonas with surface layer (S-layer) to be sequenced. PMID:27587829

  8. Whole-Genome Sequence of the Cheese Isolate Streptococcus macedonicus 679.

    PubMed

    Papadimitriou, Konstantinos; Mavrogonatou, Eleni; Bolotin, Alexander; Tsakalidou, Effie; Renault, Pierre

    2016-01-01

    It is well recognized that Streptococcus macedonicus can populate artisanal fermented foods, especially those of dairy origin. However, the safety of S. macedonicus remains to be established. Here, we present the whole-genome sequence of strain 679, which was isolated from a French uncooked semihard cheese made with cow milk. PMID:27660795

  9. Draft Whole-Genome Sequence of the Type Strain Bacillus aquimaris TF12T.

    PubMed

    Hernández-González, Ismael L; Olmedo-Álvarez, Gabriela

    2016-07-14

    Bacillus aquimaris TF12 is a Gram-positive bacterium isolated from a tidal flat of the Yellow Sea in South Korea. We report the draft whole-genome sequence of Bacillus aquimaris TF12, the type strain of a set of bacteria typically associated with marine habitats and with a potentially high biotechnology value.

  10. Draft Whole-Genome Sequence of the Type Strain Bacillus horikoshii DSM 8719

    PubMed Central

    Hernández-González, Ismael L.

    2016-01-01

    Members of the Bacillus genus have been extensively studied because of their ability to produce enzymes with high biotechnological value. Here, we report the draft of the whole-genome sequence of the type strain Bacillus horikoshii DSM 8719, an alkali-tolerant strain. PMID:27417833

  11. Draft Whole-Genome Sequence of the Type Strain Bacillus horikoshii DSM 8719.

    PubMed

    Hernández-González, Ismael L; Olmedo-Álvarez, Gabriela

    2016-07-14

    Members of the Bacillus genus have been extensively studied because of their ability to produce enzymes with high biotechnological value. Here, we report the draft of the whole-genome sequence of the type strain Bacillus horikoshii DSM 8719, an alkali-tolerant strain.

  12. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014

    PubMed Central

    Gosciminski, Michael; Miller, Adam

    2016-01-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  13. Whole-Genome Sequences of Nonencapsulated Haemophilus influenzae Strains Isolated in Italy.

    PubMed

    Giufrè, Maria; De Chiara, Matteo; Censini, Stefano; Guidotti, Silvia; Torricelli, Giulia; De Angelis, Gabriella; Cardines, Rita; Pizza, Mariagrazia; Muzzi, Alessandro; Cerquetti, Marina; Soriani, Marco

    2015-01-01

    Haemophilus influenzae is an important human pathogen involved in invasive disease. Here, we report the whole-genome sequences of 11 nonencapsulated H. influenzae (ncHi) strains isolated from both invasive disease and healthy carriers in Italy. This genomic information will enrich our understanding of the molecular basis of ncHi pathogenesis.

  14. Diagnosis of Capnocytophaga canimorsus Sepsis by Whole-Genome Next-Generation Sequencing

    PubMed Central

    Abril, Maria K.; Barnett, Adam S.; Wegermann, Kara; Fountain, Eric; Strand, Andrew; Heyman, Benjamin M.; Blough, Britton A.; Swaminathan, Aparna C.; Sharma-Kuinkel, Batu; Ruffin, Felicia; Alexander, Barbara D.; McCall, Chad M.; Costa, Sylvia F.; Arcasoy, Murat O.; Hong, David K.; Blauwkamp, Timothy A.; Kertesz, Michael; Fowler, Vance G.; Kraft, Bryan D.

    2016-01-01

    We report the case of a 60-year-old man with septic shock due to Capnocytophaga canimorsus that was diagnosed in 24 hours by a novel whole-genome next-generation sequencing assay. This technology shows great promise in identifying fastidious pathogens, and, if validated, it has profound implications for infectious disease diagnosis. PMID:27704003

  15. Spiked GBS: A unified, open platform for single marker genotyping and whole-genome profiling

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In plant breeding, there are two primary applications for DNA markers in selection: 1) selection of known genes using a single marker assay (marker-assisted selection; MAS); and 2) whole-genome profiling and prediction (genomic selection; GS). Typically, marker platforms have addressed only one of t...

  16. Isolation and enrichment of Cryptosporidium DNA and verification of DNA purity for whole-genome sequencing.

    PubMed

    Guo, Yaqiong; Li, Na; Lysén, Colleen; Frace, Michael; Tang, Kevin; Sammons, Scott; Roellig, Dawn M; Feng, Yaoyu; Xiao, Lihua

    2015-02-01

    Whole-genome sequencing of Cryptosporidium spp. is hampered by difficulties in obtaining sufficient, highly pure genomic DNA from clinical specimens. In this study, we developed procedures for the isolation and enrichment of Cryptosporidium genomic DNA from fecal specimens and verification of DNA purity for whole-genome sequencing. The isolation and enrichment of genomic DNA were achieved by a combination of three oocyst purification steps and whole-genome amplification (WGA) of DNA from purified oocysts. Quantitative PCR (qPCR) analysis of WGA products was used as an initial quality assessment of amplified genomic DNA. The purity of WGA products was assessed by Sanger sequencing of cloned products. Next-generation sequencing tools were used in final evaluations of genome coverage and of the extent of contamination. Altogether, 24 fecal specimens of Cryptosporidium parvum, C. hominis, C. andersoni, C. ubiquitum, C. tyzzeri, and Cryptosporidium chipmunk genotype I were processed with the procedures. As expected, WGA products with low (<16.0) threshold cycle (CT) values yielded mostly Cryptosporidium sequences in Sanger sequencing. The cloning-sequencing analysis, however, showed significant contamination in 5 WGA products (proportion of positive colonies derived from Cryptosporidium genomic DNA, ≤25%). Following this strategy, 20 WGA products from six Cryptosporidium species or genotypes with low (mostly <14.0) CT values were submitted to whole-genome sequencing, generating sequence data covering 94.5% to 99.7% of Cryptosporidium genomes, with mostly minor contamination from bacterial, fungal, and host DNA. These results suggest that the described strategy can be used effectively for the isolation and enrichment of Cryptosporidium DNA from fecal specimens for whole-genome sequencing.

  17. Whole-genome conditional two-locus analysis identifies novel candidate genes for late-onset Parkinson's disease.

    PubMed

    González-Pérez, A; Gayán, J; Marín, J; Galán, J J; Sáez, M E; Real, L M; Antúnez, C; Ruiz, A

    2009-07-01

    Whole-genome epistasis analysis may add a new layer of knowledge to whole-genome association studies, permitting the identification of new candidate genes which are completely transparent during conventional single-locus analysis. We present the first whole-genome conditional two-locus analysis in Parkinson's disease (PD). We scanned the entire genome and selected markers that interacted with a set of well-known loci previously associated to PD (SNCA, Parkin, LRRK2, UCHL1, DJ-1, PINK and MAPT). Our work describes several loci potentially related to PD risk which interact with SNCA, PARK1 and LRRK2 markers. We propose conditional whole-genome two-locus association analysis as a valuable method that might be helpful in re-analysing and re-interpreting data from whole-genome association studies.

  18. Whole genome sequencing of Mycobacterium tuberculosis SB24 isolated from Sabah, Malaysia.

    PubMed

    Philip, Noraini; Rodrigues, Kenneth Francis; William, Timothy; John, Daisy Vanitha

    2016-09-01

    Mycobacterium tuberculosis (M. tuberculosis) is the causative agent of tuberculosis (TB), which causes millions of deaths every year. We have sequenced the genome of M. tuberculosis isolated from cerebrospinal fluid (CSF) of a patient diagnosed with tuberculous meningitis (TBM). The isolated strain was referred to as M. tuberculosis SB24. Genomic DNA of M. tuberculosis SB24 was extracted and subjected to whole genome sequencing using the PacBio platform. The draft genome size of M. tuberculosis SB24 was determined to be 4,452,489 bp with a G + C content of 65.6%. The whole genome shotgun project has been deposited in NCBI SRA under the accession number SRP076503. PMID:27556011

  19. Whole genome multilocus sequence typing as an epidemiologic tool for Yersinia pestis.

    PubMed

    Kingry, Luke C; Rowe, Lori A; Respicio-Kingry, Laurel B; Beard, Charles B; Schriefer, Martin E; Petersen, Jeannine M

    2016-04-01

    Human plague is a severe and often fatal zoonotic disease caused by Yersinia pestis. For public health investigations of human cases, nonintensive whole genome molecular typing tools, capable of defining epidemiologic relationships, are advantageous. Whole genome multilocus sequence typing (wgMLST) is a recently developed methodology that simplifies genomic analyses by transforming millions of base pairs of sequence into character data for each gene. We sequenced 13 US Y. pestis isolates with known epidemiologic relationships. Sequences were assembled de novo, and multilocus sequence typing alleles were assigned by comparison against 3979 open reading frames from the reference strain CO92. Allele-based cluster analysis accurately grouped the 13 isolates, as well as 9 publicly available Y. pestis isolates, by their epidemiologic relationships. Our findings indicate wgMLST is a simplified, sensitive, and scalable tool for epidemiologic analysis of Y. pestis strains. PMID:26778487
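
    The idea of reducing each genome to per-gene allele characters and then comparing isolates by those characters can be sketched as follows. This is only an illustration under stated assumptions: each isolate is represented as a list of allele numbers in a fixed locus order, None marks a locus with no allele call, and distance is the count of loci with differing alleles. The isolate names and profiles are hypothetical, and the study's actual clustering method is not reproduced here.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ, skipping loci missing in either profile."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci in the same order")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def pairwise_distances(profiles):
    """Return {(name1, name2): distance} for every unordered pair of isolates."""
    return {
        (n1, n2): allele_distance(profiles[n1], profiles[n2])
        for n1, n2 in combinations(sorted(profiles), 2)
    }


# Hypothetical allele profiles over four loci; None means no allele was called.
profiles = {
    "iso1": [1, 1, 2, 3],
    "iso2": [1, 1, 2, 4],
    "iso3": [2, 5, 2, None],
}

distances = pairwise_distances(profiles)
for pair, dist in distances.items():
    print(pair, dist)
```

    Isolates separated by few allele differences would be grouped together by any allele-based clustering built on such a matrix.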

  20. A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease.

    PubMed

    Smedley, Damian; Schubach, Max; Jacobsen, Julius O B; Köhler, Sebastian; Zemojtel, Tomasz; Spielmann, Malte; Jäger, Marten; Hochheiser, Harry; Washington, Nicole L; McMurry, Julie A; Haendel, Melissa A; Mungall, Christopher J; Lewis, Suzanna E; Groza, Tudor; Valentini, Giorgio; Robinson, Peter N

    2016-09-01

    The interpretation of non-coding variants still constitutes a major challenge in the application of whole-genome sequencing in Mendelian disease, especially for single-nucleotide and other small non-coding variants. Here we present Genomiser, an analysis framework that is able not only to score the relevance of variation in the non-coding genome, but also to associate regulatory variants to specific Mendelian diseases. Genomiser scores variants through either existing methods such as CADD or a bespoke machine learning method and combines these with allele frequency, regulatory sequences, chromosomal topological domains, and phenotypic relevance to discover variants associated to specific Mendelian disorders. Overall, Genomiser is able to identify causal regulatory variants as the top candidate in 77% of simulated whole genomes, allowing effective detection and discovery of regulatory variants in Mendelian disease. PMID:27569544

  1. Real-time investigation of a Legionella pneumophila outbreak using whole genome sequencing.

    PubMed

    Graham, R M A; Doyle, C J; Jennison, A V

    2014-11-01

    Legionella pneumophila is the main pathogen responsible for outbreaks of Legionnaires' disease, which can be related to contaminated water supplies such as cooling towers or water pipes. We combined conventional molecular methods and whole genome sequence (WGS) analysis to investigate an outbreak of L. pneumophila in a large Australian hospital. Typing of these isolates using sequence-based typing and virulence gene profiling, was unable to discriminate between outbreak and non-outbreak isolates. WGS analysis was performed on isolates during the outbreak, as well as on unlinked isolates from the Public Health Microbiology reference collection. The more powerful resolution provided by analysis of whole genome sequences allowed outbreak isolates to be distinguished from isolates that were temporally and spatially unassociated with the outbreak, demonstrating that this technology can be used in real-time to investigate L. pneumophila outbreaks. PMID:24576553

  2. A whole-genome, radiation hybrid mapping resource of hexaploid wheat.

    PubMed

    Tiwari, Vijay K; Heesacker, Adam; Riera-Lizarazu, Oscar; Gunn, Hilary; Wang, Shichen; Wang, Yi; Gu, Young Q; Paux, Etienne; Koo, Dal-Hoe; Kumar, Ajay; Luo, Ming-Cheng; Lazo, Gerard; Zemetra, Robert; Akhunov, Eduard; Friebe, Bernd; Poland, Jesse; Gill, Bikram S; Kianian, Shahryar; Leonard, Jeffrey M

    2016-04-01

    Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 GB) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole-genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics-based approaches. However, comparative genomics approaches are limited to syntenic inference and recombination is suppressed within the pericentromeric regions of wheat chromosomes, thus, precise ordering of physical maps and sequenced contigs across the whole-genome using these approaches is nearly impossible. We developed a whole-genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high-density single nucleotide polymorphism (SNP) array. At the whole-genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500. The 7296 unique mapping bins provided a five- to eight-fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low-cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS-WGRH), will be useful for anchoring and ordering sequenced BAC and NGS based contigs for assembling a high-quality, reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species.

  3. Inferring Demography from Runs of Homozygosity in Whole-Genome Sequence, with Correction for Sequence Errors

    PubMed Central

    MacLeod, Iona M.; Larkin, Denis M.; Lewin, Harris A.; Hayes, Ben J.; Goddard, Mike E.

    2013-01-01

    Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (Ne) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493–496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in Ne. The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the Ne around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with Ne of between 3,500 and 6,000. The most recent reduction of Ne to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals. PMID:23842528
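
    To make the notion of an observed distribution of runs of homozygosity concrete, the sketch below extracts maximal runs of consecutive homozygous sites from a per-site heterozygosity flag vector. It does not implement the analytical multilocus LD predictor or the error correction described in the abstract. The input encoding, the min_len parameter, and the example data are assumptions made for illustration.

```python
def runs_of_homozygosity(is_het, min_len=1):
    """Return half-open (start, end) index intervals of consecutive homozygous sites.

    is_het is a sequence of booleans, True where the individual is heterozygous.
    Runs shorter than min_len sites are discarded.
    """
    runs = []
    start = None
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i  # a new homozygous run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(is_het) - start >= min_len:
        runs.append((start, len(is_het)))
    return runs


# Hypothetical per-site calls: True marks a heterozygous site.
calls = [False, False, True, False, False, False, True, True, False]

all_runs = runs_of_homozygosity(calls)
long_runs = runs_of_homozygosity(calls, min_len=2)
run_lengths = [end - start for start, end in all_runs]
print(all_runs, run_lengths)
```

    The list of run lengths is the kind of summary that would be compared against a model prediction. A single miscalled heterozygous site splits one long run into two shorter ones, which is why undetected sequencing errors can bias such a distribution.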

  4. Whole-Genome Sequence of Chlamydia gallinacea Type Strain 08-1274/3

    PubMed Central

    Hölzer, Martin; Laroucau, Karine; Creasy, Heather Huot; Ott, Sandra; Vorimore, Fabien; Bavoil, Patrik M.; Marz, Manja

    2016-01-01

    The recently introduced bacterial species Chlamydia gallinacea is known to occur in domestic poultry and other birds. Its potential as an avian pathogen and zoonotic agent is under investigation. The whole-genome sequence of its type strain, 08-1274/3, consists of a 1,059,583-bp chromosome with 914 protein-coding sequences (CDSs) and a plasmid (p1274) comprising 7,619 bp with 9 CDSs. PMID:27445388

  5. Detection and phylogenetic assessment of conserved synteny derived from whole genome duplications.

    PubMed

    Kuraku, Shigehiro; Meyer, Axel

    2012-01-01

    Identification of intragenomic conservation of gene compositions in multiple chromosomal segments led to evidence of whole genome duplications (WGDs). The process by which WGDs have been maintained and decayed provides us with clues for understanding how the genome evolves. In this chapter, we summarize current understanding of phylogenetic distribution and evolutionary impact of WGDs, introduce basic procedures to detect conserved synteny, and discuss typical pitfalls, as well as biological insights. PMID:22407717

  6. Whole-Genome Sequence of Rummeliibacillus stabekisii Strain PP9 Isolated from Antarctic Soil

    PubMed Central

    da Mota, Fábio Faria; Vollú, Renata Estebanez; Jurelevicius, Diogo

    2016-01-01

    The whole genome of Rummeliibacillus stabekisii PP9, isolated from a soil sample from Antarctica, consists of a circular chromosome of 3,412,092 bp and a circular plasmid of 8,647 bp, with 3,244 protein-coding genes, 12 copies of the 16S-23S-5S rRNA operon, 101 tRNA genes, and 6 noncoding RNAs (ncRNAs). PMID:27231360

  7. Whole-Genome Sequencing in Microbial Forensic Analysis of Gamma-Irradiated Microbial Materials

    PubMed Central

    Broomall, Stacey M.; Ait Ichou, Mohamed; Krepps, Michael D.; Johnsky, Lauren A.; Karavis, Mark A.; Hubbard, Kyle S.; Insalaco, Joseph M.; Betters, Janet L.; Redmond, Brady W.; Rivers, Bryan A.; Liem, Alvin T.; Hill, Jessica M.; Fochler, Edward T.; Roth, Pierce A.; Rosenzweig, C. Nicole; Skowronski, Evan W.

    2015-01-01

    Effective microbial forensic analysis of materials used in a potential biological attack requires robust methods of morphological and genetic characterization of the attack materials in order to enable the attribution of the materials to potential sources and to exclude other potential sources. The genetic homogeneity and potential intersample variability of many of the category A to C bioterrorism agents offer a particular challenge to the generation of attributive signatures, potentially requiring whole-genome or proteomic approaches to be utilized. Currently, irradiation of mail is standard practice at several government facilities judged to be at particularly high risk. Thus, initial forensic signatures would need to be recovered from inactivated (nonviable) material. In the study described in this report, we determined the effects of high-dose gamma irradiation on forensic markers of bacterial biothreat agent surrogate organisms with a particular emphasis on the suitability of genomic DNA (gDNA) recovered from such sources as a template for whole-genome analysis. While irradiation of spores and vegetative cells affected the retention of Gram and spore stains and sheared gDNA into small fragments, we found that irradiated material could be utilized to generate accurate whole-genome sequence data on the Illumina and Roche 454 sequencing platforms. PMID:26567301

  8. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    PubMed Central

    Shi, Jingsong; Jiang, Song; Qiu, Dandan; Le, Weibo; Wang, Xiao; Lu, Yinhui; Liu, Zhihong

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR < 0.05 and fold change > 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN. PMID:27069916
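    The DEG thresholds quoted above (FDR < 0.05 and fold change > 1.5) can be illustrated with a minimal sketch. The abstract does not say how the FDR was computed or how down-regulation was handled, so the Benjamini-Hochberg adjustment, the symmetric treatment of fold change, and the toy gene names and values below are all assumptions made for illustration, not the authors' pipeline.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, fdr_cutoff=0.05, fc_cutoff=1.5):
    """genes: list of (name, fold_change, p_value); fold_change is a linear ratio > 0."""
    q_values = benjamini_hochberg([g[2] for g in genes])
    selected = []
    for (name, fold_change, _), q_value in zip(genes, q_values):
        # Assumption: a ratio of 1/1.5 counts as a 1.5-fold change (down-regulation).
        magnitude = fold_change if fold_change >= 1.0 else 1.0 / fold_change
        if q_value < fdr_cutoff and magnitude > fc_cutoff:
            selected.append(name)
    return selected


# Hypothetical late-stage vs early-stage comparison; names and numbers are made up.
toy = [
    ("GENE_A", 2.4, 0.0004),
    ("GENE_B", 0.5, 0.003),
    ("GENE_C", 1.2, 0.001),
    ("GENE_D", 3.0, 0.20),
]
print(select_degs(toy))  # ['GENE_A', 'GENE_B']
```

    GENE_C is significant but changes too little, and GENE_D changes a lot but is not significant, so both filters are needed to arrive at a list like the 1065 DEGs reported.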

  9. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  10. Whole-genome sequencing and the clinician: a tale of two cities

    PubMed Central

    Foley, A Reghan; Pitceathly, Robert D S; He, Jie; Kim, Jihee; Pearson, Nathaniel M; Muntoni, Francesco; Hanna, Michael G

    2014-01-01

    Background. Clinicians are faced with unprecedented opportunities to identify the genetic aetiologies of hitherto molecularly uncharacterised conditions via the use of high-throughput sequencing. Access to genomic technology and resultant data is no longer limited to clinicians, geneticists and bioinformaticians, however; ongoing commercialisation gives patients themselves ever greater access to sequencing services. We report an increasingly common medical scenario by describing two neuromuscular patients, a mother and adult son, whose consumer access to whole-genome sequencing affected their diagnostic journey. Results. Whole-genome sequencing initiated by the patients, to predict their risk of common diseases, revealed that they share several variants potentially relevant to neuromuscular diseases, which initially sidetracked diagnostic efforts. Since eventual clinical reassessment, including muscle imaging, pointed towards Bethlem myopathy, a collagen VI-related myopathy, we pursued Sanger sequencing of COL6A1, COL6A2 and COL6A3. This targeted approach revealed a heterozygous causative variant in COL6A3 (c.6365G>T (p.Gly2122Val)), shared by both individuals, that was not flagged by the interpretation of the whole-genome sequencing data. Conclusions. This report highlights the essential interplay of clinical and genomic expertise in realising the potential of high-throughput sequencing. In an era when patients themselves may bring their own data to the table, definitively identifying clinically significant genomic variants will require close collaboration among clinicians, geneticists and bioinformaticians. PMID:24706943

  11. FIGG: Simulating populations of whole genome sequences for heterogeneous data analyses

    PubMed Central

    2014-01-01

    Background. High-throughput sequencing has become one of the primary tools for investigation of the molecular basis of disease. The increasing use of sequencing in investigations that aim to understand both individuals and populations is challenging our ability to develop analysis tools that scale with the data. This issue is of particular concern in studies that exhibit a wide degree of heterogeneity or deviation from the standard reference genome. The advent of population scale sequencing studies requires analysis tools that are developed and tested against matching quantities of heterogeneous data. Results. We developed a large-scale whole genome simulation tool, FIGG, which generates large numbers of whole genomes with known sequence characteristics based on direct sampling of experimentally known or theorized variations. For normal variations we used publicly available data to determine the frequency of different mutation classes across the genome. FIGG then uses this information as a background to generate new sequences from a parent sequence with matching frequencies, but different actual mutations. The background can be normal variations, known disease variations, or a theoretical frequency distribution of variations. Conclusion. In order to enable the creation of large numbers of genomes, FIGG generates simulated sequences from known genomic variation and iteratively mutates each genome separately. The result is multiple whole genome sequences with unique variations that can be used primarily to provide different reference genomes, model heterogeneous populations, and offer a standard test environment for new analysis algorithms or bioinformatics tools. PMID:24885193
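    The idea of generating new sequences from a parent sequence with matching mutation-class frequencies but different actual mutations can be sketched in a few lines. This is not FIGG's implementation: the three mutation classes, the per-base frequencies in BACKGROUND, and the function names are hypothetical choices made only to show the sampling step.

```python
import random

# Hypothetical per-base frequencies for three mutation classes (assumed values).
BACKGROUND = {"snv": 0.01, "deletion": 0.002, "insertion": 0.002}


def mutate(parent, background, rng):
    """Return a child sequence whose mutation classes are sampled from `background`."""
    snv_limit = background["snv"]
    del_limit = snv_limit + background["deletion"]
    ins_limit = del_limit + background["insertion"]
    child = []
    for base in parent:
        roll = rng.random()
        if roll < snv_limit:
            # Substitute with one of the three other bases.
            child.append(rng.choice([b for b in "ACGT" if b != base]))
        elif roll < del_limit:
            continue  # delete this base
        elif roll < ins_limit:
            child.append(base)
            child.append(rng.choice("ACGT"))  # insert one random base after it
        else:
            child.append(base)
    return "".join(child)


def simulate_population(parent, n, seed=0):
    """Mutate the same parent n times; each genome gets its own mutations."""
    rng = random.Random(seed)
    return [mutate(parent, BACKGROUND, rng) for _ in range(n)]


parent = "ACGT" * 250  # toy 1 kb parent sequence
population = simulate_population(parent, 5)
for genome in population:
    print(len(genome), genome[:20])
```

    Because every genome is drawn from the same background, the population shares its mutation frequencies while each member carries a different set of variants. Swapping BACKGROUND for disease-specific or theoretical frequencies would mirror the alternative backgrounds mentioned in the abstract.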

  12. Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

    PubMed

    Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

    2016-09-01

    Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. PMID:27307293

  13. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data

    PubMed Central

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. Method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  14. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data.

    PubMed

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. Method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences.

  15. Mudi, a web tool for identifying mutations by bioinformatics analysis of whole-genome sequence.

    PubMed

    Iida, Naoko; Yamao, Fumiaki; Nakamura, Yasukazu; Iida, Tetsushi

    2014-06-01

    In forward genetics, identification of mutations is a time-consuming and laborious process. Modern whole-genome sequencing, coupled with bioinformatics analysis, has enabled fast and cost-effective mutation identification. However, for many experimental researchers, bioinformatics analysis is still a difficult aspect of whole-genome sequencing. To address this issue, we developed a browser-accessible and easy-to-use bioinformatics tool called Mutation discovery (Mudi; http://naoii.nig.ac.jp/mudi_top.html), which enables 'one-click' identification of causative mutations from whole-genome sequence data. In this study, we optimized Mudi for pooled-linkage analysis aimed at identifying mutants in yeast model systems. After raw sequencing data are uploaded, Mudi performs sequential analysis, including mapping, detection of variant alleles, filtering and removal of background polymorphisms, prioritization, and annotation. In an example study of suppressor mutants of ptr1-1 in the fission yeast Schizosaccharomyces pombe, pooled-linkage analysis with Mudi identified mip1(+), a component of Target of Rapamycin Complex 1 (TORC1), as a novel component involved in RNA interference (RNAi)-related cell-cycle control. The accessibility of Mudi will accelerate systematic mutation analysis in forward genetics.

  16. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  17. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data.

    PubMed

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. Method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  18. Assessment of Whole Genome Amplification for Sequence Capture and Massively Parallel Sequencing

    PubMed Central

    Hasmats, Johanna; Gréen, Henrik; Orear, Cedric; Validire, Pierre; Huss, Mikael; Käller, Max; Lundeberg, Joakim

    2014-01-01

    Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information. PMID:24409309
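    The two overlap figures in this abstract (74% and 89%) use different denominators, which a small set-based sketch makes explicit. The reading below is an assumption: the first figure is taken as shared variants over all variants seen in either data set, and the second as shared variants over those found in the amplified data. The variant tuples are invented for illustration.

```python
def overlap_summary(unamplified, amplified):
    """Compare two variant call sets given as sets of (chrom, pos, ref, alt) tuples."""
    shared = unamplified & amplified
    union = unamplified | amplified
    return {
        # Fraction of all observed variants that appear in both data sets.
        "shared_of_all": len(shared) / len(union),
        # Fraction of the amplified-DNA variants that are also in unamplified DNA.
        "shared_of_amplified": len(shared) / len(amplified),
    }


# Hypothetical call sets for one tumor sample.
unamplified = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr3", 7, "T", "C"),
}
amplified = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "A"),
    ("chr5", 90, "A", "T"),
}

print(overlap_summary(unamplified, amplified))
# {'shared_of_all': 0.6, 'shared_of_amplified': 0.75}
```

    The second ratio is always at least as large as the first, which is consistent with the abstract reporting 89% alongside 74%.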

  19. Assessment of whole genome amplification for sequence capture and massively parallel sequencing.

    PubMed

    Hasmats, Johanna; Gréen, Henrik; Orear, Cedric; Validire, Pierre; Huss, Mikael; Käller, Max; Lundeberg, Joakim

    2014-01-01

    Exome sequence capture and massively parallel sequencing can be combined to achieve inexpensive and rapid global analyses of the functional sections of the genome. The difficulties of working with relatively small quantities of genetic material, as may be necessary when sharing tumor biopsies between collaborators for instance, can be overcome using whole genome amplification. However, the potential drawbacks of using a whole genome amplification technology based on random primers in combination with sequence capture followed by massively parallel sequencing have not yet been examined in detail, especially in the context of mutation discovery in tumor material. In this work, we compare mutations detected in sequence data for unamplified DNA, whole genome amplified DNA, and RNA originating from the same tumor tissue samples from 16 patients diagnosed with non-small cell lung cancer. The results obtained provide a comprehensive overview of the merits of these techniques for mutation analysis. We evaluated the identified genetic variants, and found that most (74%) of them were observed in both the amplified and the unamplified sequence data. Eighty-nine percent of the variations found by WGA were shared with unamplified DNA. We demonstrate a strategy for avoiding allelic bias by including RNA-sequencing information.

  20. Whole-genome sequencing of uropathogenic Escherichia coli reveals long evolutionary history of diversity and virulence.

    PubMed

    Lo, Yancy; Zhang, Lixin; Foxman, Betsy; Zöllner, Sebastian

    2015-08-01

    Uropathogenic Escherichia coli (UPEC) are phenotypically and genotypically very diverse. This diversity makes it challenging to understand the evolution of UPEC adaptations responsible for causing urinary tract infections (UTI). To gain insight into the relationship between evolutionary divergence and adaptive paths to uropathogenicity, we sequenced at deep coverage (190×) the genomes of 19 E. coli strains from urinary tract infection patients from the same geographic area. Our sample consisted of 14 UPEC isolates and 5 non-UTI-causing (commensal) rectal E. coli isolates. After identifying strain variants using de novo assembly-based methods, we clustered the strains based on pairwise sequence differences using a neighbor-joining algorithm. We examined evolutionary signals on the whole-genome phylogeny and contrasted these signals with those found on gene trees constructed based on specific uropathogenic virulence factors. The whole-genome phylogeny showed that the divergence between UPEC and commensal E. coli strains without known UPEC virulence factors happened over 32 million generations ago. Pairwise diversity between any two strains was also high, suggesting multiple genetic origins of uropathogenic strains in a small geographic region. Contrasting the whole-genome phylogeny with three gene trees constructed from common uropathogenic virulence factors, we detected no selective advantage of these virulence genes over other genomic regions. These results suggest that UPEC acquired uropathogenicity a long time ago and used it opportunistically to cause extraintestinal infections.
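    Clustering strains on pairwise sequence differences with neighbor joining starts from a distance matrix and a criterion for choosing the first pair to merge. The sketch below covers only those two steps, not a full tree builder, and it assumes pre-aligned, equal-length sequences. The strain names and sequences are invented, and the study's own variant-calling and clustering code is not described in the abstract.

```python
def pairwise_differences(seqs):
    """Count mismatching positions between every pair of equal-length sequences."""
    n = len(seqs)
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(1 for a, b in zip(seqs[i], seqs[j]) if a != b)
            dist[i][j] = dist[j][i] = d
    return dist


def first_join(dist):
    """Return the pair (i, j) minimising the neighbor-joining Q criterion.

    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair


# Hypothetical aligned fragments from four strains.
names = ["strain_a", "strain_b", "strain_c", "strain_d"]
seqs = ["ACGTACGTAC", "ACGTACGTAA", "TCGAACGGAC", "TCGAACGGTT"]

dist = pairwise_differences(seqs)
i, j = first_join(dist)
print(dist)
print("first pair joined:", names[i], names[j])
```

    A complete neighbor-joining run would replace the joined pair with a new internal node, recompute distances to it, and repeat until the tree is resolved.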

  1. Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

    PubMed

    Wong, Lai-Ping; Lai, Jason Kuan-Han; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Cheng, Anthony Youzhi; Pillai, Nisha Esakimuthu; Liu, Xuanyao; Xu, Wenting; Chen, Peng; Foo, Jia-Nee; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Soong, Richie; Wenk, Markus Rene; Lim, Wei-Yen; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2014-05-01

    South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.

  2. Whole genome transcription profiling of Anaplasma phagocytophilum in human and tick host cells by tiling array analysis

    PubMed Central

    Nelson, Curtis M; Herron, Michael J; Felsheim, Roderick F; Schloeder, Brian R; Grindle, Suzanne M; Chavez, Adela Oliva; Kurtti, Timothy J; Munderloh, Ulrike G

    2008-01-01

    Background. Anaplasma phagocytophilum (Ap) is an obligate intracellular bacterium and the agent of human granulocytic anaplasmosis, an emerging tick-borne disease. Ap alternately infects ticks and mammals and a variety of cell types within each. Understanding the biology behind such versatile cellular parasitism may be derived through the use of tiling microarrays to establish high resolution, genome-wide transcription profiles of the organism as it infects cell lines representative of its life cycle (tick; ISE6) and pathogenesis (human; HL-60 and HMEC-1). Results. Detailed, host cell specific transcriptional behavior was revealed. There was extensive differential Ap gene transcription between the tick (ISE6) and the human (HL-60 and HMEC-1) cell lines, with far fewer differentially transcribed genes between the human cell lines, and all disproportionately represented by membrane or surface proteins. There were Ap genes exclusively transcribed in each cell line, apparent human- and tick-specific operons and paralogs, and anti-sense transcripts that suggest novel expression regulation processes. Seven virB2 paralogs (of the bacterial type IV secretion system) showed human or tick cell dependent transcription. Previously unrecognized genes and coding sequences were identified, as were the expressed p44/msp2 (major surface proteins) paralogs (of 114 total), through elevated signal produced to the unique hypervariable region of each – 2/114 in HL-60, 3/114 in HMEC-1, and none in ISE6. Conclusion. Using these methods, whole genome transcription profiles can likely be generated for Ap, as well as other obligate intracellular organisms, in any host cells and for all stages of the cell infection process. Visual representation of comprehensive transcription data alongside an annotated map of the genome renders complex transcription into discernable patterns. PMID:18671858

  3. Analysis of Campylobacter jejuni whole genome DNA microarrays: significance of prophage and hypervariable regions for discriminating isolates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Introduction: Campylobacter jejuni is a major cause of gastroenteritis in humans and is carried in many common food animals. In order to reduce human infections a better understanding of Campylobacter epidemiology is needed. Identifying genes that enable discriminating between isolates is an importa...

  4. Whole-Genome Sequence of Enteractinococcus helveticum sp. nov. Strain UASWS1574 Isolated from Industrial Used Waters

    PubMed Central

    Crovadore, Julien; Calmin, Gautier; Chablais, Romain; Cochard, Bastien

    2016-01-01

    We report here the whole-genome shotgun sequences of the strain UASWS1574 of the undescribed Enteractinococcus helveticum sp. nov., isolated from used water. This is the first genome registered for the whole genus. PMID:27469945

  5. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    FitzGerald, Michael

    2012-06-01

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  6. A rapid whole genome sequencing and analysis system supporting genomic epidemiology (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    FitzGerald, Michael [Broad Institute]

    2016-07-12

    Michael FitzGerald on "A rapid whole genome sequencing and analysis system supporting genomic epidemiology" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  7. A generic assay for whole-genome amplification and deep sequencing of enterovirus A71

    PubMed Central

    Tan, Le Van; Tuyen, Nguyen Thi Kim; Thanh, Tran Tan; Ngan, Tran Thuy; Van, Hoang Minh Tu; Sabanathan, Saraswathy; Van, Tran Thi My; Thanh, Le Thi My; Nguyet, Lam Anh; Geoghegan, Jemma L.; Ong, Kien Chai; Perera, David; Hang, Vu Thi Ty; Ny, Nguyen Thi Han; Anh, Nguyen To; Ha, Do Quang; Qui, Phan Tu; Viet, Do Chau; Tuan, Ha Manh; Wong, Kum Thong; Holmes, Edward C.; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H. Rogier

    2015-01-01

    Enterovirus A71 (EV-A71) has emerged as the most important cause of large outbreaks of severe and sometimes fatal hand, foot and mouth disease (HFMD) across the Asia-Pacific region. EV-A71 outbreaks have been associated with (sub)genogroup switches, sometimes accompanied by recombination events. Understanding EV-A71 population dynamics is therefore essential for understanding this emerging infection, and may provide pivotal information for vaccine development. Despite the public health burden of EV-A71, relatively few EV-A71 complete-genome sequences are available for analysis and from limited geographical localities. The availability of an efficient procedure for whole-genome sequencing would stimulate effort to generate more viral sequence data. Herein, we report for the first time the development of a next-generation sequencing based protocol for whole-genome sequencing of EV-A71 directly from clinical specimens. We were able to sequence viruses of subgenogroup C4 and B5, while RNA from culture materials of diverse EV-A71 subgenogroups belonging to both genogroup B and C was successfully amplified. The nature of intra-host genetic diversity was explored in 22 clinical samples, revealing 107 positions carrying minor variants (ranging from 0 to 15 variants per sample). Our analysis of EV-A71 strains sampled in 2013 showed that they all belonged to subgenogroup B5, representing the first report of this subgenogroup in Vietnam. In conclusion, we have successfully developed a high-throughput next-generation sequencing-based assay for whole-genome sequencing of EV-A71 from clinical samples. PMID:25704598

  8. Using Whole Genome Analysis to Examine Recombination across Diverse Sequence Types of Staphylococcus aureus

    PubMed Central

    Driebe, Elizabeth M.; Sahl, Jason W.; Roe, Chandler; Bowers, Jolene R.; Schupp, James M.; Gillece, John D.; Kelley, Erin; Price, Lance B.; Pearson, Talima R.; Hepp, Crystal M.; Brzoska, Pius M.; Cummings, Craig A.; Furtado, Manohar R.; Andersen, Paal S.; Stegger, Marc; Engelthaler, David M.; Keim, Paul S.

    2015-01-01

    Staphylococcus aureus is an important clinical pathogen worldwide and understanding this organism's phylogeny and, in particular, the role of recombination, is important both to understand the overall spread of virulent lineages and to characterize outbreaks. To further elucidate the phylogeny of S. aureus, 35 diverse strains were sequenced using whole genome sequencing. In addition, 29 publicly available whole genome sequences were included to create a single nucleotide polymorphism (SNP)-based phylogenetic tree encompassing 11 distinct lineages. All strains of a particular sequence type fell into the same clade with clear groupings of the major clonal complexes of CC8, CC5, CC30, CC45 and CC1. Using a novel analysis method, we plotted the homoplasy density and SNP density across the whole genome and found evidence of recombination throughout the entire chromosome, but when we examined individual clonal lineages we found very little recombination. However, when we analyzed three branches of multiple lineages, we saw intermediate and differing levels of recombination between them. These data demonstrate that in S. aureus, recombination occurs across major lineages that subsequently expand in a clonal manner. Estimated mutation rates for the CC8 and CC5 lineages were different from each other. While the CC8 lineage rate was similar to previous studies, the CC5 lineage was 100-fold greater. Fifty known virulence genes were screened in all genomes in silico to determine their distribution across major clades. Thirty-three genes were present variably across clades, most of which were not constrained by ancestry, indicating horizontal gene transfer or gene loss. PMID:26161978

  9. A generic assay for whole-genome amplification and deep sequencing of enterovirus A71.

    PubMed

    Tan, Le Van; Tuyen, Nguyen Thi Kim; Thanh, Tran Tan; Ngan, Tran Thuy; Van, Hoang Minh Tu; Sabanathan, Saraswathy; Van, Tran Thi My; Thanh, Le Thi My; Nguyet, Lam Anh; Geoghegan, Jemma L; Ong, Kien Chai; Perera, David; Hang, Vu Thi Ty; Ny, Nguyen Thi Han; Anh, Nguyen To; Ha, Do Quang; Qui, Phan Tu; Viet, Do Chau; Tuan, Ha Manh; Wong, Kum Thong; Holmes, Edward C; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H Rogier

    2015-04-01

    Enterovirus A71 (EV-A71) has emerged as the most important cause of large outbreaks of severe and sometimes fatal hand, foot and mouth disease (HFMD) across the Asia-Pacific region. EV-A71 outbreaks have been associated with (sub)genogroup switches, sometimes accompanied by recombination events. Understanding EV-A71 population dynamics is therefore essential for understanding this emerging infection, and may provide pivotal information for vaccine development. Despite the public health burden of EV-A71, relatively few EV-A71 complete-genome sequences are available for analysis and from limited geographical localities. The availability of an efficient procedure for whole-genome sequencing would stimulate effort to generate more viral sequence data. Herein, we report for the first time the development of a next-generation sequencing based protocol for whole-genome sequencing of EV-A71 directly from clinical specimens. We were able to sequence viruses of subgenogroup C4 and B5, while RNA from culture materials of diverse EV-A71 subgenogroups belonging to both genogroup B and C was successfully amplified. The nature of intra-host genetic diversity was explored in 22 clinical samples, revealing 107 positions carrying minor variants (ranging from 0 to 15 variants per sample). Our analysis of EV-A71 strains sampled in 2013 showed that they all belonged to subgenogroup B5, representing the first report of this subgenogroup in Vietnam. In conclusion, we have successfully developed a high-throughput next-generation sequencing-based assay for whole-genome sequencing of EV-A71 from clinical samples. PMID:25704598

  10. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    SciTech Connect

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; Fahrenbach, John P.; Nelakuditi, Viswateja; Pesce, Lorenzo L.; Pytel, Peter; McNally, Elizabeth M.

    2014-09-01

    Background—Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results—Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  11. Whole-genome mapping reveals a large chromosomal inversion on Iberian Brucella suis biovar 2 strains.

    PubMed

    Ferreira, Ana Cristina; Dias, Ricardo; de Sá, Maria Inácia Corrêa; Tenreiro, Rogério

    2016-08-30

    Optical mapping is a technology able to quickly generate high-resolution ordered whole-genome restriction maps of bacteria, and is a proven approach to search for diversity among bacterial isolates. In this work, optical whole-genome maps were used to compare closely related Brucella suis biovar 2 strains. This biovar is the only one isolated from domestic pigs and wild boars in Portugal and Spain, and most of the strains share specific molecular characteristics establishing an Iberian clonal lineage that can be differentiated from another lineage mainly isolated in several Central European countries. We generated BamHI whole-genome optical maps of five B. suis biovar 2 field strains isolated from wild boars in Portugal and Spain (three from the Iberian lineage and two from the Central European one), as well as of the reference strain B. suis biovar 2 ATCC 23445 (Central European lineage, Denmark). Each strain showed a distinct, highly individual configuration of 228-231 BamHI fragments. Nevertheless, lower overall divergence was observed in chromosome II (1.6%) than in chromosome I (2.4%). Optical mapping also disclosed genomic events associated with B. suis strains in chromosome I, namely one indel (3.5 kb) and one large inversion (944 kb). By using targeted PCR on a set of 176 B. suis strains, including all biovars and haplotypes, the indel was found to be specific to the reference strain ATCC 23445 and the large inversion was shown to be an exclusive genomic marker of the Iberian clonal lineage of biovar 2.

  12. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    PubMed Central

    Dellefave-Castillo, Lisa; Fahrenbach, John P; Nelakuditi, Viswateja; Pesce, Lorenzo L; Pytel, Peter; McNally, Elizabeth M

    2014-01-01

    Background Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. Conclusions These pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes. PMID:25179549

  13. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    DOE PAGES Beta

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; Fahrenbach, John P.; Nelakuditi, Viswateja; Pesce, Lorenzo L.; Pytel, Peter; McNally, Elizabeth M.

    2014-09-01

    Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  14. A generic assay for whole-genome amplification and deep sequencing of enterovirus A71.

    PubMed

    Tan, Le Van; Tuyen, Nguyen Thi Kim; Thanh, Tran Tan; Ngan, Tran Thuy; Van, Hoang Minh Tu; Sabanathan, Saraswathy; Van, Tran Thi My; Thanh, Le Thi My; Nguyet, Lam Anh; Geoghegan, Jemma L; Ong, Kien Chai; Perera, David; Hang, Vu Thi Ty; Ny, Nguyen Thi Han; Anh, Nguyen To; Ha, Do Quang; Qui, Phan Tu; Viet, Do Chau; Tuan, Ha Manh; Wong, Kum Thong; Holmes, Edward C; Chau, Nguyen Van Vinh; Thwaites, Guy; van Doorn, H Rogier

    2015-04-01

    Enterovirus A71 (EV-A71) has emerged as the most important cause of large outbreaks of severe and sometimes fatal hand, foot and mouth disease (HFMD) across the Asia-Pacific region. EV-A71 outbreaks have been associated with (sub)genogroup switches, sometimes accompanied by recombination events. Understanding EV-A71 population dynamics is therefore essential for understanding this emerging infection, and may provide pivotal information for vaccine development. Despite the public health burden of EV-A71, relatively few EV-A71 complete-genome sequences are available for analysis and from limited geographical localities. The availability of an efficient procedure for whole-genome sequencing would stimulate effort to generate more viral sequence data. Herein, we report for the first time the development of a next-generation sequencing based protocol for whole-genome sequencing of EV-A71 directly from clinical specimens. We were able to sequence viruses of subgenogroup C4 and B5, while RNA from culture materials of diverse EV-A71 subgenogroups belonging to both genogroup B and C was successfully amplified. The nature of intra-host genetic diversity was explored in 22 clinical samples, revealing 107 positions carrying minor variants (ranging from 0 to 15 variants per sample). Our analysis of EV-A71 strains sampled in 2013 showed that they all belonged to subgenogroup B5, representing the first report of this subgenogroup in Vietnam. In conclusion, we have successfully developed a high-throughput next-generation sequencing-based assay for whole-genome sequencing of EV-A71 from clinical samples.

  15. Identification by whole-genome resequencing of gene defect responsible for severe hypercholesterolemia.

    PubMed

    Rios, Jonathan; Stein, Evan; Shendure, Jay; Hobbs, Helen H; Cohen, Jonathan C

    2010-11-15

    Whole-genome sequencing is a potentially powerful tool for the diagnosis of genetic diseases. Here, we used sequencing-by-ligation to sequence the genome of an 11-month-old breast-fed girl with xanthomas and very high plasma cholesterol levels (1023 mg/dl). Her parents had normal plasma cholesterol levels and reported no family history of hypercholesterolemia, suggesting either an autosomal recessive disorder or a de novo mutation. Known genetic causes of severe hypercholesterolemia were ruled out by sequencing the responsible genes (LDLRAP, LDLR, PCSK9, APOE and APOB), and sitosterolemia was ruled out by documenting a normal plasma sitosterol:cholesterol ratio. Sequencing revealed 3 797 207 deviations from the reference sequence, of which 9726 were nonsynonymous single-nucleotide substitutions. A total of 9027 of the nonsynonymous substitutions were present in dbSNP or in 21 additional individuals from whom complete exonic sequences were available. The 699 novel nonsynonymous substitutions were distributed among 604 genes, 23 of which were single-copy genes that each contained 2 nonsynonymous substitutions consistent with an autosomal recessive model. One gene, ABCG5, had two nonsense mutations (Q16X and R446X). This finding indicated that the infant has sitosterolemia. Thus, whole-genome sequencing led to the diagnosis of a known disease with an atypical presentation. Diagnosis was confirmed by the finding of severe sitosterolemia in a blood sample obtained after the infant had been weaned. These findings demonstrate that whole-genome (or exome) sequencing can be a valuable aid to diagnose genetic diseases, even in individual patients. PMID:20719861
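    The filtering steps described in this abstract (keep nonsynonymous substitutions, discard those already catalogued, then look for genes with two remaining hits, as expected under an autosomal recessive model) can be sketched roughly as follows. This is a minimal illustration on toy data, not the authors' pipeline: the function name `recessive_candidates`, the variant identifiers, and the gene names other than ABCG5 are hypothetical, and the single-copy-gene restriction mentioned in the abstract is omitted for brevity.

```python
from collections import defaultdict


def recessive_candidates(variants, known_ids):
    """Return genes carrying at least two novel nonsynonymous variants.

    variants:  iterable of (variant_id, gene, is_nonsynonymous) tuples.
    known_ids: set of variant ids already present in reference catalogues.
    """
    per_gene = defaultdict(list)
    for variant_id, gene, is_nonsynonymous in variants:
        # Keep only nonsynonymous variants that are not already catalogued.
        if is_nonsynonymous and variant_id not in known_ids:
            per_gene[gene].append(variant_id)
    # A recessive model needs two hits in the same gene.
    return {gene: ids for gene, ids in per_gene.items() if len(ids) >= 2}


# Toy input (hypothetical identifiers; v1 and v2 stand in for Q16X and R446X).
toy_variants = [
    ("v1", "ABCG5", True),
    ("v2", "ABCG5", True),
    ("v3", "GENE_A", True),   # already catalogued, filtered out
    ("v4", "GENE_B", True),   # novel, but only one hit in this gene
    ("v5", "GENE_B", False),  # synonymous, ignored
]
toy_known = {"v3"}

candidates = recessive_candidates(toy_variants, toy_known)
print(candidates)

# Counts reported in the abstract: 9726 nonsynonymous, 9027 already known.
novel_count = 9726 - 9027
print(novel_count)
```

    On the toy input only ABCG5 survives the filter, and the subtraction reproduces the 699 novel nonsynonymous substitutions reported in the abstract.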

  16. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    PubMed Central

    Huang, August Y; Xu, Xiaojing; Ye, Adam Y; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Zhao, Han-Qing; Wang, Meng; Gao, Hua; Gao, Ge; Zhang, Zhichao; Yang, Xiaoling; Wu, Xiru; Zhang, Yuehua; Wei, Liping

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of effective error filters, using which we were able to identify 17 SNM sites from ∼80× whole-genome sequencing of peripheral blood DNAs from three clinically unremarkable adults. The pSNMs were thoroughly validated using pyrosequencing, Sanger sequencing of individual cloned fragments, and multiplex ligation-dependent probe amplification. The mutant allele fraction ranged from 5%-31%. We found that C→T and C→A were the predominant types of postzygotic mutations, similar to the somatic mutation profile in tumor tissues. Simulation data showed that the overall mutation rate was an order of magnitude lower than that in cancer. We detected varied allele fractions of the pSNMs among multiple samples obtained from the same individuals, including blood, saliva, hair follicle, buccal mucosa, urine, and semen samples, indicating that pSNMs could affect multiple sources of somatic cells as well as germ cells. Two of the adults have children who were diagnosed with Dravet syndrome. We identified two non-synonymous pSNMs in SCN1A, a causal gene for Dravet syndrome, from these two unrelated adults and found that the mutant alleles were transmitted to their children, highlighting the clinical importance of detecting pSNMs in genetic counseling. PMID:25312340

  17. Whole-genome mapping reveals a large chromosomal inversion on Iberian Brucella suis biovar 2 strains.

    PubMed

    Ferreira, Ana Cristina; Dias, Ricardo; de Sá, Maria Inácia Corrêa; Tenreiro, Rogério

    2016-08-30

    Optical mapping is a technology able to quickly generate high-resolution ordered whole-genome restriction maps of bacteria, and is a proven approach to search for diversity among bacterial isolates. In this work, optical whole-genome maps were used to compare closely related Brucella suis biovar 2 strains. This biovar is the only one isolated from domestic pigs and wild boars in Portugal and Spain, and most of the strains share specific molecular characteristics establishing an Iberian clonal lineage that can be differentiated from another lineage mainly isolated in several Central European countries. We generated BamHI whole-genome optical maps of five B. suis biovar 2 field strains isolated from wild boars in Portugal and Spain (three from the Iberian lineage and two from the Central European one), as well as of the reference strain B. suis biovar 2 ATCC 23445 (Central European lineage, Denmark). Each strain showed a distinct, highly individual configuration of 228-231 BamHI fragments. Nevertheless, lower overall divergence was observed in chromosome II (1.6%) than in chromosome I (2.4%). Optical mapping also disclosed genomic events associated with B. suis strains in chromosome I, namely one indel (3.5 kb) and one large inversion (944 kb). By using targeted PCR on a set of 176 B. suis strains, including all biovars and haplotypes, the indel was found to be specific to the reference strain ATCC 23445 and the large inversion was shown to be an exclusive genomic marker of the Iberian clonal lineage of biovar 2. PMID:27527786

  18. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7, which might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  19. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, the characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7, which might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  20. Increase of ethanol tolerance of Saccharomyces cerevisiae by error-prone whole genome amplification.

    PubMed

    Luhe, Annette Lin; Tan, Lily; Wu, Jinchuan; Zhao, Hua

    2011-05-01

    Saccharomyces cerevisiae was transformed for higher ethanol tolerance by error-prone whole genome amplification. The resulting PCR products were transformed back into the parental strain for homologous recombination to create a library of mutants with perturbed genomic networks. A few rounds of transformation led to the isolation of mutants that grew in 9% (v/v) ethanol and 100 g/l glucose, compared to untransformed yeast, which grew only at 6% (v/v) ethanol and 100 g/l glucose.

  1. Sequence determination from overlapping fragments: a simple model of whole-genome shotgun sequencing.

    PubMed

    Derrida, Bernard; Fink, Thomas M A

    2002-02-11

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments. PMID:11863859
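    The constant-overlap case mentioned in this abstract can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the authors' model: fragments are cut at regular intervals rather than sampled at random, reassembly simply chains each fragment's suffix to the unique fragment with the matching prefix, and the names `make_fragments`, `assemble`, and `recovery_rate` are hypothetical.

```python
import random


def make_fragments(seq, frag_len, overlap):
    """Cut seq into fragments in which neighbours share `overlap` letters."""
    step = frag_len - overlap
    return [seq[i:i + frag_len] for i in range(0, len(seq) - overlap, step)]


def assemble(fragments, overlap):
    """Chain fragments by suffix/prefix match; return None if ambiguous."""
    by_prefix = {}
    for frag in fragments:
        prefix = frag[:overlap]
        if prefix in by_prefix:
            return None  # two fragments start the same way: ambiguous
        by_prefix[prefix] = frag
    suffixes = {frag[-overlap:] for frag in fragments}
    # The first fragment is the only one whose prefix is nobody's suffix.
    starts = [frag for frag in fragments if frag[:overlap] not in suffixes]
    if len(starts) != 1:
        return None
    current = starts[0]
    pieces = [current]
    while len(pieces) < len(fragments):
        nxt = by_prefix.get(current[-overlap:])
        if nxt is None:
            return None
        pieces.append(nxt[overlap:])  # skip the shared overlap
        current = nxt
    return "".join(pieces)


def recovery_rate(alphabet, seq_len, frag_len, overlap, trials, rng):
    """Monte Carlo estimate of how often a random sequence is recovered."""
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice(alphabet) for _ in range(seq_len))
        frags = make_fragments(seq, frag_len, overlap)
        rng.shuffle(frags)
        if assemble(frags, overlap) == seq:
            hits += 1
    return hits / trials


rng = random.Random(0)
for alphabet in ("A", "ACGT", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    print(len(alphabet), recovery_rate(alphabet, 20, 8, 2, 200, rng))
```

    With a one-letter alphabet every overlap is identical, so recovery always fails; as the alphabet grows, repeated overlaps become rarer and the estimated recovery rate rises, which is consistent with the infinite-alphabet limit being the tractable case.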

  2. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango.

    PubMed

    Rakhashiya, Purvi M; Patel, Pooja P; Thaker, Vrinda S

    2015-12-01

    Micrococcus luteus SUBG006, a member of the Actinobacteria, was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with a G + C content of 69.80%, and the genome contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000. PMID:26697318

  3. On-site manipulation of single whole-genome DNA molecules using optical tweezers

    NASA Astrophysics Data System (ADS)

    Oana, Hidehiro; Kubo, Koji; Yoshikawa, Kenichi; Atomi, Haruyuki; Imanaka, Tadayuki

    2004-11-01

    In this letter, we describe a noninvasive methodology for manipulating single Mb-size whole-genome DNA molecules. Cells were subjected to osmotic shock and the genome DNA released from the burst cells was transferred to a region of higher salt concentration using optical tweezers. The transferred genome DNA exhibits a conformational transition from a compact state into an elongated state, accompanied by the change in its environment. The applicability of optical tweezers to the on-site manipulation of giant genome DNA is suggested, i.e., lab-on-a-plate.

  4. Whole genome sequencing provides an unambiguous link between Salmonella Dublin outbreak strain and a historical isolate.

    PubMed

    Mohammed, M; Delappe, N; O'Connor, J; McKeown, P; Garvey, P; Cormican, M

    2016-02-01

    Salmonella enterica subsp. enterica serovar Dublin is an uncommon cause of human salmonellosis; however, a relatively high proportion of cases are associated with invasive disease. The serotype is associated with cattle. A geographically diffuse outbreak of S. Dublin involving nine patients occurred in Ireland in 2013. The source of infection was not identified. Typing of outbreak associated isolates by pulsed-field gel electrophoresis (PFGE) was of limited value because PFGE has limited discriminatory power for S. Dublin. Whole genome sequencing (WGS) showed conclusively that the isolates were closely related to each other, to an apparently unrelated isolate from 2011 and distinct from other isolates that were not readily distinguishable by PFGE.

  5. Elucidating the phylodynamics of endemic rabies virus in eastern Africa using whole-genome sequencing

    PubMed Central

    Brunker, Kirstyn; Marston, Denise A; Horton, Daniel L; Cleaveland, Sarah; Fooks, Anthony R; Kazwala, Rudovick; Ngeleja, Chanasa; Lembo, Tiziana; Sambo, Maganga; Mtema, Zacharia J; Sikana, Lwitiko; Wilkie, Gavin; Biek, Roman; Hampson, Katie

    2015-01-01

    Many of the pathogens perceived to pose the greatest risk to humans are viral zoonoses, responsible for a range of emerging and endemic infectious diseases. Phylogeography is a useful tool to understand the processes that give rise to spatial patterns and drive dynamics in virus populations. Increasingly, whole-genome information is being used to uncover these patterns, but the limits of phylogenetic resolution that can be achieved with this are unclear. Here, whole-genome variation was used to uncover fine-scale population structure in endemic canine rabies virus circulating in Tanzania. This is the first whole-genome population study of rabies virus and the first comprehensive phylogenetic analysis of rabies virus in East Africa, providing important insights into rabies transmission in an endemic system. In addition, sub-continental scale patterns of population structure were identified using partial gene data and used to determine population structure at larger spatial scales in Africa. While rabies virus has a defined spatial structure at large scales, increasingly frequent levels of admixture were observed at regional and local levels. Discrete phylogeographic analysis revealed long-distance dispersal within Tanzania, which could be attributed to human-mediated movement, and we found evidence of multiple persistent, co-circulating lineages at a very local scale in a single district, despite on-going mass dog vaccination campaigns. This may reflect the wider endemic circulation of these lineages over several decades alongside increased admixture due to human-mediated introductions. These data indicate that successful rabies control in Tanzania could be established at a national level, since most dispersal appears to be restricted within the confines of country borders but some coordination with neighbouring countries may be required to limit transboundary movements. 
    Evidence of complex patterns of rabies circulation within Tanzania necessitates the use of whole-genome ...

  6. Reflections on the cost of "low-cost" whole genome sequencing: framing the health policy debate.

    PubMed

    Caulfield, Timothy; Evans, Jim; McGuire, Amy; McCabe, Christopher; Bubela, Tania; Cook-Deegan, Robert; Fishman, Jennifer; Hogarth, Stuart; Miller, Fiona A; Ravitsky, Vardit; Biesecker, Barbara; Borry, Pascal; Cho, Mildred K; Carroll, June C; Etchegary, Holly; Joly, Yann; Kato, Kazuto; Lee, Sandra Soo-Jin; Rothenberg, Karen; Sankar, Pamela; Szego, Michael J; Ossorio, Pilar; Pullman, Daryl; Rousseau, Francois; Ungar, Wendy J; Wilson, Brenda

    2013-11-01

    The cost of whole genome sequencing is dropping rapidly. There has been a great deal of enthusiasm about the potential for this technological advance to transform clinical care. Given the interest and significant investment in genomics, this seems an ideal time to consider what the evidence tells us about potential benefits and harms, particularly in the context of health care policy. The scale and pace of adoption of this powerful new technology should be driven by clinical need, clinical evidence, and a commitment to put patients at the centre of health care policy.

  7. Return of genetic testing results in the era of whole-genome sequencing.

    PubMed

    Knoppers, Bartha Maria; Zawati, Ma'n H; Sénécal, Karine

    2015-09-01

    Genetic testing based on whole-genome sequencing (WGS) often returns results that are not directly clinically actionable as well as raising the possibility of incidental (secondary) findings. In this article, we first survey the laws and policies guiding both researchers and clinicians in the return of results for WGS-based genetic testing. We then provide an overview of the landscape of international legislation and policies for return of these results, including considerations for return of incidental findings. Finally, we consider a range of approaches for the return of results.

  8. When aging meets microgravity: whole genome promoters and enhancers transcription landscape in zebrafish onboard ISS

    NASA Astrophysics Data System (ADS)

    Arshanovskii, Kirill; Gusev, Oleg; Sychev, Vladimir; Poddubko, Svetlana; Deviatiiarov, Ruslan

    2016-07-01

    To gain new insights into changes in gene regulation under real spaceflight conditions, we conducted a whole-genome analysis of the dynamics of promoter and enhancer transcriptional changes in zebrafish during prolonged exposure to real spaceflight. Within the framework of the joint Russia-Japan experiments "Aquatic Habitat"-"Aquarium", we performed a Cap Analysis of Gene Expression (CAGE) assay of zebrafish over a range of 7 to 40 days of real spaceflight onboard the ISS. The analysis showed that both gene expression patterns and the architecture of promoter shapes and types are affected by the spaceflight environment.

  9. Identification of low abundance microbiome in clinical samples using whole genome sequencing.

    PubMed

    Zhang, Chao; Cleveland, Kyle; Schnoll-Sussman, Felice; McClure, Bridget; Bigg, Michelle; Thakkar, Prashant; Schultz, Nikolaus; Shah, Manish A; Betel, Doron

    2015-01-01

    Identifying the microbiome composition from primary tissues directly affords an opportunity to study the causative relationships between the host microbiome and disease. However, this is challenging due to the low abundance of microbial DNA relative to the host. We present a systematic evaluation of microbiome profiling directly from endoscopic biopsies by whole genome sequencing. We compared our methods with other approaches on datasets with previously identified microbial composition. We applied this approach to identify the microbiome from 27 stomach biopsies, and validated the presence of Helicobacter pylori by quantitative PCR. Finally, we profiled the microbial composition in The Cancer Genome Atlas gastric adenocarcinoma cohort. PMID:26614063

  10. A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome.

    PubMed

    Chapman, Jarrod A; Mascher, Martin; Buluç, Aydın; Barry, Kerrie; Georganas, Evangelos; Session, Adam; Strnadova, Veronika; Jenkins, Jerry; Sehgal, Sunish; Oliker, Leonid; Schmutz, Jeremy; Yelick, Katherine A; Scholz, Uwe; Waugh, Robbie; Poland, Jesse A; Muehlbauer, Gary J; Stein, Nils; Rokhsar, Daniel S

    2015-01-31

    Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gbp of this assembly to chromosomal locations. The genome representation and accuracy of our assembly are comparable to, or even exceed, those of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.

  11. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    PubMed

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L).

  12. A green-cotyledon/stay-green mutant exemplifies the ancient whole-genome duplications in soybean.

    PubMed

    Nakano, Michiharu; Yamada, Tetsuya; Masuda, Yu; Sato, Yutaka; Kobayashi, Hideki; Ueda, Hiroaki; Morita, Ryouhei; Nishimura, Minoru; Kitamura, Keisuke; Kusaba, Makoto

    2014-10-01

    The recent whole-genome sequencing of soybean (Glycine max) revealed that soybean experienced whole-genome duplications 59 million and 13 million years ago, and it has an octoploid-like genome in spite of its diploid nature. We analyzed a natural green-cotyledon mutant line, Tenshin-daiseitou. The physiological analysis revealed that Tenshin-daiseitou shows a non-functional stay-green phenotype in senescent leaves, which is similar to that of the mutant of Mendel's green-cotyledon gene I, the ortholog of SGR in pea. The identification of gene mutations and genetic segregation analysis suggested that defects in GmSGR1 and GmSGR2 were responsible for the green-cotyledon/stay-green phenotype of Tenshin-daiseitou, which was confirmed by RNA interference (RNAi) transgenic soybean experiments using GmSGR genes. The characterized green-cotyledon double mutant d1d2 was found to have the same mutations, suggesting that GmSGR1 and GmSGR2 are D1 and D2. Among the examined d1d2 strains, the d1d2 strain K144a showed a lower Chl a/b ratio in mature seeds than other strains but not in senescent leaves, suggesting a seed-specific genetic factor of the Chl composition in K144a. Analysis of the soybean genome sequence revealed four genomic regions with microsynteny to the Arabidopsis SGR1 region, which included the GmSGR1 and GmSGR2 regions. The other two regions contained GmSGR3a/GmSGR3b and GmSGR4, respectively, which might be pseudogenes or genes with a function that is unrelated to Chl degradation during seed maturation and leaf senescence. These GmSGR genes were thought to be produced by the two whole-genome duplications, and they provide a good example of such whole-genome duplication events in the evolution of the soybean genome.

  13. Genetic linkage analysis in the age of whole-genome sequencing.

    PubMed

    Ott, Jurg; Wang, Jing; Leal, Suzanne M

    2015-05-01

    For many years, linkage analysis was the primary tool used for the genetic mapping of Mendelian and complex traits with familial aggregation. Linkage analysis was largely supplanted by the wide adoption of genome-wide association studies (GWASs). However, with the recent increased use of whole-genome sequencing (WGS), linkage analysis is again emerging as an important and powerful analysis method for the identification of genes involved in disease aetiology, often in conjunction with WGS filtering approaches. Here, we review the principles of linkage analysis and provide practical guidelines for carrying out linkage studies using WGS data. PMID:25824869

  14. Whole genome shotgun sequence of Bacillus amyloliquefaciens TF28, a biocontrol endophytic bacterium.

    PubMed

    Zhang, Shumei; Jiang, Wei; Li, Jing; Meng, Liqiang; Cao, Xu; Hu, Jihua; Liu, Yushuai; Chen, Jingyu; Sha, Changqing

    2016-01-01

    Bacillus amyloliquefaciens TF28 is a biocontrol endophytic bacterium that is capable of inhibiting a broad range of plant pathogenic fungi. The strain has the potential to be developed into a biocontrol agent for use in agriculture. Here we report the whole-genome shotgun sequence of the strain. The genome size of B. amyloliquefaciens TF28 is 3,987,635 bp, which consists of 3754 protein-coding genes, 65 tandem repeat sequences, 47 minisatellite DNA, 2 microsatellite DNA, 63 tRNA, 7 rRNA, 6 sRNA, 3 prophage and CRISPR domains. PMID:27688836

  15. Contact investigations for outbreaks of Mycobacterium tuberculosis: advances through whole genome sequencing.

    PubMed

    Walker, T M; Monk, P; Smith, E Grace; Peto, T E A

    2013-09-01

    The control of tuberculosis depends on the identification and treatment of infectious patients and their contacts, who are currently identified through a combined approach of genotyping and epidemiological investigation. However, epidemiological data are often challenging to obtain, and genotyping data are difficult to interpret without them. Whole genome sequencing (WGS) technology is increasingly affordable, and offers the prospect of identifying plausible transmission events between patients without prior recourse to epidemiological data. We discuss the current approaches to tuberculosis control, and how WGS might advance public health efforts in the future. PMID:23432709

  16. Identification of emergent blaCMY-2-carrying Proteus mirabilis lineages by whole-genome sequencing

    PubMed Central

    Mac Aogáin, M.; Rogers, T.R.; Crowley, B.

    2015-01-01

    Whole-genome sequencing of 24 Proteus mirabilis isolates revealed the clonal expansion of two cefoxitin-resistant strains among patients with community-onset infection. These strains harboured blaCMY-2 within a chromosomally located integrative and conjugative element and exhibited multidrug resistance phenotypes. A predominant strain, identified in 18 patients, also harboured the PGI-1 genomic island and associated resistance genes, accounting for its broader antibiotic resistance profile. The identification of these novel multidrug-resistant strains among community-onset infections suggests that they are endemic to this region and represent emergent P. mirabilis lineages of clinical significance. PMID:26865983

  17. Molecular characterization of avian polyomavirus isolated from psittacine birds based on the whole genome sequence analysis.

    PubMed

    Katoh, Hiroshi; Ohya, Kenji; Une, Yumi; Yamaguchi, Tsuyoshi; Fukushi, Hideto

    2009-07-01

    Seven avian polyomaviruses (APVs) were isolated from seven psittacine birds of four species. Their whole genome sequences were genetically analyzed. Compared with the sequence of the BFDV1 strain, nucleotide substitutions in the sequences of the seven APV isolates were found at 63 loci, and a high level of conservation of amino acid sequence in each viral protein (VP1, VP2, VP3, VP4, and t/T antigen) was predicted. An A-to-T nucleotide substitution was observed in non-control region of all seven APV sequences in comparison with the BFDV1 strain. Two C-to-T nucleotide substitutions were also detected in non-coding regions of one isolate. A phylogenetic analysis of the whole genome sequences indicated that the sequences from the same species of bird were closely related. APV has been reported to have distinct tropism for cell cultures of various avian species. The present study indicated that a single amino acid substitution at position 221 in VP2 was essential for propagating in chicken embryonic fibroblast culture and this substitution was promoted by propagation on budgerigar embryonic fibroblast culture. For two isolates, three serial amino acids appeared to be deleted in VP4. However, this deletion had little effect on virus propagation.

  18. Whole-Genome Sequencing Analysis of Sapovirus Detected in South Korea.

    PubMed

    Choi, Hye Lim; Suh, Chang-Il; Park, Seung-Won; Jin, Ji-Young; Cho, Han-Gil; Paik, Soon-Young

    2015-01-01

    Sapovirus (SaV), a virus residing in the intestines, is one of the important causes of gastroenteritis in human beings. Human SaV genomes are classified into various genogroups and genotypes. Whole-genome analysis and phylogenetic analysis of ROK62, the SaV isolated in South Korea, were carried out. The ROK62 genome of 7429 nucleotides contains 3 open-reading frames (ORF). The genotype of ROK62 is SaV GI-1, and 94% of its nucleotide sequence is identical with other SaVs, namely Manchester and Mc114. Recently, SaV infection has been on the rise throughout the world, particularly in countries neighboring South Korea; however, very few academic studies have been done nationally. As the first whole-genome sequence analysis of SaV in South Korea, this research will help provide reference for the detection of recombination, tracking of epidemic spread, and development of diagnosis methods for SaV.

  19. Multiplex Degenerate Primer Design for Targeted Whole Genome Amplification of Many Viral Genomes

    DOE PAGESBeta

    Gardner, Shea N.; Jaing, Crystal J.; Elsheikh, Maher M.; Peña, José; Hysom, David A.; Borucki, Monica K.

    2014-01-01

    Background. Targeted enrichment improves coverage of highly mutable viruses at low concentration in complex samples. Degenerate primers that anneal to conserved regions can facilitate amplification of divergent, low concentration variants, even when the strain present is unknown. Results. A tool for designing multiplex sets of degenerate sequencing primers to tile overlapping amplicons across multiple whole genomes is described. The new script, run_tiled_primers, is part of the PriMux software. Primers were designed for each segment of South American hemorrhagic fever viruses, tick-borne encephalitis, Henipaviruses, Arenaviruses, Filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus, and Japanese encephalitis virus. Each group is highly diverse with as little as 5% genome consensus. Primer sets were computationally checked for nontarget cross reactions against the NCBI nucleotide sequence database. Primers for murine hepatitis virus were demonstrated in the lab to specifically amplify selected genes from a laboratory cultured strain that had undergone extensive passage in vitro and in vivo. Conclusions. This software should help researchers design multiplex sets of primers for targeted whole genome enrichment prior to sequencing to obtain better coverage of low titer, divergent viruses. Applications include viral discovery from a complex background and improved sensitivity and coverage of rapidly evolving strains or variants in a gene family.
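
    To illustrate what "degenerate" means for the primers in this record, the sketch below expands a primer written with standard IUPAC ambiguity codes into the concrete sequences it stands for and counts them. It is a minimal illustration only and is not the PriMux or run_tiled_primers algorithm; the function names `degeneracy` and `expand` and the example primer strings are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence covered by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primer = "ACRTY"  # hypothetical primer: R = A/G, Y = C/T
    print(degeneracy(primer), expand(primer))
```

    Higher degeneracy lets one primer anneal to more divergent variants, at the cost of diluting each individual sequence in the reaction, which is one reason primer design tools typically cap it.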

  20. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    SciTech Connect

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  1. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics

    PubMed Central

    Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K.; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-01-01

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C·G > T·A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  2. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    PubMed

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C·G > T·A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes.

  3. Genetic Mapping of Millions of SNPs in Safflower (Carthamus tinctorius L.) via Whole-Genome Resequencing

    PubMed Central

    Bowers, John E.; Pearl, Stephanie A.; Burke, John M.

    2016-01-01

    Accurate assembly of complete genomes is facilitated by very high density genetic maps. We performed low-coverage, whole-genome shotgun sequencing on 96 F6 recombinant inbred lines (RILs) of a cross between safflower (Carthamus tinctorius L.) and its wild progenitor (C. palaestinus Eig). We also produced a draft genome assembly of C. tinctorius covering 866 million bp (∼two-thirds) of the expected 1.35 Gbp genome after sequencing a single, short insert library to ∼21 × depth. Sequence reads from the RILs were mapped to this genome assembly to facilitate SNP identification, and the resulting polymorphisms were used to construct a genetic map. The resulting map included 2,008,196 genetically located SNPs in 1178 unique positions. A total of 57,270 scaffolds, each containing five or more mapped SNPs, were anchored to the map. This resulted in the assignment of sequence covering 14% of the expected genome length to a genetic position. Comparison of this safflower map to genetic maps of sunflower and lettuce revealed numerous chromosomal rearrangements, and the resulting patterns were consistent with a whole-genome duplication event in the lineage leading to sunflower. This sequence-based genetic map provides a powerful tool for the assembly of a low-cost draft genome of safflower, and the same general approach is expected to work for other species. PMID:27226165

  4. Kernel-based whole-genome prediction of complex traits: a review

    PubMed Central

    Morota, Gota; Gianola, Daniel

    2014-01-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics. PMID:25360145

  5. Systematic evaluation of bias in microbial community profiles induced by whole genome amplification.

    PubMed

    Direito, Susana O L; Zaura, Egija; Little, Miranda; Ehrenfreund, Pascale; Röling, Wilfred F M

    2014-03-01

    Whole genome amplification methods facilitate the detection and characterization of microbial communities in low biomass environments. We examined the extent to which the actual community structure is reliably revealed and factors contributing to bias. One widely used [multiple displacement amplification (MDA)] and one new primer-free method [primase-based whole genome amplification (pWGA)] were compared using a polymerase chain reaction (PCR)-based method as control. Pyrosequencing of an environmental sample and principal component analysis revealed that MDA impacted community profiles more strongly than pWGA and indicated that this related to species GC content, although an influence of DNA integrity could not be excluded. Subsequently, biases by species GC content, DNA integrity and fragment size were separately analysed using defined mixtures of DNA from various species. We found significantly less amplification of species with the highest GC content for MDA-based templates and, to a lesser extent, for pWGA. DNA fragmentation also interfered severely: species with more fragmented DNA were less amplified with MDA and pWGA. pWGA was unable to amplify low molecular weight DNA (< 1.5 kb), whereas MDA was inefficient. We conclude that pWGA is the most promising method for characterization of microbial communities in low-biomass environments and for currently planned astrobiological missions to Mars.

  6. Genetic Mapping of Millions of SNPs in Safflower (Carthamus tinctorius L.) via Whole-Genome Resequencing.

    PubMed

    Bowers, John E; Pearl, Stephanie A; Burke, John M

    2016-07-07

    Accurate assembly of complete genomes is facilitated by very high density genetic maps. We performed low-coverage, whole-genome shotgun sequencing on 96 F6 recombinant inbred lines (RILs) of a cross between safflower (Carthamus tinctorius L.) and its wild progenitor (C. palaestinus Eig). We also produced a draft genome assembly of C. tinctorius covering 866 million bp (∼two-thirds) of the expected 1.35 Gbp genome after sequencing a single, short insert library to ∼21 × depth. Sequence reads from the RILs were mapped to this genome assembly to facilitate SNP identification, and the resulting polymorphisms were used to construct a genetic map. The resulting map included 2,008,196 genetically located SNPs in 1178 unique positions. A total of 57,270 scaffolds, each containing five or more mapped SNPs, were anchored to the map. This resulted in the assignment of sequence covering 14% of the expected genome length to a genetic position. Comparison of this safflower map to genetic maps of sunflower and lettuce revealed numerous chromosomal rearrangements, and the resulting patterns were consistent with a whole-genome duplication event in the lineage leading to sunflower. This sequence-based genetic map provides a powerful tool for the assembly of a low-cost draft genome of safflower, and the same general approach is expected to work for other species.

  7. Phased Whole-Genome Genetic Risk in a Family Quartet Using a Major Allele Reference Sequence

    PubMed Central

    Dewey, Frederick E.; Chen, Rong; Cordero, Sergio P.; Ormond, Kelly E.; Caleshu, Colleen; Karczewski, Konrad J.; Whirl-Carrillo, Michelle; Wheeler, Matthew T.; Dudley, Joel T.; Byrnes, Jake K.; Cornejo, Omar E.; Knowles, Joshua W.; Woon, Mark; Sangkuhl, Katrin; Gong, Li; Thorn, Caroline F.; Hebert, Joan M.; Capriotti, Emidio; David, Sean P.; Pavlovic, Aleksandra; West, Anne; Thakuria, Joseph V.; Ball, Madeleine P.; Zaranek, Alexander W.; Rehm, Heidi L.; Church, George M.; West, John S.; Bustamante, Carlos D.; Snyder, Michael; Altman, Russ B.; Klein, Teri E.; Butte, Atul J.; Ashley, Euan A.

    2011-01-01

    Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (<1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing. PMID:21935354

  8. The Mutational Landscape in Pediatric Acute Lymphoblastic Leukemia Deciphered by Whole Genome Sequencing

    PubMed Central

    Lindqvist, Carl Mårten; Nordlund, Jessica; Ekman, Diana; Johansson, Anna; Moghadam, Behrooz Torabi; Raine, Amanda; Övernäs, Elin; Dahlberg, Johan; Wahlberg, Per; Henriksson, Niklas; Abrahamsson, Jonas; Frost, Britt-Marie; Grandér, Dan; Heyman, Mats; Larsson, Rolf; Palle, Josefine; Söderhäll, Stefan; Forestier, Erik; Lönnerholm, Gudmar; Syvänen, Ann-Christine; Berglund, Eva C

    2015-01-01

    Genomic characterization of pediatric acute lymphoblastic leukemia (ALL) has identified distinct patterns of genes and pathways altered in patients with well-defined genetic aberrations. To extend the spectrum of known somatic variants in ALL, we performed whole genome and transcriptome sequencing of three B-cell precursor patients, of which one carried the t(12;21)ETV6-RUNX1 translocation and two lacked a known primary genetic aberration, and one T-ALL patient. We found that each patient had a unique genome, with a combination of well-known and previously undetected genomic aberrations. By targeted sequencing in 168 patients, we identified KMT2D and KIF1B as novel putative driver genes. We also identified a putative regulatory non-coding variant that coincided with overexpression of the growth factor MDK. Our results contribute to an increased understanding of the biological mechanisms that lead to ALL and suggest that regulatory variants may be more important for cancer development than recognized to date. The heterogeneity of the genetic aberrations in ALL renders whole genome sequencing particularly well suited for analysis of somatic variants in both research and diagnostic applications. PMID:25355294

  9. Whole-genome sequence comparison as a method for improving bacterial species definition.

    PubMed

    Zhang, Wen; Du, Pengcheng; Zheng, Han; Yu, Weiwen; Wan, Li; Chen, Chen

    2014-01-01

    We compared pairs of 1,226 bacterial strains with whole genome sequences and calculated their average nucleotide identity (ANI) between genomes to determine whether whole genome comparison can be directly used for bacterial species definition. We found that genome comparisons of two bacterial strains from the same species (SGC) have a significantly higher ANI than those of two strains from different species (DGC), and that the ANI between the query and the reference genomes can be used to determine whether two genomes come from the same species. Bacterial species definition based on ANI with a cut-off value of 0.92 matched well (81.5%) with the current bacterial species definition. The ANI value was shown to be consistent with the standard for traditional bacterial species definition, and it could be used in bacterial taxonomy for species definition. A new bioinformatics program (ANItools) was also provided in this study for users to obtain the ANI value of any two bacterial genome pairs (http://genome.bioinfo-icdc.org/). This program can match a query strain to all bacterial genomes, and identify the highest ANI value of the strain at the species, genus and family levels respectively, providing valuable insights for species definition.
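
    The species rule described in this record can be sketched as a simple threshold test. The example below is a minimal illustration, assuming that per-fragment identities (fractions between 0 and 1) have already been obtained from a pairwise genome alignment; it is not the ANItools implementation, and the function names and sample values are hypothetical. Only the 0.92 cut-off comes from the abstract.

```python
from statistics import mean

ANI_CUTOFF = 0.92  # cut-off value reported in the abstract


def average_nucleotide_identity(fragment_identities):
    """Mean identity over aligned fragments, each a fraction in [0, 1].

    Assumption: the fragment identities were produced beforehand by an
    alignment step that is not shown here.
    """
    if not fragment_identities:
        raise ValueError("no aligned fragments supplied")
    return mean(fragment_identities)


def same_species(ani, cutoff=ANI_CUTOFF):
    """Apply the threshold rule: ANI at or above the cut-off means same species."""
    return ani >= cutoff


if __name__ == "__main__":
    identities = [0.97, 0.95, 0.99, 0.96]  # hypothetical fragment identities
    ani = average_nucleotide_identity(identities)
    print(round(ani, 4), same_species(ani))
```

    The abstract reports that this rule matched the current species definition in 81.5% of cases, so a threshold test of this kind is best read as a screening aid, not a final taxonomic call.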

  10. Whole genome sequence typing to investigate the Apophysomyces outbreak following a tornado in Joplin, Missouri, 2011.

    PubMed

    Etienne, Kizee A; Gillece, John; Hilsabeck, Remy; Schupp, Jim M; Colman, Rebecca; Lockhart, Shawn R; Gade, Lalitha; Thompson, Elizabeth H; Sutton, Deanna A; Neblett-Fanfair, Robyn; Park, Benjamin J; Turabelidze, George; Keim, Paul; Brandt, Mary E; Deak, Eszter; Engelthaler, David M

    2012-01-01

    Case reports of Apophysomyces spp. in immunocompetent hosts have been a result of traumatic deep implantation of Apophysomyces spp. spore-contaminated soil or debris. On May 22, 2011 a tornado occurred in Joplin, MO, leaving 13 tornado victims with Apophysomyces trapeziformis infections as a result of lacerations from airborne material. We used whole genome sequence typing (WGST) for high-resolution phylogenetic SNP analysis of 17 outbreak Apophysomyces isolates and five additional temporally and spatially diverse Apophysomyces control isolates (three A. trapeziformis and two A. variabilis isolates). Whole genome SNP phylogenetic analysis revealed three clusters of genotypically related or identical A. trapeziformis isolates and multiple distinct isolates among the Joplin group; this indicated multiple genotypes from a single or multiple sources. Though no linkage between genotype and location of exposure was observed, WGST analysis determined that the Joplin isolates were more closely related to each other than to the control isolates, suggesting local population structure. Additionally, species delineation based on WGST demonstrated the need to reassess currently accepted taxonomic classifications of phylogenetic species within the genus Apophysomyces.

  11. From days to hours: reporting clinically actionable variants from whole genome sequencing.

    PubMed

    Middha, Sumit; Baheti, Saurabh; Hart, Steven N; Kocher, Jean-Pierre A

    2014-01-01

    As the cost of whole genome sequencing (WGS) decreases, clinical laboratories will be looking at broadly adopting this technology to screen for variants of clinical significance. To fully leverage this technology in a clinical setting, results need to be reported quickly, as the turnaround time could potentially impact patient care. The latest sequencers can sequence a whole human genome in about 24 hours. However, depending on the computing infrastructure available, the processing of data can take several days, with the majority of computing time devoted to aligning reads to genomic regions that are to date not clinically interpretable. In an attempt to accelerate the reporting of clinically actionable variants, we have investigated the utility of a multi-step alignment algorithm focused on aligning reads and calling variants in genomic regions of clinical relevance prior to processing the remaining reads on the whole genome. This iterative workflow significantly accelerates the reporting of clinically actionable variants with no loss of accuracy when compared to genotypes obtained with the OMNI SNP platform or to variants detected with a standard workflow that combines Novoalign and GATK. PMID:24505267

  12. A comprehensive whole-genome integrated cytogenetic map for the alpaca (Lama pacos).

    PubMed

    Avila, Felipe; Baily, Malorie P; Perelman, Polina; Das, Pranab J; Pontius, Joan; Chowdhary, Renuka; Owens, Elaine; Johnson, Warren E; Merriwether, David A; Raudsepp, Terje

    2014-01-01

    Genome analysis of the alpaca (Lama pacos, LPA) has progressed slowly compared to other domestic species. Here, we report the development of the first comprehensive whole-genome integrated cytogenetic map for the alpaca using fluorescence in situ hybridization (FISH) and CHORI-246 BAC library clones. The map is comprised of 230 linearly ordered markers distributed among all 36 alpaca autosomes and the sex chromosomes. For the first time, markers were assigned to LPA14, 21, 22, 28, and 36. Additionally, 86 genes from 15 alpaca chromosomes were mapped in the dromedary camel (Camelus dromedarius, CDR), demonstrating exceptional synteny and linkage conservation between the 2 camelid genomes. Cytogenetic mapping of 191 protein-coding genes improved and refined the known Zoo-FISH homologies between camelids and humans: we discovered new homologous synteny blocks (HSBs) corresponding to HSA1-LPA/CDR11, HSA4-LPA/CDR31 and HSA7-LPA/CDR36, and revised the location of breakpoints for others. Overall, gene mapping was in good agreement with the Zoo-FISH and revealed remarkable evolutionary conservation of gene order within many human-camelid HSBs. Most importantly, 91 FISH-mapped markers effectively integrated the alpaca whole-genome sequence and the radiation hybrid maps with physical chromosomes, thus facilitating the improvement of the sequence assembly and the discovery of genes of biological importance.

  13. Computel: computation of mean telomere length from whole-genome next-generation sequencing data.

    PubMed

    Nersisyan, Lilit; Arakelyan, Arsen

    2015-01-01

    Telomeres are the ends of eukaryotic chromosomes, consisting of consecutive short repeats that protect chromosome ends from degradation. Telomeres shorten with each cell division, leading to replicative cell senescence. Deregulation of telomere length homeostasis is associated with the development of various age-related diseases and cancers. A number of experimental techniques exist for telomere length measurement; however, until recently, the absence of tools for extracting telomere lengths from high-throughput sequencing data has significantly obscured the association of telomere length with molecular processes in normal and diseased conditions. We have developed Computel, a program in R for computing mean telomere length from whole-genome next-generation sequencing data. Computel is open source, and is freely available at https://github.com/lilit-nersisyan/computel. It utilizes a short-read alignment-based approach and integrates various popular tools for sequencing data analysis. We validated it with synthetic and experimental data, and compared its performance with the previously available software. The results have shown that Computel outperforms existing software in accuracy, independence of results from sequencing conditions, stability against inherent sequencing errors, and better ability to distinguish pure telomeric sequences from interstitial telomeric repeats. By providing a highly reliable methodology for determining telomere lengths from whole-genome sequencing data, Computel should help to elucidate the role of telomeres in cellular health and disease.

  14. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    PubMed Central

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution [1,2]. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes [3,4]. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  15. Whole-genome sequencing of a malignant granular cell tumor with metabolic response to pazopanib

    PubMed Central

    Wei, Lei; Liu, Song; Conroy, Jeffrey; Wang, Jianmin; Papanicolau-Sengos, Antonios; Glenn, Sean T.; Murakami, Mitsuko; Liu, Lu; Hu, Qiang; Conroy, Jacob; Miles, Kiersten Marie; Nowak, David E.; Liu, Biao; Qin, Maochun; Bshara, Wiam; Omilian, Angela R.; Head, Karen; Bianchi, Michael; Burgher, Blake; Darlak, Christopher; Kane, John; Merzianu, Mihai; Cheney, Richard; Fabiano, Andrew; Salerno, Kilian; Talati, Chetasi; Khushalani, Nikhil I.; Trump, Donald L.; Johnson, Candace S.; Morrison, Carl D.

    2015-01-01

    Granular cell tumors are an uncommon soft tissue neoplasm. Malignant granular cell tumors comprise <2% of all granular cell tumors, are associated with aggressive behavior and poor clinical outcome, and are poorly understood in terms of tumor etiology and systematic treatment. Because of its rarity, the genetic basis of malignant granular cell tumor remains unknown. We performed whole-genome sequencing of one malignant granular cell tumor with metabolic response to pazopanib. This tumor exhibited a very low mutation rate and an overall stable genome with local complex rearrangements. The mutation signature was dominated by C>T transitions, particularly when immediately preceded by a 5′ G. A loss-of-function mutation was detected in a newly recognized tumor suppressor candidate, BRD7. No mutations were found in known targets of pazopanib. However, we identified a receptor tyrosine kinase pathway mutation in GFRA2 that warrants further evaluation. To the best of our knowledge, this is only the second reported case of a malignant granular cell tumor exhibiting a response to pazopanib, and the first whole-genome sequencing of this uncommon tumor type. The findings provide insight into the genetic basis of malignant granular cell tumors and identify potential targets for further investigation. PMID:27148567

  16. Sequence-based physical mapping of complex genomes by whole genome profiling.

    PubMed

    van Oeveren, Jan; de Ruiter, Marjo; Jesse, Taco; van der Poel, Hein; Tang, Jifeng; Yalcin, Feyruz; Janssen, Antoine; Volpin, Hanne; Stormo, Keith E; Bogden, Robert; van Eijk, Michiel J T; Prins, Marcel

    2011-04-01

    We present whole genome profiling (WGP), a novel next-generation sequencing-based physical mapping technology for construction of bacterial artificial chromosome (BAC) contigs of complex genomes, using Arabidopsis thaliana as an example. WGP leverages short read sequences derived from restriction fragments of two-dimensionally pooled BAC clones to generate sequence tags. These sequence tags are assigned to individual BAC clones, followed by assembly of BAC contigs based on shared regions containing identical sequence tags. Following in silico analysis of WGP sequence tags and simulation of a map of Arabidopsis chromosome 4 and maize, a WGP map of Arabidopsis thaliana ecotype Columbia was constructed de novo using a six-genome equivalent BAC library. Validation of the WGP map using the Columbia reference sequence confirmed that 350 BAC contigs (98%) were assembled correctly, spanning 97% of the 102-Mb calculated genome coverage. We demonstrate that WGP maps can also be generated for more complex plant genomes and will serve as excellent scaffolds to anchor genetic linkage maps and integrate whole genome sequence data.
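
    The contig-building step described above (grouping BAC clones that share identical sequence tags) can be illustrated with a small Python sketch. It is a deliberate simplification and not the WGP software: it uses single-linkage clustering via union-find, does not order clones within a contig, and the `min_shared` threshold, function name, and clone/tag names are hypothetical.

```python
from itertools import combinations


def build_contigs(bac_tags, min_shared=1):
    """Group BAC clones into contigs when they share sequence tags.

    `bac_tags` maps a BAC clone name to the set of sequence tags assigned
    to it. Two clones are linked when they share at least `min_shared`
    tags (an assumed threshold); linked clones end up in one contig.
    """
    bacs = sorted(bac_tags)
    parent = {b: b for b in bacs}

    def find(b):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    # Link every pair of clones with enough tags in common.
    for a, b in combinations(bacs, 2):
        if len(bac_tags[a] & bac_tags[b]) >= min_shared:
            parent[find(a)] = find(b)

    # Collect clones by their root to form the contigs.
    groups = {}
    for b in bacs:
        groups.setdefault(find(b), []).append(b)
    return sorted(groups.values())


# Hypothetical clones and tags, for illustration only.
example = {
    "BAC1": {"tag1", "tag2"},
    "BAC2": {"tag2", "tag3"},
    "BAC3": {"tag9"},
}
print(build_contigs(example))
```

    The pairwise comparison is quadratic in the number of clones, which is fine for a sketch; a real library would likely index clones by tag first.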

  17. Epigenetic regulation of subgenome dominance following whole genome triplication in Brassica rapa.

    PubMed

    Cheng, Feng; Sun, Chao; Wu, Jian; Schnable, James; Woodhouse, Margaret R; Liang, Jianli; Cai, Chengcheng; Freeling, Michael; Wang, Xiaowu

    2016-07-01

    Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species. PMID:26871271

  18. Whole Genome Duplication Affects Evolvability of Flowering Time in an Autotetraploid Plant

    PubMed Central

    Martin, Sara L.; Husband, Brian C.

    2012-01-01

    Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids (0.31) than diploids (0.40). Neotetraploids exhibited the highest evolutionary response (0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes. PMID:23028620

  19. Whole genome duplication affects evolvability of flowering time in an autotetraploid plant.

    PubMed

    Martin, Sara L; Husband, Brian C

    2012-01-01

    Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids (^b(T) = 0.31) than diploids (^b(T) = 0.40). Neotetraploids exhibited the highest evolutionary response (^b(T) = 0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes. PMID:23028620
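
    For readers unfamiliar with the term, realized heritability is conventionally obtained from the breeder's equation, R = h² S, as the ratio of the response to selection (R) to the selection differential (S). The Python sketch below shows only that textbook ratio; it is not the estimation procedure used in the study, and the function name and the flowering-time numbers are hypothetical.

```python
def realized_heritability(response, selection_differential):
    """Return realized heritability h^2 = R / S (breeder's equation).

    `response` (R) is the change in the population mean across one
    generation of selection; `selection_differential` (S) is the difference
    between the mean of the selected parents and the mean of the whole
    parental population. Both must be in the same units.
    """
    if selection_differential == 0:
        raise ValueError("selection differential must be non-zero")
    return response / selection_differential


# Hypothetical values: selected parents flower 5 days earlier than the
# population mean, and their offspring flower 2 days earlier.
print(realized_heritability(-2.0, -5.0))
```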

  20. Independent Evolution of Winner Traits without Whole Genome Duplication in Dekkera Yeasts.

    PubMed

    Guo, Yi-Cheng; Zhang, Lin; Dai, Shao-Xing; Li, Wen-Xing; Zheng, Jun-Juan; Li, Gong-Hua; Huang, Jing-Fei

    2016-01-01

    Dekkera yeasts have often been considered as alternative sources of ethanol production that could compete with S. cerevisiae. The two lineages of yeasts independently evolved traits that include high glucose and ethanol tolerance, aerobic fermentation, and a rapid ethanol fermentation rate. The Saccharomyces yeasts attained these traits mainly through whole genome duplication approximately 100 million years ago (Mya). However, the Dekkera yeasts, which were separated from S. cerevisiae approximately 200 Mya, did not undergo whole genome duplication (WGD) but still occupy a niche similar to S. cerevisiae. Upon analysis of two Dekkera yeasts and five closely related non-WGD yeasts, we found that a massive loss of cis-regulatory elements occurred in an ancestor of the Dekkera yeasts, which led to improved mitochondrial functions similar to the S. cerevisiae yeasts. The evolutionary analysis indicated that genes involved in the transcription and translation process exhibited faster evolution in the Dekkera yeasts. We detected 90 positively selected genes, suggesting that the Dekkera yeasts evolved an efficient translation system to facilitate adaptive evolution. Moreover, we identified 12 vacuolar H+-ATPase (V-ATPase) function genes that were under positive selection, which assists in developing tolerance to high alcohol and high sugar stress. We also revealed that the enzyme PGK1 is responsible for the increased rate of glycolysis in the Dekkera yeasts. These results provide important insights to understand the independent adaptive evolution of the Dekkera yeasts and provide tools for genetic modification promoting industrial usage. PMID:27152421

  1. Epigenetic regulation of subgenome dominance following whole genome triplication in Brassica rapa.

    PubMed

    Cheng, Feng; Sun, Chao; Wu, Jian; Schnable, James; Woodhouse, Margaret R; Liang, Jianli; Cai, Chengcheng; Freeling, Michael; Wang, Xiaowu

    2016-07-01

    Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species.

  2. Comparative Genometrics (CG): a database dedicated to biometric comparisons of whole genomes

    PubMed Central

    Roten, Claude-Alain H.; Gamba, Patrick; Barblan, Jean-Luc; Karamata, Dimitri

    2002-01-01

    The ever increasing rate at which whole genome sequences are becoming accessible to the scientific community has created an urgent need for tools enabling comparison of chromosomes of different species. We have applied biometric methods to available chromosome sequences and posted the results on our Comparative Genometrics (CG) web site. By genometrics, a term coined by Elston and Wilson [Genet. Epidemiol. (1990), 7, 17–19], we understand a biometric analysis of chromosomes. During the initial phase, our web site displays, for all completely sequenced prokaryotic genomes, three genometric analyses: the DNA walk [Lobry (1999) Microbiology Today, 26, 164–165] and two complementary representations, i.e. the cumulative GC- and TA-skew analyses, capable of identifying, at the level of whole genomes, features inherent to chromosome organization and functioning. It appears that the latter features are taxon-specific. Although primarily focused on prokaryotic chromosomes, the CG web site contains genometric information on paradigm plasmids, phages, viruses and eukaryotic organelles. Relevant data and methods can be readily used by the scientific community for further analyses as well as for tutorial purposes. Our data posted at the CG web site are freely available on the World Wide Web at http://www.unil.ch/comparativegenometrics. PMID:11752276
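
    The cumulative skew analyses mentioned in this abstract can be sketched as follows. The abstract does not give a formula, so this assumes the commonly used windowed definition, skew = (X − Y) / (X + Y) with X, Y = G, C for GC skew and T, A for TA skew, summed along the sequence. The function name, window size, and toy sequences are illustrative, and the sketch is not the code behind the CG web site.

```python
def cumulative_skew(seq, first, second, window=1000):
    """Return the running sum of per-window skew along a sequence.

    For each non-overlapping window the skew is
    (count(first) - count(second)) / (count(first) + count(second)),
    taken as 0.0 when neither base occurs. Use first="G", second="C"
    for GC skew and first="T", second="A" for TA skew.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    running_total = 0.0
    cumulative = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        x = chunk.count(first)
        y = chunk.count(second)
        skew = (x - y) / (x + y) if (x + y) else 0.0
        running_total += skew
        cumulative.append(running_total)
    return cumulative


# Toy sequences; real analyses run over whole chromosomes.
print(cumulative_skew("GGGGCCCC", "G", "C", window=4))
print(cumulative_skew("TTAA", "T", "A", window=2))
```

    In plots of such curves for many bacterial chromosomes, the extremes of the cumulative GC skew are often used to suggest the positions of the replication origin and terminus.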

  3. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    PubMed

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C·G>T·A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  4. Genetic Mapping of Millions of SNPs in Safflower (Carthamus tinctorius L.) via Whole-Genome Resequencing.

    PubMed

    Bowers, John E; Pearl, Stephanie A; Burke, John M

    2016-01-01

    Accurate assembly of complete genomes is facilitated by very high density genetic maps. We performed low-coverage, whole-genome shotgun sequencing on 96 F6 recombinant inbred lines (RILs) of a cross between safflower (Carthamus tinctorius L.) and its wild progenitor (C. palaestinus Eig). We also produced a draft genome assembly of C. tinctorius covering 866 million bp (∼two-thirds) of the expected 1.35 Gbp genome after sequencing a single, short insert library to ∼21 × depth. Sequence reads from the RILs were mapped to this genome assembly to facilitate SNP identification, and the resulting polymorphisms were used to construct a genetic map. The resulting map included 2,008,196 genetically located SNPs in 1178 unique positions. A total of 57,270 scaffolds, each containing five or more mapped SNPs, were anchored to the map. This resulted in the assignment of sequence covering 14% of the expected genome length to a genetic position. Comparison of this safflower map to genetic maps of sunflower and lettuce revealed numerous chromosomal rearrangements, and the resulting patterns were consistent with a whole-genome duplication event in the lineage leading to sunflower. This sequence-based genetic map provides a powerful tool for the assembly of a low-cost draft genome of safflower, and the same general approach is expected to work for other species. PMID:27226165

  5. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    PubMed Central

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.; Hallström, Björn M.; Liu, Zihe; Petranovic, Dina; Uhlén, Mathias; Joensson, Haakan N.; Andersson-Svahn, Helene; Nielsen, Jens

    2015-01-01

    There is an increasing demand for biotech-based production of recombinant proteins for use as pharmaceuticals, in the food and feed industry, and in industrial applications. The yeast Saccharomyces cerevisiae is among the preferred cell factories for recombinant protein production, and there is increasing interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant α-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330 mutations in total. Gene ontology analysis of mutated genes revealed many biological processes, including some that have not been identified before in the context of protein secretion. Mutated genes identified in this study can be potentially used for reverse metabolic engineering, with the objective to construct efficient cell factories for protein secretion. The combined use of microfluidics screening and whole-genome sequencing to map the mutations associated with the improved phenotype can easily be adapted for other products and cell types to identify novel engineering targets, and this approach could broadly facilitate design of novel cell factories. PMID:26261321

  6. The mutational landscape in pediatric acute lymphoblastic leukemia deciphered by whole genome sequencing.

    PubMed

    Lindqvist, Carl Mårten; Nordlund, Jessica; Ekman, Diana; Johansson, Anna; Moghadam, Behrooz Torabi; Raine, Amanda; Övernäs, Elin; Dahlberg, Johan; Wahlberg, Per; Henriksson, Niklas; Abrahamsson, Jonas; Frost, Britt-Marie; Grandér, Dan; Heyman, Mats; Larsson, Rolf; Palle, Josefine; Söderhäll, Stefan; Forestier, Erik; Lönnerholm, Gudmar; Syvänen, Ann-Christine; Berglund, Eva C

    2015-01-01

    Genomic characterization of pediatric acute lymphoblastic leukemia (ALL) has identified distinct patterns of genes and pathways altered in patients with well-defined genetic aberrations. To extend the spectrum of known somatic variants in ALL, we performed whole genome and transcriptome sequencing of three B-cell precursor patients, of which one carried the t(12;21)ETV6-RUNX1 translocation and two lacked a known primary genetic aberration, and one T-ALL patient. We found that each patient had a unique genome, with a combination of well-known and previously undetected genomic aberrations. By targeted sequencing in 168 patients, we identified KMT2D and KIF1B as novel putative driver genes. We also identified a putative regulatory non-coding variant that coincided with overexpression of the growth factor MDK. Our results contribute to an increased understanding of the biological mechanisms that lead to ALL and suggest that regulatory variants may be more important for cancer development than recognized to date. The heterogeneity of the genetic aberrations in ALL renders whole genome sequencing particularly well suited for analysis of somatic variants in both research and diagnostic applications. PMID:25355294

  7. Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer.

    PubMed

    Fujimoto, Akihiro; Furuta, Mayuko; Totoki, Yasushi; Tsunoda, Tatsuhiko; Kato, Mamoru; Shiraishi, Yuichi; Tanaka, Hiroko; Taniguchi, Hiroaki; Kawakami, Yoshiiku; Ueno, Masaki; Gotoh, Kunihito; Ariizumi, Shun-Ichi; Wardell, Christopher P; Hayami, Shinya; Nakamura, Toru; Aikata, Hiroshi; Arihiro, Koji; Boroevich, Keith A; Abe, Tetsuo; Nakano, Kaoru; Maejima, Kazuhiro; Sasaki-Oku, Aya; Ohsawa, Ayako; Shibuya, Tetsuo; Nakamura, Hiromi; Hama, Natsuko; Hosoda, Fumie; Arai, Yasuhito; Ohashi, Shoko; Urushidate, Tomoko; Nagae, Genta; Yamamoto, Shogo; Ueda, Hiroki; Tatsuno, Kenji; Ojima, Hidenori; Hiraoka, Nobuyoshi; Okusaka, Takuji; Kubo, Michiaki; Marubashi, Shigeru; Yamada, Terumasa; Hirano, Satoshi; Yamamoto, Masakazu; Ohdan, Hideki; Shimada, Kazuaki; Ishikawa, Osamu; Yamaue, Hiroki; Chayama, Kazuki; Miyano, Satoru; Aburatani, Hiroyuki; Shibata, Tatsuhiro; Nakagawa, Hidewaki

    2016-05-01

    Liver cancer, which is most often associated with virus infection, is prevalent worldwide, and its underlying etiology and genomic structure are heterogeneous. Here we provide a whole-genome landscape of somatic alterations in 300 liver cancers from Japanese individuals. Our comprehensive analysis identified point mutations, structural variations (STVs), and virus integrations, in noncoding and coding regions. We discovered mutational signatures related to liver carcinogenesis and recurrently mutated coding and noncoding regions, such as long intergenic noncoding RNA genes (NEAT1 and MALAT1), promoters, CTCF-binding sites, and regulatory regions. STV analysis found a significant association with replication timing and identified known (CDKN2A, CCND1, APC, and TERT) and new (ASH1L, NCOR1, and MACROD2) cancer-related genes that were recurrently affected by STVs, leading to altered expression. These results emphasize the value of whole-genome sequencing analysis in discovering cancer driver mutations and understanding comprehensive molecular profiles of liver cancer, especially with regard to STVs and noncoding mutations. PMID:27064257

  8. Whole genome sequence typing to investigate the Apophysomyces outbreak following a tornado in Joplin, Missouri, 2011.

    PubMed

    Etienne, Kizee A; Gillece, John; Hilsabeck, Remy; Schupp, Jim M; Colman, Rebecca; Lockhart, Shawn R; Gade, Lalitha; Thompson, Elizabeth H; Sutton, Deanna A; Neblett-Fanfair, Robyn; Park, Benjamin J; Turabelidze, George; Keim, Paul; Brandt, Mary E; Deak, Eszter; Engelthaler, David M

    2012-01-01

    Case reports of Apophysomyces spp. in immunocompetent hosts have been a result of traumatic deep implantation of Apophysomyces spp. spore-contaminated soil or debris. On May 22, 2011 a tornado occurred in Joplin, MO, leaving 13 tornado victims with Apophysomyces trapeziformis infections as a result of lacerations from airborne material. We used whole genome sequence typing (WGST) for high-resolution phylogenetic SNP analysis of 17 outbreak Apophysomyces isolates and five additional temporally and spatially diverse Apophysomyces control isolates (three A. trapeziformis and two A. variabilis isolates). Whole genome SNP phylogenetic analysis revealed three clusters of genotypically related or identical A. trapeziformis isolates and multiple distinct isolates among the Joplin group; this indicated multiple genotypes from a single or multiple sources. Though no linkage between genotype and location of exposure was observed, WGST analysis determined that the Joplin isolates were more closely related to each other than to the control isolates, suggesting local population structure. Additionally, species delineation based on WGST demonstrated the need to reassess currently accepted taxonomic classifications of phylogenetic species within the genus Apophysomyces. PMID:23209631

  9. Sequence to Medical Phenotypes: A Framework for Interpretation of Human Whole Genome DNA Sequence Data

    PubMed Central

    Dewey, Frederick E.; Grove, Megan E.; Priest, James R.; Waggott, Daryl; Batra, Prag; Miller, Clint L.; Wheeler, Matthew; Zia, Amin; Pan, Cuiping; Karzcewski, Konrad J.; Miyake, Christina; Whirl-Carrillo, Michelle; Klein, Teri E.; Datta, Somalee; Altman, Russ B.; Snyder, Michael; Quertermous, Thomas; Ashley, Euan A.

    2015-01-01

    High throughput sequencing has facilitated a precipitous drop in the cost of genomic sequencing, prompting predictions of a revolution in medicine via genetic personalization of diagnostic and therapeutic strategies. There are significant barriers to realizing this goal that are related to the difficult task of interpreting personal genetic variation. A comprehensive, widely accessible application for interpretation of whole genome sequence data is needed. Here, we present a series of methods for identification of genetic variants and genotypes with clinical associations, phasing genetic data and using Mendelian inheritance for quality control, and providing predictive genetic information about risk for rare disease phenotypes and response to pharmacological therapy in single individuals and father-mother-child trios. We demonstrate application of these methods for disease and drug response prognostication in whole genome sequence data from twelve unrelated adults, and for disease gene discovery in one father-mother-child trio with apparently simplex congenital ventricular arrhythmia. In doing so we identify clinically actionable inherited disease risk and drug response genotypes in pre-symptomatic individuals. We also nominate a new candidate gene in congenital arrhythmia, ATP2B4, and provide experimental evidence of a regulatory role for variants discovered using this framework. PMID:26448358

  10. Parallel single cancer cell whole genome amplification using button-valve assisted mixing in nanoliter chambers.

    PubMed

    Yang, Yoonsun; Swennenhuis, Joost F; Rho, Hoon Suk; Le Gac, Séverine; Terstappen, Leon W M M

    2014-01-01

    The heterogeneity of tumor cells and their alteration during the course of the disease urges the need for real time characterization of individual tumor cells to improve the assessment of treatment options. New generations of therapies are frequently associated with specific genetic alterations driving the need to determine the genetic makeup of tumor cells. Here, we present a microfluidic device for parallel single cell whole genome amplification (pscWGA) to obtain enough copies of a single cell genome to probe for the presence of treatment targets and the frequency of its occurrence among the tumor cells. Individual cells were first captured and loaded into eight parallel amplification units. Next, cells were lysed on a chip and their DNA amplified through successive introduction of dedicated reagents while mixing actively with the help of integrated button-valves. The reaction chamber volume for scWGA was 23.85 nl, and starting from 6-7 pg DNA contained in a single cell, around 8 ng of DNA was obtained after WGA, representing over 1000-fold amplification. The amplified products from individual breast cancer cells were collected from the device to either directly investigate the amplification of specific genes by qPCR or for re-amplification of the DNA to obtain sufficient material for whole genome sequencing. Our pscWGA device provides sufficient DNA from individual cells for their genetic characterization, and will undoubtedly allow for automated sample preparation for single cancer cell genomic characterization.

  11. Parallel Single Cancer Cell Whole Genome Amplification Using Button-Valve Assisted Mixing in Nanoliter Chambers

    PubMed Central

    Yang, Yoonsun; Swennenhuis, Joost F.; Rho, Hoon Suk; Le Gac, Séverine; Terstappen, Leon W. M. M.

    2014-01-01

    The heterogeneity of tumor cells and their alteration during the course of the disease urges the need for real time characterization of individual tumor cells to improve the assessment of treatment options. New generations of therapies are frequently associated with specific genetic alterations driving the need to determine the genetic makeup of tumor cells. Here, we present a microfluidic device for parallel single cell whole genome amplification (pscWGA) to obtain enough copies of a single cell genome to probe for the presence of treatment targets and the frequency of its occurrence among the tumor cells. Individual cells were first captured and loaded into eight parallel amplification units. Next, cells were lysed on a chip and their DNA amplified through successive introduction of dedicated reagents while mixing actively with the help of integrated button-valves. The reaction chamber volume for scWGA was 23.85 nl, and starting from 6–7 pg DNA contained in a single cell, around 8 ng of DNA was obtained after WGA, representing over 1000-fold amplification. The amplified products from individual breast cancer cells were collected from the device to either directly investigate the amplification of specific genes by qPCR or for re-amplification of the DNA to obtain sufficient material for whole genome sequencing. Our pscWGA device provides sufficient DNA from individual cells for their genetic characterization, and will undoubtedly allow for automated sample preparation for single cancer cell genomic characterization. PMID:25233459
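
    The amplification figure quoted in this abstract can be checked with simple arithmetic: fold amplification is the recovered DNA mass divided by the starting mass. The following is a minimal sketch, assuming only the 6-7 pg input and roughly 8 ng yield stated above; the function name is illustrative and not taken from the paper.

```python
def fold_amplification(input_pg: float, output_ng: float) -> float:
    """Return output mass divided by input mass, both expressed in picograms."""
    output_pg = output_ng * 1000.0  # 1 ng = 1000 pg
    return output_pg / input_pg


# Input of 6-7 pg per single cell and a yield of about 8 ng, as quoted in the abstract.
low = fold_amplification(7.0, 8.0)   # largest input gives the smallest fold change
high = fold_amplification(6.0, 8.0)  # smallest input gives the largest fold change
print(f"{low:.0f}- to {high:.0f}-fold")  # prints "1143- to 1333-fold"
```

    Both ends of the range exceed 1000, which is consistent with the "over 1000-fold" statement.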

  12. Phylogenetic discovery bias in Bacillus anthracis using single-nucleotide polymorphisms from whole-genome sequencing

    PubMed Central

    Pearson, Talima; Busch, Joseph D.; Ravel, Jacques; Read, Timothy D.; Rhoton, Shane D.; U'Ren, Jana M.; Simonson, Tatum S.; Kachur, Sergey M.; Leadem, Rebecca R.; Cardon, Michelle L.; Van Ert, Matthew N.; Huynh, Lynn Y.; Fraser, Claire M.; Keim, Paul

    2004-01-01

    Phylogenetic reconstruction using molecular data is often subject to homoplasy, leading to inaccurate conclusions about phylogenetic relationships among operational taxonomic units. Compared with other molecular markers, single-nucleotide polymorphisms (SNPs) exhibit extremely low mutation rates, making them rare in recently emerged pathogens, but they are less prone to homoplasy and thus extremely valuable for phylogenetic analyses. Despite their phylogenetic potential, ascertainment bias occurs when SNP characters are discovered through biased taxonomic sampling; by using whole-genome comparisons of five diverse strains of Bacillus anthracis to facilitate SNP discovery, we show that only polymorphisms lying along the evolutionary pathway between reference strains will be observed. We illustrate this in theoretical and simulated data sets in which complex phylogenetic topologies are reduced to linear evolutionary models. Using a set of 990 SNP markers, we also show how divergent branches in our topologies collapse to single points but provide accurate information on internodal distances and points of origin for ancestral clades. These data allowed us to determine the ancestral root of B. anthracis, showing that it lies closer to a newly described “C” branch than to either of two previously described “A” or “B” branches. In addition, subclade rooting of the C branch revealed unequal evolutionary rates that seem to be correlated with ecological parameters and strain attributes. Our use of nonhomoplastic whole-genome SNP characters allows branch points and clade membership to be estimated with great precision, providing greater insight into epidemiological, ecological, and forensic questions. PMID:15347815

  13. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    PubMed

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775

  14. Whole-genome sequencing of matched primary and metastatic hepatocellular carcinomas

    PubMed Central

    2014-01-01

    Background. To gain biological insights into lung metastases from hepatocellular carcinoma (HCC), we compared the whole-genome sequencing profiles of primary HCC and paired lung metastases. Methods. We used whole-genome sequencing at 33X-43X coverage to profile somatic mutations in primary HCC (HBV+) and metachronous lung metastases (> 2 years interval). Results. In total, 5,027-13,961 and 5,275-12,624 somatic single-nucleotide variants (SNVs) were detected in primary HCC and lung metastases, respectively. Generally, 38.88-78.49% of SNVs detected in metastases were present in primary tumors. We identified 65–221 structural variations (SVs) in primary tumors and 60–232 SVs in metastases. Comparison of these SVs shows very similar and largely overlapped mutated segments between primary and metastatic tumors. Copy number alterations between primary and metastatic pairs were also found to be closely related. Together, these preservations in genomic profiles from liver primary tumors to metachronous lung metastases indicate that the genomic features during tumorigenesis may be retained during metastasis. Conclusions. We found very similar genomic alterations between primary and metastatic tumors, with a few mutations found specifically in lung metastases, which may explain the clinical observation that both primary and metastatic tumors are usually sensitive or resistant to the same systemic treatments. PMID:24405831
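
    The percentage of metastasis SNVs that are also present in the primary tumor, reported above as 38.88-78.49%, is a set-overlap statistic. The sketch below shows one way such a figure could be computed, assuming each SNV is keyed by chromosome, position, reference base and alternate base. The variant calls are invented toy data and the function name is hypothetical; this is not the authors' pipeline.

```python
def shared_fraction(metastasis: set, primary: set) -> float:
    """Fraction of metastasis SNVs that are also called in the primary tumor."""
    if not metastasis:
        raise ValueError("metastasis call set is empty")
    return len(metastasis & primary) / len(metastasis)


# Toy SNV calls as (chromosome, position, ref, alt); invented for illustration.
primary = {
    ("chr1", 1000, "C", "T"),
    ("chr2", 2500, "G", "A"),
    ("chr7", 4200, "A", "G"),
}
metastasis = {
    ("chr1", 1000, "C", "T"),
    ("chr2", 2500, "G", "A"),
    ("chr9", 880, "T", "C"),
    ("chr12", 150, "G", "T"),
}

print(f"{100 * shared_fraction(metastasis, primary):.2f}%")  # prints "50.00%"
```

    Keying on all four fields means that two different substitutions at the same position are counted as distinct variants.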

  15. Identification of EMS-Induced Mutations in Drosophila melanogaster by Whole-Genome Sequencing

    PubMed Central

    Blumenstiel, Justin P.; Noll, Aaron C.; Griffiths, Jennifer A.; Perera, Anoja G.; Walton, Kendra N.; Gilliland, William D.; Hawley, R. Scott; Staehling-Hampton, Karen

    2009-01-01

    Next-generation methods for rapid whole-genome sequencing enable the identification of single-base-pair mutations in Drosophila by comparing a chromosome bearing a new mutation to the unmutagenized sequence. To validate this approach, we sought to identify the molecular lesion responsible for a recessive EMS-induced mutation affecting egg shell morphology by using Illumina next-generation sequencing. After obtaining sufficient sequence from larvae that were homozygous for either wild-type or mutant chromosomes, we obtained high-quality reads for base pairs composing ∼70% of the third chromosome of both DNA samples. We verified 103 single-base-pair changes between the two chromosomes. Nine changes were nonsynonymous mutations and two were nonsense mutations. One nonsense mutation was in a gene, encore, whose mutations produce an egg shell phenotype also observed in progeny of homozygous mutant mothers. Complementation analysis revealed that the chromosome carried a new functional allele of encore, demonstrating that one round of next-generation sequencing can identify the causative lesion for a phenotype of interest. This new method of whole-genome sequencing represents great promise for mutant mapping in flies, potentially replacing conventional methods. PMID:19307605

  16. Whole-genome sequencing of a malignant granular cell tumor with metabolic response to pazopanib.

    PubMed

    Wei, Lei; Liu, Song; Conroy, Jeffrey; Wang, Jianmin; Papanicolau-Sengos, Antonios; Glenn, Sean T; Murakami, Mitsuko; Liu, Lu; Hu, Qiang; Conroy, Jacob; Miles, Kiersten Marie; Nowak, David E; Liu, Biao; Qin, Maochun; Bshara, Wiam; Omilian, Angela R; Head, Karen; Bianchi, Michael; Burgher, Blake; Darlak, Christopher; Kane, John; Merzianu, Mihai; Cheney, Richard; Fabiano, Andrew; Salerno, Kilian; Talati, Chetasi; Khushalani, Nikhil I; Trump, Donald L; Johnson, Candace S; Morrison, Carl D

    2015-10-01

    Granular cell tumors are an uncommon soft tissue neoplasm. Malignant granular cell tumors comprise <2% of all granular cell tumors, are associated with aggressive behavior and poor clinical outcome, and are poorly understood in terms of tumor etiology and systematic treatment. Because of its rarity, the genetic basis of malignant granular cell tumor remains unknown. We performed whole-genome sequencing of one malignant granular cell tumor with metabolic response to pazopanib. This tumor exhibited a very low mutation rate and an overall stable genome with local complex rearrangements. The mutation signature was dominated by C>T transitions, particularly when immediately preceded by a 5' G. A loss-of-function mutation was detected in a newly recognized tumor suppressor candidate, BRD7. No mutations were found in known targets of pazopanib. However, we identified a receptor tyrosine kinase pathway mutation in GFRA2 that warrants further evaluation. To the best of our knowledge, this is only the second reported case of a malignant granular cell tumor exhibiting a response to pazopanib, and the first whole-genome sequencing of this uncommon tumor type. The findings provide insight into the genetic basis of malignant granular cell tumors and identify potential targets for further investigation. PMID:27148567

  17. Identification of a novel salt tolerance gene in wild soybean by whole-genome sequencing.

    PubMed

    Qi, Xinpeng; Li, Man-Wah; Xie, Min; Liu, Xin; Ni, Meng; Shao, Guihua; Song, Chi; Kay-Yuen Yim, Aldrin; Tao, Ye; Wong, Fuk-Ling; Isobe, Sachiko; Wong, Chi-Fai; Wong, Kwong-Sen; Xu, Chunyan; Li, Chunqing; Wang, Ying; Guan, Rui; Sun, Fengming; Fan, Guangyi; Xiao, Zhixia; Zhou, Feng; Phang, Tsui-Hung; Liu, Xuan; Tong, Suk-Wah; Chan, Ting-Fung; Yiu, Siu-Ming; Tabata, Satoshi; Wang, Jian; Xu, Xun; Lam, Hon-Ming

    2014-07-09

    Using a whole-genome-sequencing approach to explore germplasm resources can serve as an important strategy for crop improvement, especially in investigating wild accessions that may contain useful genetic resources that have been lost during the domestication process. Here we sequence and assemble a draft genome of wild soybean and construct a recombinant inbred population for genotyping-by-sequencing and phenotypic analyses to identify multiple QTLs relevant to traits of interest in agriculture. We use a combination of de novo sequencing data from this work and our previous germplasm re-sequencing data to identify a novel ion transporter gene, GmCHX1, and relate its sequence alterations to salt tolerance. Rapid gain-of-function tests show the protective effects of GmCHX1 towards salt stress. This combination of whole-genome de novo sequencing, high-density-marker QTL mapping by re-sequencing and functional analyses can serve as an effective strategy to unveil novel genomic information in wild soybean to facilitate crop improvement.

  18. Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing.

    PubMed

    Ranjan, Ravi; Rani, Asha; Metwally, Ahmed; McGee, Halvor S; Perkins, David L

    2016-01-22

    The human microbiome has emerged as a major player in regulating human health and disease. Translational studies of the microbiome have the potential to indicate clinical applications such as fecal transplants and probiotics. However, one major issue is accurate identification of microbes constituting the microbiota. Studies of the microbiome have frequently utilized sequencing of the conserved 16S ribosomal RNA (rRNA) gene. We present a comparative study of an alternative approach using whole genome shotgun sequencing (WGS). In the present study, we analyzed the human fecal microbiome compiling a total of 194.1 × 10^6 reads from a single sample using multiple sequencing methods and platforms. Specifically, after establishing the reproducibility of our methods with extensive multiplexing, we compared: 1) The 16S rRNA amplicon versus the WGS method, 2) the Illumina HiSeq versus MiSeq platforms, 3) the analysis of reads versus de novo assembled contigs, and 4) the effect of shorter versus longer reads. Our study demonstrates that whole genome shotgun sequencing has multiple advantages compared with the 16S amplicon method including enhanced detection of bacterial species, increased detection of diversity and increased prediction of genes. In addition, increased length, either due to longer reads or the assembly of contigs, improved the accuracy of species detection.

  19. Multiplex Degenerate Primer Design for Targeted Whole Genome Amplification of Many Viral Genomes

    PubMed Central

    Gardner, Shea N.; Jaing, Crystal J.; Elsheikh, Maher M.; Peña, José; Hysom, David A.; Borucki, Monica K.

    2014-01-01

    Background. Targeted enrichment improves coverage of highly mutable viruses at low concentration in complex samples. Degenerate primers that anneal to conserved regions can facilitate amplification of divergent, low concentration variants, even when the strain present is unknown. Results. A tool for designing multiplex sets of degenerate sequencing primers to tile overlapping amplicons across multiple whole genomes is described. The new script, run_tiled_primers, is part of the PriMux software. Primers were designed for each segment of South American hemorrhagic fever viruses, tick-borne encephalitis, Henipaviruses, Arenaviruses, Filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus, and Japanese encephalitis virus. Each group is highly diverse with as little as 5% genome consensus. Primer sets were computationally checked for nontarget cross reactions against the NCBI nucleotide sequence database. Primers for murine hepatitis virus were demonstrated in the lab to specifically amplify selected genes from a laboratory cultured strain that had undergone extensive passage in vitro and in vivo. Conclusions. This software should help researchers design multiplex sets of primers for targeted whole genome enrichment prior to sequencing to obtain better coverage of low titer, divergent viruses. Applications include viral discovery from a complex background and improved sensitivity and coverage of rapidly evolving strains or variants in a gene family. PMID:25157264
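
    A degenerate primer is written with IUPAC ambiguity codes, so that a single primer string stands for a pool of concrete oligonucleotides. The sketch below illustrates that idea only: it expands a degenerate primer into its concrete sequences and tests whether it covers a given site. It is not the run_tiled_primers script or any part of the PriMux interface, and the example primer is invented.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of concrete sequences that a degenerate primer represents."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[code] for code in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


def covers(primer: str, site: str) -> bool:
    """True if the primer matches the site (same strand, no mismatches)."""
    if len(primer) != len(site):
        return False
    return all(b in IUPAC[c] for c, b in zip(primer.upper(), site.upper()))


primer = "ACRTYG"  # hypothetical primer with two degenerate positions
print(degeneracy(primer))  # prints 4
print(expand(primer))      # prints ['ACATCG', 'ACATTG', 'ACGTCG', 'ACGTTG']
```

    Degeneracy grows multiplicatively with each ambiguous position, which is one reason primer design tools generally have to balance coverage of divergent variants against the size of the resulting oligonucleotide pool.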

  20. Retention and loss of amino acid biosynthetic pathways based on analysis of whole-genome sequences.

    PubMed

    Payne, Samuel H; Loomis, William F

    2006-02-01

    Plants and fungi can synthesize each of the 20 amino acids by using biosynthetic pathways inherited from their bacterial ancestors. However, the ability to synthesize nine amino acids (Phe, Trp, Ile, Leu, Val, Lys, His, Thr, and Met) was lost in a wide variety of eukaryotes that evolved the ability to feed on other organisms. Since the biosynthetic pathways and their respective enzymes are well characterized, orthologs can be recognized in whole genomes to understand when in evolution pathways were lost. The pattern of pathway loss and retention was analyzed in the complete genomes of three early-diverging protist parasites, the amoeba Dictyostelium, and six animals. The nine pathways were lost independently in animals, Dictyostelium, Leishmania, Plasmodium, and Cryptosporidium. Seven additional pathways appear to have been lost in one or another parasite, demonstrating that they are dispensable in a nutrition-rich environment. Our predictions of pathways retained and pathways lost based on computational analyses of whole genomes are validated by minimal-medium studies with mammals, fish, worms, and Dictyostelium. The apparent selective advantages of retaining biosynthetic capabilities for amino acids available in the diet are considered.

  1. Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing.

    PubMed

    Ranjan, Ravi; Rani, Asha; Metwally, Ahmed; McGee, Halvor S; Perkins, David L

    2016-01-22

    The human microbiome has emerged as a major player in regulating human health and disease. Translational studies of the microbiome have the potential to indicate clinical applications such as fecal transplants and probiotics. However, one major issue is accurate identification of microbes constituting the microbiota. Studies of the microbiome have frequently utilized sequencing of the conserved 16S ribosomal RNA (rRNA) gene. We present a comparative study of an alternative approach using whole genome shotgun sequencing (WGS). In the present study, we analyzed the human fecal microbiome compiling a total of 194.1 × 10^6 reads from a single sample using multiple sequencing methods and platforms. Specifically, after establishing the reproducibility of our methods with extensive multiplexing, we compared: 1) The 16S rRNA amplicon versus the WGS method, 2) the Illumina HiSeq versus MiSeq platforms, 3) the analysis of reads versus de novo assembled contigs, and 4) the effect of shorter versus longer reads. Our study demonstrates that whole genome shotgun sequencing has multiple advantages compared with the 16S amplicon method including enhanced detection of bacterial species, increased detection of diversity and increased prediction of genes. In addition, increased length, either due to longer reads or the assembly of contigs, improved the accuracy of species detection. PMID:26718401

  2. Parallel single cancer cell whole genome amplification using button-valve assisted mixing in nanoliter chambers.

    PubMed

    Yang, Yoonsun; Swennenhuis, Joost F; Rho, Hoon Suk; Le Gac, Séverine; Terstappen, Leon W M M

    2014-01-01

    The heterogeneity of tumor cells and their alteration during the course of the disease urges the need for real time characterization of individual tumor cells to improve the assessment of treatment options. New generations of therapies are frequently associated with specific genetic alterations driving the need to determine the genetic makeup of tumor cells. Here, we present a microfluidic device for parallel single cell whole genome amplification (pscWGA) to obtain enough copies of a single cell genome to probe for the presence of treatment targets and the frequency of its occurrence among the tumor cells. Individual cells were first captured and loaded into eight parallel amplification units. Next, cells were lysed on a chip and their DNA amplified through successive introduction of dedicated reagents while mixing actively with the help of integrated button-valves. The reaction chamber volume for scWGA was 23.85 nl, and starting from 6-7 pg DNA contained in a single cell, around 8 ng of DNA was obtained after WGA, representing over 1000-fold amplification. The amplified products from individual breast cancer cells were collected from the device to either directly investigate the amplification of specific genes by qPCR or for re-amplification of the DNA to obtain sufficient material for whole genome sequencing. Our pscWGA device provides sufficient DNA from individual cells for their genetic characterization, and will undoubtedly allow for automated sample preparation for single cancer cell genomic characterization. PMID:25233459

  3. CVTree update: a newly designed phylogenetic study platform using composition vectors and whole genomes.

    PubMed

    Xu, Zhao; Hao, Bailin

    2009-07-01

    The CVTree web server (http://tlife.fudan.edu.cn/cvtree) presented here is a new implementation of the whole genome-based, alignment-free composition vector (CV) method for phylogenetic analysis. It is more efficient and user-friendly than the previously published version in the 2004 web server issue of Nucleic Acids Research. The development of whole genome-based alignment-free CV method has provided an independent verification to the traditional phylogenetic analysis based on a single gene or a few genes. This new implementation attempts to meet the challenge of ever increasing amount of genome data and includes in its database more than 850 prokaryotic genomes which will be updated monthly from NCBI, and more than 80 fungal genomes collected manually from several sequencing centers. This new CVTree web server provides a faster and stable research platform. Users can upload their own sequences to find their phylogenetic position among genomes selected from the server's inbuilt database. All sequence data used in a session may be downloaded as a compressed file. In addition to standard phylogenetic trees, users can also choose to output trees whose monophyletic branches are collapsed to various taxonomic levels. This feature is particularly useful for comparing phylogeny with taxonomy when dealing with thousands of genomes.
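
    The general idea behind an alignment-free, composition-based comparison can be sketched in a few lines: each sequence is reduced to a vector of k-mer frequencies, and two sequences are compared through the angle between their vectors. The code below is a deliberately simplified illustration and not the CVTree implementation. In particular, it uses raw k-mer frequencies with no background correction, and the (1 - cosine) / 2 scaling, the default k = 3 and the toy sequences are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of every overlapping k-mer in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cv_distance(seq_a: str, seq_b: str, k: int = 3) -> float:
    """Distance in [0, 0.5] derived from the cosine of two k-mer frequency vectors."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    # Only k-mers present in both sequences contribute to the dot product.
    dot = sum(fa[kmer] * fb[kmer] for kmer in fa.keys() & fb.keys())
    norm_a = sqrt(sum(v * v for v in fa.values()))
    norm_b = sqrt(sum(v * v for v in fb.values()))
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Toy sequences, invented for illustration only.
a = "ATGGCGTACGTTAGC"
b = "ATGGCGTACGTTAGG"  # differs from a at the last base
c = "TTTTTTTTTTTTTTT"  # shares no 3-mer with a

print(cv_distance(a, b) < cv_distance(a, c))  # prints True
```

    A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method such as neighbor joining.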

  4. Independent Evolution of Winner Traits without Whole Genome Duplication in Dekkera Yeasts

    PubMed Central

    Dai, Shao-Xing; Li, Wen-Xing; Zheng, Jun-Juan; Li, Gong-Hua; Huang, Jing-Fei

    2016-01-01

    Dekkera yeasts have often been considered as alternative sources of ethanol production that could compete with S. cerevisiae. The two lineages of yeasts independently evolved traits that include high glucose and ethanol tolerance, aerobic fermentation, and a rapid ethanol fermentation rate. The Saccharomyces yeasts attained these traits mainly through whole genome duplication approximately 100 million years ago (Mya). However, the Dekkera yeasts, which were separated from S. cerevisiae approximately 200 Mya, did not undergo whole genome duplication (WGD) but still occupy a niche similar to S. cerevisiae. Upon analysis of two Dekkera yeasts and five closely related non-WGD yeasts, we found that a massive loss of cis-regulatory elements occurred in an ancestor of the Dekkera yeasts, which led to improved mitochondrial functions similar to the S. cerevisiae yeasts. The evolutionary analysis indicated that genes involved in the transcription and translation process exhibited faster evolution in the Dekkera yeasts. We detected 90 positively selected genes, suggesting that the Dekkera yeasts evolved an efficient translation system to facilitate adaptive evolution. Moreover, we identified 12 vacuolar H+-ATPase (V-ATPase) function genes that were under positive selection, which assists in developing tolerance to high alcohol and high sugar stress. We also revealed that the enzyme PGK1 is responsible for the increased rate of glycolysis in the Dekkera yeasts. These results provide important insights to understand the independent adaptive evolution of the Dekkera yeasts and provide tools for genetic modification promoting industrial usage. PMID:27152421

  5. Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach.

    PubMed

    Liang, Muxuan; Li, Zhizhong; Chen, Ting; Zeng, Jianyang

    2015-01-01

    Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for

  6. Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach.

    PubMed

    Liang, Muxuan; Li, Zhizhong; Chen, Ting; Zeng, Jianyang

    2015-01-01

    Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for

  7. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions

    PubMed Central

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next-generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spread across the entire human genome, making whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  8. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next-generation sequencing technology has enabled the fast and low-cost detection of all genetic variants spread across the entire human genome, making whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  9. SBMDb: first whole genome putative microsatellite DNA marker database of sugarbeet for bioenergy and industrial applications.

    PubMed

    Iquebal, Mir Asif; Jaiswal, Sarika; Angadi, U B; Sablok, Gaurav; Arora, Vasu; Kumar, Sunil; Rai, Anil; Kumar, Dinesh

    2015-01-01

    DNA markers play an important role as valuable tools to increase crop productivity by finding plausible answers to genetic variations and linking the Quantitative Trait Loci (QTL) of beneficial traits. Prior approaches to the development of Short Tandem Repeat (STR) markers were time-consuming and inefficient. Recent methods that develop STR markers from whole genomic or transcriptomic data have gained wide importance, with immense potential in developing breeding and cultivator improvement approaches. Availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world's first sugarbeet whole genome marker discovery, comprising 145 K markers along with 5 K functional domain markers, unified in a common platform using MySQL, Apache and PHP in SBMDb. Embedded markers and corresponding location information can be selected for a desired chromosome and location/interval, and primers can be generated using the Primer3 core integrated at the backend. Our analyses revealed an abundance of 'mono' repeats (76.82%) over 'di' repeats (13.68%). The highest density (671.05 markers/Mb) was found in chromosome 1 and the lowest density (341.27 markers/Mb) in chromosome 6. The current investigation of sugarbeet genome marker density has direct implications for increasing mapping marker density. This will enable present linkage map having marker distance of ∼2 cM, i.e. from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS tests of variety identification and MAS/GAS in variety improvement programs. The present database provides a wide source of potential markers for developing and implementing new approaches for molecular breeding required to accelerate industrial use of this crop, especially for sugar, health care products, medicines and color dye. Identified markers will also help in improvement of bioenergy trait of

  10. High-Accuracy HLA Type Inference from Whole-Genome Sequencing Data Using Population Reference Graphs

    PubMed Central

    Dilthey, Alexander T.; Gourraud, Pierre-Antoine; McVean, Gil

    2016-01-01

    Genetic variation at the Human Leucocyte Antigen (HLA) genes is associated with many autoimmune and infectious disease phenotypes, is an important element of the immunological distinction between self and non-self, and shapes immune epitope repertoires. Determining the allelic state of the HLA genes (HLA typing) as a by-product of standard whole-genome sequencing data would therefore be highly desirable and enable the immunogenetic characterization of samples in currently ongoing population sequencing projects. Extensive hyperpolymorphism and sequence similarity between the HLA genes, however, pose problems for accurate read mapping and make HLA type inference from whole-genome sequencing data a challenging problem. We describe how to address these challenges in a Population Reference Graph (PRG) framework. First, we construct a PRG for 46 (mostly HLA) genes and pseudogenes, their genomic context and their characterized sequence variants, integrating a database of over 10,000 known allele sequences. Second, we present a sequence-to-PRG paired-end read mapping algorithm that enables accurate read mapping for the HLA genes. Third, we infer the most likely pair of underlying alleles at G group resolution from the IMGT/HLA database at each locus, employing a simple likelihood framework. We show that HLA*PRG, our algorithm, outperforms existing methods by a wide margin. We evaluate HLA*PRG on six classical class I and class II HLA genes (HLA-A, -B, -C, -DQA1, -DQB1, -DRB1) and on a set of 14 samples (3 samples with 2 x 100bp, 11 samples with 2 x 250bp Illumina HiSeq data). Of 158 alleles tested, we correctly infer 157 alleles (99.4%). We also identify and re-type two erroneous alleles in the original validation data. We conclude that HLA*PRG for the first time achieves accuracies comparable to gold-standard reference methods from standard whole-genome sequencing data, though high computational demands (currently ~30–250 CPU hours per sample) remain a significant

  11. SBMDb: first whole genome putative microsatellite DNA marker database of sugarbeet for bioenergy and industrial applications.

    PubMed

    Iquebal, Mir Asif; Jaiswal, Sarika; Angadi, U B; Sablok, Gaurav; Arora, Vasu; Kumar, Sunil; Rai, Anil; Kumar, Dinesh

    2015-01-01

    DNA markers play an important role as valuable tools to increase crop productivity by finding plausible answers to genetic variations and linking the Quantitative Trait Loci (QTL) of beneficial traits. Prior approaches to the development of Short Tandem Repeat (STR) markers were time-consuming and inefficient. Recent methods that develop STR markers from whole genomic or transcriptomic data have gained wide importance, with immense potential in developing breeding and cultivator improvement approaches. Availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world's first sugarbeet whole genome marker discovery, comprising 145 K markers along with 5 K functional domain markers, unified in a common platform using MySQL, Apache and PHP in SBMDb. Embedded markers and corresponding location information can be selected for a desired chromosome and location/interval, and primers can be generated using the Primer3 core integrated at the backend. Our analyses revealed an abundance of 'mono' repeats (76.82%) over 'di' repeats (13.68%). The highest density (671.05 markers/Mb) was found in chromosome 1 and the lowest density (341.27 markers/Mb) in chromosome 6. The current investigation of sugarbeet genome marker density has direct implications for increasing mapping marker density. This will enable present linkage map having marker distance of ∼2 cM, i.e. from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS tests of variety identification and MAS/GAS in variety improvement programs. The present database provides a wide source of potential markers for developing and implementing new approaches for molecular breeding required to accelerate industrial use of this crop, especially for sugar, health care products, medicines and color dye. Identified markers will also help in improvement of bioenergy trait of

  12. SBMDb: first whole genome putative microsatellite DNA marker database of sugarbeet for bioenergy and industrial applications

    PubMed Central

    Iquebal, Mir Asif; Jaiswal, Sarika; Angadi, U.B.; Sablok, Gaurav; Arora, Vasu; Kumar, Sunil; Rai, Anil; Kumar, Dinesh

    2015-01-01

    DNA markers play an important role as valuable tools to increase crop productivity by finding plausible answers to genetic variations and linking the Quantitative Trait Loci (QTL) of beneficial traits. Prior approaches to the development of Short Tandem Repeat (STR) markers were time-consuming and inefficient. Recent methods that develop STR markers from whole genomic or transcriptomic data have gained wide importance, with immense potential in developing breeding and cultivator improvement approaches. Availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world’s first sugarbeet whole genome marker discovery, comprising 145 K markers along with 5 K functional domain markers, unified in a common platform using MySQL, Apache and PHP in SBMDb. Embedded markers and corresponding location information can be selected for a desired chromosome and location/interval, and primers can be generated using the Primer3 core integrated at the backend. Our analyses revealed an abundance of ‘mono’ repeats (76.82%) over ‘di’ repeats (13.68%). The highest density (671.05 markers/Mb) was found in chromosome 1 and the lowest density (341.27 markers/Mb) in chromosome 6. The current investigation of sugarbeet genome marker density has direct implications for increasing mapping marker density. This will enable present linkage map having marker distance of ∼2 cM, i.e. from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS tests of variety identification and MAS/GAS in variety improvement programs. The present database provides a wide source of potential markers for developing and implementing new approaches for molecular breeding required to accelerate industrial use of this crop, especially for sugar, health care products, medicines and color dye. Identified markers will also help in improvement of bioenergy trait

  13. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    PubMed Central

    2013-01-01

    Background: Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Results: Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Conclusions: Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of

  14. Clostridium botulinum Group II Isolate Phylogenomic Profiling Using Whole-Genome Sequence Data

    PubMed Central

    Weedmark, K. A.; Mabon, P.; Hayden, K. L.; Lambert, D.; Van Domselaar, G.; Austin, J. W.

    2015-01-01

    Clostridium botulinum group II isolates (n = 163) from different geographic regions, outbreaks, and neurotoxin types and subtypes were characterized in silico using whole-genome sequence data. Two clusters representing a variety of botulinum neurotoxin (BoNT) types and subtypes were identified by multilocus sequence typing (MLST) and core single nucleotide polymorphism (SNP) analysis. While one cluster included BoNT/B4/F6/E9 and nontoxigenic members, the other comprised a wide variety of different BoNT/E subtype isolates and a nontoxigenic strain. In silico MLST and core SNP methods were consistent in terms of clade-level isolate classification; however, core SNP analysis showed higher resolution capability. Furthermore, core SNP analysis correctly distinguished isolates by outbreak and location. This study illustrated the utility of next-generation sequence-based typing approaches for isolate characterization and source attribution and identified discrete SNP loci and MLST alleles for isolate comparison. PMID:26116673

  15. Whole genome sequencing reveals extensive community-level transmission of group A Streptococcus in remote communities.

    PubMed

    Bowen, A C; Harris, T; Holt, D C; Giffard, P M; Carapetis, J R; Campbell, P T; McVernon, J; Tong, S Y C

    2016-07-01

    Impetigo is common in remote Indigenous children of northern Australia, with the primary driver in this context being Streptococcus pyogenes [or group A Streptococcus (GAS)]. To reduce the high burden of impetigo, the transmission dynamics of GAS must be more clearly elucidated. We performed whole genome sequencing on 31 GAS isolates collected in a single community from children in 11 households with ⩾2 GAS-infected children. We aimed to determine whether transmission was occurring principally within households or across the community. The 31 isolates were represented by nine multilocus sequence types and isolates within each sequence type differed from one another by only 0-3 single nucleotide polymorphisms. There was evidence of extensive transmission both within households and across the community. Our findings suggest that strategies to reduce the burden of impetigo in this setting will need to extend beyond individual households, and incorporate multi-faceted, community-wide approaches. PMID:26833141

  16. Developmental timing of mutations revealed by whole-genome sequencing of twins with acute lymphoblastic leukemia.

    PubMed

    Ma, Yussanne; Dobbins, Sara E; Sherborne, Amy L; Chubb, Daniel; Galbiati, Marta; Cazzaniga, Giovanni; Micalizzi, Concetta; Tearle, Rick; Lloyd, Amy L; Hain, Richard; Greaves, Mel; Houlston, Richard S

    2013-04-30

    Acute lymphoblastic leukemia (ALL) is the major pediatric cancer. At diagnosis, the developmental timing of mutations contributing critically to clonal diversification and selection can be buried in the leukemia's covert natural history. Concordance of ALL in monozygotic, monochorionic twins is a consequence of intraplacental spread of an initiated preleukemic clone. Studying monozygotic twins with ALL provides a unique means of uncovering the timeline of mutations contributing to clonal evolution, pre- and postnatally. We sequenced the whole genomes of leukemic cells from two twin pairs with ALL to comprehensively characterize acquired somatic mutations in ALL, elucidating the developmental timing of all genetic lesions. Shared, prenatal, coding-region single-nucleotide variants were limited to the putative initiating lesions. All other nonsynonymous single-nucleotide variants were distinct between tumors and, therefore, secondary and postnatal. These changes occurred in a background of noncoding mutational changes that were almost entirely discordant in twin pairs and likely passenger mutations acquired during leukemic cell proliferation. PMID:23569245

  17. Whole-genome expression analysis in the third instar larval midgut of Drosophila melanogaster.

    PubMed

    Harrop, Thomas W R; Pearce, Stephen L; Daborn, Phillip J; Batterham, Philip

    2014-09-05

    Survival of insects on a substrate containing toxic substances such as plant secondary metabolites or insecticides is dependent on the metabolism or excretion of those xenobiotics. The primary sites of xenobiotic metabolism are the midgut, Malpighian tubules, and fat body. In general, gene expression in these organs is reported for the entire tissue by online databases, but several studies have shown that gene expression within the midgut is compartmentalized. Here, RNA sequencing is used to investigate whole-genome expression in subsections of third instar larval midguts of Drosophila melanogaster. The data support functional diversification in subsections of the midgut. Analysis of the expression of gene families that are implicated in the metabolism of xenobiotics suggests that metabolism may not be uniform along the midgut. These data provide a starting point for investigating gene expression and xenobiotic metabolism and other functions of the larval midgut.

  18. Whole genome sequencing as a tool for phylogenetic analysis of clinical strains of Mitis group streptococci.

    PubMed

    Rasmussen, L H; Dargis, R; Højholt, K; Christensen, J J; Skovgaard, O; Justesen, U S; Rosenvinge, F S; Moser, C; Lukjancenko, O; Rasmussen, S; Nielsen, X C

    2016-10-01

    Identification of Mitis group streptococci (MGS) to the species level is challenging for routine microbiology laboratories. Correct identification is crucial for the diagnosis of infective endocarditis, identification of treatment failure, and/or infection relapse. Eighty MGS from Danish patients with infective endocarditis were whole genome sequenced. We compared the phylogenetic analyses based on single genes (recA, sodA, gdh), multigene (MLSA), SNPs, and core-genome sequences. The six phylogenetic analyses generally showed a similar pattern of six monophyletic clusters, though a few differences were observed in single gene analyses. Species identification based on single gene analysis showed its limitations when more strains were included. In contrast, analyses incorporating more sequence data, like MLSA, SNPs and core-genome analyses, provided more distinct clustering. The core-genome tree showed the most distinct clustering. PMID:27325438

  19. A strategic stakeholder approach for addressing further analysis requests in whole genome sequencing research.

    PubMed

    Thornock, Bradley Steven O

    2016-01-01

    Whole genome sequencing (WGS) can be a cost-effective and efficient means of diagnosis for some children, but it also raises a number of ethical concerns. One such concern is how researchers derive and communicate results from WGS, including future requests for further analysis of stored sequences. The purpose of this paper is to think about what is at stake, and for whom, in any solution that is developed to deal with such requests. To accomplish this task, this paper will utilize stakeholder theory, a common method used in business ethics. Several scenarios that connect stakeholder concerns and WGS will also be posited and analyzed. This paper concludes by developing criteria composed of a series of questions that researchers can answer in order to more effectively address requests for further analysis of stored sequences. PMID:27091475

  20. Small homologous blocks in phytophthora genomes do not point to an ancient whole-genome duplication.

    PubMed

    van Hooff, Jolien J E; Snel, Berend; Seidl, Michael F

    2014-05-01

    Genomes of the plant-pathogenic genus Phytophthora are characterized by small duplicated blocks consisting of two consecutive genes (2HOM blocks) and by an elevated abundance of similarly aged gene duplicates. Both properties, in particular the presence of 2HOM blocks, have been attributed to a whole-genome duplication (WGD) at the last common ancestor of Phytophthora. However, large intraspecies synteny (compelling evidence for a WGD) has not been detected. Here, we revisited the WGD hypothesis by deducing the age of 2HOM blocks. Two independent timing methods reveal that the majority of 2HOM blocks arose after divergence of the Phytophthora lineages. In addition, a large proportion of the 2HOM block copies colocalize on the same scaffold. Therefore, the presence of 2HOM blocks does not support a WGD at the last common ancestor of Phytophthora. Thus, genome evolution of Phytophthora is likely driven by alternative mechanisms, such as bursts of transposon activity.

  1. Plant Genetic Archaeology: Whole-Genome Sequencing Reveals the Pedigree of a Classical Trisomic Line

    PubMed Central

    Salomé, Patrice A.; Weigel, Detlef

    2014-01-01

    The circadian oscillator is astonishingly robust to changes in the environment but also to genomic changes that alter the copy number of its components through genome duplication, gene duplication, and homeologous gene loss. While studying the potential effect of aneuploidy on the Arabidopsis thaliana circadian clock, we discovered that a line thought to be trisomic for chromosome 3 also bears the gi-1 mutation, resulting in a short period and late flowering. With the help of whole-genome sequencing, we uncovered the unexpected complexity of this trisomic stock’s history, as its genome shows evidence of past outcrossing with another A. thaliana accession. Our study indicates that although historical aneuploidy lines exist and are available, it might be safer to generate new individuals and confirm their genomes and karyotypes by sequencing. PMID:25524155

  2. Whole-genome analyses resolve early branches in the tree of life of modern birds.

    PubMed

    Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y W; Faircloth, Brant C; Nabholz, Benoit; Howard, Jason T; Suh, Alexander; Weber, Claudia C; da Fonseca, Rute R; Li, Jianwen; Zhang, Fang; Li, Hui; Zhou, Long; Narula, Nitish; Liu, Liang; Ganapathy, Ganesh; Boussau, Bastien; Bayzid, Md Shamsuzzoha; Zavidovych, Volodymyr; Subramanian, Sankar; Gabaldón, Toni; Capella-Gutiérrez, Salvador; Huerta-Cepas, Jaime; Rekepalli, Bhanu; Munch, Kasper; Schierup, Mikkel; Lindow, Bent; Warren, Wesley C; Ray, David; Green, Richard E; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Li, Shengbin; Li, Ning; Huang, Yinhua; Derryberry, Elizabeth P; Bertelsen, Mads Frost; Sheldon, Frederick H; Brumfield, Robb T; Mello, Claudio V; Lovell, Peter V; Wirthlin, Morgan; Schneider, Maria Paula Cruz; Prosdocimi, Francisco; Samaniego, José Alfredo; Vargas Velazquez, Amhed Missael; Alfaro-Núñez, Alonzo; Campos, Paula F; Petersen, Bent; Sicheritz-Ponten, Thomas; Pas, An; Bailey, Tom; Scofield, Paul; Bunce, Michael; Lambert, David M; Zhou, Qi; Perelman, Polina; Driskell, Amy C; Shapiro, Beth; Xiong, Zijun; Zeng, Yongli; Liu, Shiping; Li, Zhenyu; Liu, Binghang; Wu, Kui; Xiao, Jin; Yinqi, Xiong; Zheng, Qiuemei; Zhang, Yong; Yang, Huanming; Wang, Jian; Smeds, Linnea; Rheindt, Frank E; Braun, Michael; Fjeldsa, Jon; Orlando, Ludovic; Barker, F Keith; Jønsson, Knud Andreas; Johnson, Warren; Koepfli, Klaus-Peter; O'Brien, Stephen; Haussler, David; Ryder, Oliver A; Rahbek, Carsten; Willerslev, Eske; Graves, Gary R; Glenn, Travis C; McCormack, John; Burt, Dave; Ellegren, Hans; Alström, Per; Edwards, Scott V; Stamatakis, Alexandros; Mindell, David P; Cracraft, Joel; Braun, Edward L; Warnow, Tandy; Jun, Wang; Gilbert, M Thomas P; Zhang, Guojie

    2014-12-12

    To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago.

  3. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    PubMed

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L). PMID:27617098

  4. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing.

    PubMed

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-10-19

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors.

  5. Genomic Epidemiology: Whole-Genome-Sequencing-Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens.

    PubMed

    Deng, Xiangyu; den Bakker, Henk C; Hendriksen, Rene S

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon. Technological, operational, and policy challenges are still present and being addressed by an international and multidisciplinary community of researchers, public health practitioners, and other stakeholders.

  6. Whole-Genome Expression Analysis in the Third Instar Larval Midgut of Drosophila melanogaster

    PubMed Central

    Harrop, Thomas W. R.; Pearce, Stephen L.; Daborn, Phillip J.; Batterham, Philip

    2014-01-01

    Survival of insects on a substrate containing toxic substances such as plant secondary metabolites or insecticides is dependent on the metabolism or excretion of those xenobiotics. The primary sites of xenobiotic metabolism are the midgut, Malpighian tubules, and fat body. In general, gene expression in these organs is reported for the entire tissue by online databases, but several studies have shown that gene expression within the midgut is compartmentalized. Here, RNA sequencing is used to investigate whole-genome expression in subsections of third instar larval midguts of Drosophila melanogaster. The data support functional diversification in subsections of the midgut. Analysis of the expression of gene families that are implicated in the metabolism of xenobiotics suggests that metabolism may not be uniform along the midgut. These data provide a starting point for investigating gene expression and xenobiotic metabolism and other functions of the larval midgut. PMID:25193493

  7. BSSV: Bayesian based somatic structural variation identification with whole genome DNA-seq data.

    PubMed

    Chen, Xi; Shi, Xu; Shajahan, Ayesha N; Hilakivi-Clarke, Leena; Clarke, Robert; Xuan, Jianhua

    2014-01-01

    High-coverage whole genome DNA sequencing enables identification of somatic structural variation (SSV) more evident in paired tumor and normal samples. Recent studies show that simultaneous analysis of paired samples provides a better resolution of SSV detection than subtracting shared SVs. However, available tools can neither identify all types of SSVs nor provide any rank information regarding their somatic features. In this paper, we have developed a Bayesian framework, called BSSV, that integrates read alignment information from both tumor and normal samples to calculate the significance of each SSV. When tested on simulated data, the precision of BSSV is comparable to that of available tools and the false negative rate is significantly lowered. We have also applied this approach to The Cancer Genome Atlas breast cancer data for SSV detection. Many known breast cancer-specific mutated genes like RAD51, BRIP1, ER, PGR and PTPRD have been successfully identified.

  8. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation.

    PubMed

    Zhao, Shancen; Zheng, Pingping; Dong, Shanshan; Zhan, Xiangjiang; Wu, Qi; Guo, Xiaosen; Hu, Yibo; He, Weiming; Zhang, Shanning; Fan, Wei; Zhu, Lifeng; Li, Dong; Zhang, Xuemei; Chen, Quan; Zhang, Hemin; Zhang, Zhihe; Jin, Xuelin; Zhang, Jinguo; Yang, Huanming; Wang, Jian; Wang, Jun; Wei, Fuwen

    2013-01-01

    The panda lineage dates back to the late Miocene and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography, their contribution to panda population dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two population expansions, two bottlenecks and two divergences. Evidence indicated that, whereas global changes in climate were the primary drivers of population fluctuation for millions of years, human activities likely underlie recent population divergence and serious decline. We identified three distinct panda populations that show genetic adaptation to their environments. However, in all three populations, anthropogenic activities have negatively affected pandas for 3,000 years. PMID:23242367

  9. Whole-Genome Sequencing to Determine Origin of Multinational Outbreak of Sarocladium kiliense Bloodstream Infections

    PubMed Central

    Roe, Chandler C.; Smith, Rachel M.; Vallabhaneni, Snigdha; Duarte, Carolina; Escandón, Patricia; Castañeda, Elizabeth; Gómez, Beatriz L.; de Bedout, Catalina; López, Luisa F.; Salas, Valentina; Hederra, Luz Maria; Fernández, Jorge; Pidal, Paola; Hormazabel, Juan Carlos; Otaíza-O’Ryan, Fernando; Vannberg, Fredrik O.; Gillece, John; Lemmer, Darrin; Driebe, Elizabeth M.; Engelthaler, David M.; Litvintseva, Anastasia P.

    2016-01-01

    We used whole-genome sequence typing (WGST) to investigate an outbreak of Sarocladium kiliense bloodstream infections (BSI) associated with receipt of contaminated antinausea medication among oncology patients in Colombia and Chile during 2013–2014. Twenty-five outbreak isolates (18 from patients and 7 from medication vials) and 11 control isolates unrelated to this outbreak were subjected to WGST to elucidate a source of infection. All outbreak isolates were nearly indistinguishable (<5 single-nucleotide polymorphisms), and >21,000 single-nucleotide polymorphisms were identified from unrelated control isolates, suggesting a point source for this outbreak. S. kiliense has been previously implicated in healthcare-related infections; however, the lack of available typing methods has precluded the ability to substantiate point sources. WGST for outbreak investigation caused by eukaryotic pathogens without reference genomes or existing genotyping methods enables accurate source identification to guide implementation of appropriate control and prevention measures. PMID:26891230
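
    The comparison described above (fewer than 5 SNPs among outbreak isolates versus more than 21,000 against unrelated controls) rests on pairwise SNP counts between aligned genomes. The following is a minimal sketch of that counting step, not the authors' WGST pipeline. The function names and the toy sequences are hypothetical, and it assumes the isolates have already been aligned to equal length.

```python
from itertools import combinations

VALID_BASES = set("ACGT")

def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions with an ambiguous base or a gap (anything other than
    A, C, G, T) in either sequence are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in VALID_BASES and y in VALID_BASES and x != y
    )

def pairwise_snps(aligned):
    """Return {(name_1, name_2): SNP count} for every pair of isolates."""
    return {
        (p, q): snp_distance(aligned[p], aligned[q])
        for p, q in combinations(sorted(aligned), 2)
    }

# Toy alignment (hypothetical data, far shorter than a real genome).
toy = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "ctrl": "TGCAACNT",
}
distances = pairwise_snps(toy)
```

    In this toy alignment the two "iso" sequences differ at a single site while the control differs at several, which mirrors on a small scale the pattern a point-source outbreak is expected to show.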

  10. Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database.

    PubMed

    Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth

    2016-08-01

    The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks.

  11. A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome

    SciTech Connect

    Chapman, Jarrod A.; Mascher, Martin; Buluc, Aydin; Barry, Kerrie; Georganas, Evangelos; Session, Adam; Strnadova, Veronika; Jenkins, Jerry; Sehgal, Sunish; Oliker, Leonid; Schmutz, Jeremy; Yelick, Katherine A.; Scholz, Uwe; Waugh, Robbie; Poland, Jesse A.; Muehlbauer, Gary J.; Stein, Nils; Rokhsar, Daniel S.

    2015-01-31

    Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly are comparable to or even exceed those of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.

  12. A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome

    DOE PAGES Beta

    Chapman, Jarrod A.; Mascher, Martin; Buluc, Aydin; Barry, Kerrie; Georganas, Evangelos; Session, Adam; Strnadova, Veronika; Jenkins, Jerry; Sehgal, Sunish; Oliker, Leonid; et al

    2015-01-31

    Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly are comparable to or even exceed those of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.

  13. Developing insights into the mechanisms of evolution of bacterial pathogens from whole-genome sequences

    PubMed Central

    Bentley, Stephen D

    2014-01-01

    Evolution of bacterial pathogen populations has been detected in a variety of ways including phenotypic tests, such as metabolic activity, reaction to antisera and drug resistance and genotypic tests that measure variation in chromosome structure, repetitive loci and individual gene sequences. While informative, these methods only capture a small subset of the total variation and, therefore, have limited resolution. Advances in sequencing technologies have made it feasible to capture whole-genome sequence variation for each sample under study, providing the potential to detect all changes at all positions in the genome from single nucleotide changes to large-scale insertions and deletions. In this review, we focus on recent work that has applied this powerful new approach and summarize some of the advances that this has brought in our understanding of the details of how bacterial pathogens evolve. PMID:23075447

  14. Molecular etiology of an indolent lymphoproliferative disorder determined by whole-genome sequencing

    PubMed Central

    Parker, Jeremy D.K.; Shen, Yaoqing; Pleasance, Erin; Li, Yvonne; Schein, Jacqueline E.; Zhao, Yongjun; Moore, Richard; Wegrzyn-Woltosz, Joanna; Savage, Kerry J.; Weng, Andrew P.; Gascoyne, Randy D.; Jones, Steven; Marra, Marco; Laskin, Janessa; Karsan, Aly

    2016-01-01

    In an attempt to assess potential treatment options, whole-genome and transcriptome sequencing were performed on a patient with an unclassifiable small lymphoproliferative disorder. Variants from genome sequencing were prioritized using a combination of comparative variant distributions in a spectrum of lymphomas, and meta-analyses of gene expression profiling. In this patient, the molecular variants that we believe to be most relevant to the disease presentation most strongly resemble a diffuse large B-cell lymphoma (DLBCL), whereas the gene expression data are most consistent with a low-grade chronic lymphocytic leukemia (CLL). The variant of greatest interest was a predicted NOTCH2-truncating mutation, which has been recently reported in various lymphomas. PMID:27148583

  15. Draft whole genome sequence of the cyanide-degrading bacterium Pseudomonas pseudoalcaligenes CECT5344.

    PubMed

    Luque-Almagro, Víctor M; Acera, Felipe; Igeño, Ma Isabel; Wibberg, Daniel; Roldán, Ma Dolores; Sáez, Lara P; Hennig, Magdalena; Quesada, Alberto; Huertas, Ma José; Blom, Jochen; Merchán, Faustino; Escribano, Ma Paz; Jaenicke, Sebastian; Estepa, Jessica; Guijo, Ma Isabel; Martínez-Luque, Manuel; Macías, Daniel; Szczepanowski, Rafael; Becerra, Gracia; Ramirez, Silvia; Carmona, Ma Isabel; Gutiérrez, Oscar; Manso, Isabel; Pühler, Alfred; Castillo, Francisco; Moreno-Vivián, Conrado; Schlüter, Andreas; Blasco, Rafael

    2013-01-01

    Pseudomonas pseudoalcaligenes CECT5344 is a Gram-negative bacterium able to tolerate cyanide and to use it as the sole nitrogen source. We report here the first draft of the whole genome sequence of a P. pseudoalcaligenes strain that assimilates cyanide. Three aspects are especially emphasized in this manuscript. First, some generalities of the genome are shown and discussed in the context of other Pseudomonadaceae genomes, including genome size, G + C content, core genome and singletons among other features. Second, the genome is analysed in the context of cyanide metabolism, describing genes probably involved in cyanide assimilation, like those encoding nitrilases, and genes related to cyanide resistance, like the cio genes encoding the cyanide insensitive oxidases. Finally, the presence of genes probably involved in other processes with great biotechnological potential, like production of bioplastics and biodegradation of pollutants, is also discussed. PMID:22998548

  16. Advances in Understanding Bacterial Pathogenesis Gained from Whole-Genome Sequencing and Phylogenetics.

    PubMed

    Klemm, Elizabeth; Dougan, Gordon

    2016-05-11

    The development of next-generation sequencing as a cost-effective technology has facilitated the analysis of bacterial population structure at a whole-genome level and at scale. From these data, phylogenetic trees have been constructed that define population structures at a local, national, and global level, providing a framework for genetic analysis. Although still at an early stage, these approaches have yielded progress in several areas, including pathogen transmission mapping, the genetics of niche colonization and host adaptation, as well as gene-to-phenotype association studies. Antibiotic resistance has proven to be a major challenge in the early 21st century, and phylogenetic analyses have uncovered the dramatic effect that the use of antibiotics has had on shaping bacterial population structures. An update on insights into bacterial evolution from comparative genomics is provided in this review. PMID:27173928

  17. CVTree: a Whole-Genome and Alignment-Free Approach to Microbial Phylogeny

    NASA Astrophysics Data System (ADS)

    Hao, Bailin

    The number of sequenced genomes of Archaea, Bacteria, and Fungi accumulates rapidly. Several thousand genomes of these unicellular organisms will be available in a few years. Due to the extremely large difference in genome size and gene content it is difficult to use the traditional alignment-based method to infer phylogeny from the genomes. An alignment-free and whole-genome-based approach called CVTree has been developed and successfully applied to these organisms. As CVTree has been successfully applied to genomes of viruses, chloroplasts, Bacteria, Archaea and fungi, in this brief review we will mainly touch on some mathematical problems related to the foundation of the new approach, including a few as yet unsolved problems, such as the violation of the triangle inequality by the dissimilarity measure used in the CVTree method.

  18. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine

    PubMed Central

    Tsai, Ellen A.; Shakbatyan, Rimma; Evans, Jason; Rossetti, Peter; Graham, Chet; Sharma, Himanshu; Lin, Chiao-Feng; Lebo, Matthew S.

    2016-01-01

    Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient’s genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS. PMID:26927186

  19. Mapping genomic features to functional traits through microbial whole genome sequences.

    PubMed

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. The increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped into Clusters of Orthologous Groups (COGs) and were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.
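
    The TF-IDF step mentioned above can be pictured by treating each genome as a "document" and each COG as a "term". The following is a minimal sketch under that assumption, using the textbook weighting tf × log(N / df). The abstract does not state which TF-IDF variant the authors used, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter

def tfidf(genome_cogs):
    """Weight each COG in each genome by term frequency x inverse document frequency.

    genome_cogs maps a genome name to a list of COG identifiers,
    one entry per gene assigned to that COG.
    """
    n_genomes = len(genome_cogs)
    counts = {g: Counter(cogs) for g, cogs in genome_cogs.items()}

    # Document frequency: the number of genomes in which each COG occurs.
    doc_freq = Counter()
    for cog_counts in counts.values():
        doc_freq.update(cog_counts.keys())

    weights = {}
    for genome, cog_counts in counts.items():
        total = sum(cog_counts.values())
        weights[genome] = {
            cog: (n / total) * math.log(n_genomes / doc_freq[cog])
            for cog, n in cog_counts.items()
        }
    return weights

# Toy input (hypothetical): COG1 occurs in both genomes, so its weight is zero.
toy = {
    "genome_A": ["COG1", "COG1", "COG2"],
    "genome_B": ["COG1", "COG3"],
}
weights = tfidf(toy)
```

    A COG found in every genome receives zero weight regardless of its copy number, so the weighting favors COGs that are abundant in some genomes and absent from others, which are the ones most likely to separate traits.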

  20. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine.

    PubMed

    Tsai, Ellen A; Shakbatyan, Rimma; Evans, Jason; Rossetti, Peter; Graham, Chet; Sharma, Himanshu; Lin, Chiao-Feng; Lebo, Matthew S

    2016-01-01

    Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient's genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS. PMID:26927186

  1. Identification of cancer-driver genes in focal genomic alterations from whole genome sequencing data

    PubMed Central

    Jang, Ho; Hur, Youngmi; Lee, Hyunju

    2016-01-01

    DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions from passenger regions, which might contain candidate target genes for cancer therapies, is an important issue. Several methods for identifying cancer-driver genes from multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, for NGS data, methods for the SNP array cannot be directly applied because of different characteristics of NGS such as higher resolutions of data without predefined probes and incorrectly mapped reads to reference genomes. In this study, we developed a wavelet-based method for identification of focal genomic alterations for sequencing data (WIFA-Seq). We applied WIFA-Seq to whole genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations, which contain candidate cancer-related genes as well as previously known cancer-driver genes. PMID:27156852

  2. Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding

    PubMed Central

    de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.

    2013-01-01

    Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228

  3. A strategic stakeholder approach for addressing further analysis requests in whole genome sequencing research.

    PubMed

    Thornock, Bradley Steven O

    2016-01-01

    Whole genome sequencing (WGS) can be a cost-effective and efficient means of diagnosis for some children, but it also raises a number of ethical concerns. One such concern is how researchers derive and communicate results from WGS, including future requests for further analysis of stored sequences. The purpose of this paper is to think about what is at stake, and for whom, in any solution that is developed to deal with such requests. To accomplish this task, this paper will utilize stakeholder theory, a common method used in business ethics. Several scenarios that connect stakeholder concerns and WGS will also be posited and analyzed. This paper concludes by developing criteria composed of a series of questions that researchers can answer in order to more effectively address requests for further analysis of stored sequences.

  4. Whole genome sequencing as a tool for phylogenetic analysis of clinical strains of Mitis group streptococci.

    PubMed

    Rasmussen, L H; Dargis, R; Højholt, K; Christensen, J J; Skovgaard, O; Justesen, U S; Rosenvinge, F S; Moser, C; Lukjancenko, O; Rasmussen, S; Nielsen, X C

    2016-10-01

    Identification of Mitis group streptococci (MGS) to the species level is challenging for routine microbiology laboratories. Correct identification is crucial for the diagnosis of infective endocarditis, identification of treatment failure, and/or infection relapse. Eighty MGS from Danish patients with infective endocarditis were whole genome sequenced. We compared the phylogenetic analyses based on single genes (recA, sodA, gdh), multigene (MLSA), SNPs, and core-genome sequences. The six phylogenetic analyses generally showed a similar pattern of six monophyletic clusters, though a few differences were observed in single gene analyses. Species identification based on single gene analysis showed its limitations when more strains were included. In contrast, analyses incorporating more sequence data, like MLSA, SNPs and core-genome analyses, provided more distinct clustering. The core-genome tree showed the most distinct clustering.

  5. Clinical Decision Support for Whole Genome Sequence Information Leveraging a Service-Oriented Architecture: a Prototype

    PubMed Central

    Welch, Brandon M.; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time. PMID:25954430

  6. Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database.

    PubMed

    Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth

    2016-08-01

    The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. PMID:27008877

  7. A Gene-By-Gene Approach to Bacterial Population Genomics: Whole Genome MLST of Campylobacter.

    PubMed

    Sheppard, Samuel K; Jolley, Keith A; Maiden, Martin C J

    2012-01-01

    Campylobacteriosis remains a major human public health problem worldwide. Genetic analyses of Campylobacter isolates, and particularly molecular epidemiology, have been central to the study of this disease, particularly the characterization of Campylobacter genotypes isolated from human infection, farm animals, and retail food. These studies have demonstrated that Campylobacter populations are highly structured, with distinct genotypes associated with particular wild or domestic animal sources, and that chicken meat is the most likely source of most human infection in countries such as the UK. The availability of multiple whole genome sequences from Campylobacter isolates presents the prospect of identifying those genes or allelic variants responsible for host-association and increased human disease risk, but the diversity of Campylobacter genomes presents challenges for such analyses. We present a gene-by-gene approach for investigating the genetic basis of phenotypes in diverse bacteria such as Campylobacter, implemented with the BIGSdb software on the pubMLST.org/campylobacter website. PMID:24704917

  8. Whole genome sequencing reveals extensive community-level transmission of group A Streptococcus in remote communities.

    PubMed

    Bowen, A C; Harris, T; Holt, D C; Giffard, P M; Carapetis, J R; Campbell, P T; McVernon, J; Tong, S Y C

    2016-07-01

    Impetigo is common in remote Indigenous children of northern Australia, with the primary driver in this context being Streptococcus pyogenes [or group A Streptococcus (GAS)]. To reduce the high burden of impetigo, the transmission dynamics of GAS must be more clearly elucidated. We performed whole genome sequencing on 31 GAS isolates collected in a single community from children in 11 households with ⩾2 GAS-infected children. We aimed to determine whether transmission was occurring principally within households or across the community. The 31 isolates were represented by nine multilocus sequence types and isolates within each sequence type differed from one another by only 0-3 single nucleotide polymorphisms. There was evidence of extensive transmission both within households and across the community. Our findings suggest that strategies to reduce the burden of impetigo in this setting will need to extend beyond individual households, and incorporate multi-faceted, community-wide approaches.

  9. Real time application of whole genome sequencing for outbreak investigation - What is an achievable turnaround time?

    PubMed

    McGann, Patrick; Bunin, Jessica L; Snesrud, Erik; Singh, Seema; Maybank, Rosslyn; Ong, Ana C; Kwak, Yoon I; Seronello, Scott; Clifford, Robert J; Hinkle, Mary; Yamada, Stephen; Barnhill, Jason; Lesho, Emil

    2016-07-01

    Whole genome sequencing (WGS) is increasingly employed in clinical settings, though few assessments of turnaround times (TAT) have been performed in real time. In this study, WGS was used to investigate an unfolding outbreak of vancomycin-resistant Enterococcus faecium (VRE) among 3 patients in the ICU of a tertiary care hospital. Including overnight culturing, a TAT of just 48.5 h for a comprehensive report was achievable using an Illumina MiSeq benchtop sequencer. WGS revealed that isolates from patients 2 and 3 differed from that of patient 1 by a single nucleotide polymorphism (SNP), indicating nosocomial transmission. However, the unparalleled resolution provided by WGS suggested that nosocomial transmission involved two separate events from patient 1 to patients 2 and 3, and not the linear transmission suspected from the timeline. Rapid TATs are achievable using WGS in the clinical setting and can provide an unprecedented level of resolution for outbreak investigations. PMID:27185645

  10. Whole genome analysis of diverse Chlamydia trachomatis strains identifies phylogenetic relationships masked by current clinical typing

    PubMed Central

    Harris, Simon R.; Clarke, Ian N.; Seth-Smith, Helena M. B.; Solomon, Anthony W.; Cutcliffe, Lesley T.; Marsh, Peter; Skilton, Rachel J.; Holland, Martin J.; Mabey, David; Peeling, Rosanna W.; Lewis, David A.; Spratt, Brian G.; Unemo, Magnus; Persson, Kenneth; Bjartling, Carina; Brunham, Robert; de Vries, Henry J.C.; Morré, Servaas A.; Speksnijder, Arjen; Bébéar, Cécile M.; Clerc, Maïté; de Barbeyrac, Bertille; Parkhill, Julian; Thomson, Nicholas R.

    2012-01-01

    Chlamydia trachomatis is responsible for both trachoma and sexually transmitted infections causing substantial morbidity and economic cost globally. Despite this, our knowledge of its population and evolutionary genetics is limited. Here we present a detailed whole genome phylogeny from representative strains of both trachoma and lymphogranuloma venereum (LGV) biovars from temporally and geographically diverse sources. Our analysis demonstrates that predicting phylogenetic structure using the ompA gene, traditionally used to classify Chlamydia, is misleading because extensive recombination in this region masks true relationships. We show that in many instances ompA is a chimera that can be exchanged in part or whole, both within and between biovars. We also provide evidence for exchange of, and recombination within, the cryptic plasmid, another important diagnostic target. We have used our phylogenetic framework to show how genetic exchange has manifested itself in ocular, urogenital and LGV C. trachomatis strains, including the epidemic LGV serotype L2b. PMID:22406642

  11. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation.

    PubMed

    Zhao, Shancen; Zheng, Pingping; Dong, Shanshan; Zhan, Xiangjiang; Wu, Qi; Guo, Xiaosen; Hu, Yibo; He, Weiming; Zhang, Shanning; Fan, Wei; Zhu, Lifeng; Li, Dong; Zhang, Xuemei; Chen, Quan; Zhang, Hemin; Zhang, Zhihe; Jin, Xuelin; Zhang, Jinguo; Yang, Huanming; Wang, Jian; Wang, Jun; Wei, Fuwen

    2013-01-01

    The panda lineage dates back to the late Miocene and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography, their contribution to panda population dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two population expansions, two bottlenecks and two divergences. Evidence indicated that, whereas global changes in climate were the primary drivers of population fluctuation for millions of years, human activities likely underlie recent population divergence and serious decline. We identified three distinct panda populations that show genetic adaptation to their environments. However, in all three populations, anthropogenic activities have negatively affected pandas for 3,000 years.

  12. Whole genome sequencing provides an unambiguous link between Salmonella Dublin outbreak strain and a historical isolate.

    PubMed

    Mohammed, M; Delappe, N; O'Connor, J; McKeown, P; Garvey, P; Cormican, M

    2016-02-01

    Salmonella enterica subsp. enterica serovar Dublin is an uncommon cause of human salmonellosis; however, a relatively high proportion of cases are associated with invasive disease. The serotype is associated with cattle. A geographically diffuse outbreak of S. Dublin involving nine patients occurred in Ireland in 2013. The source of infection was not identified. Typing of outbreak associated isolates by pulsed-field gel electrophoresis (PFGE) was of limited value because PFGE has limited discriminatory power for S. Dublin. Whole genome sequencing (WGS) showed conclusively that the isolates were closely related to each other, to an apparently unrelated isolate from 2011 and distinct from other isolates that were not readily distinguishable by PFGE. PMID:26165314

  13. Ensuring backwards compatibility: traditional genotyping efforts in the era of whole genome sequencing.

    PubMed

    Bletz, S; Mellmann, A; Rothgänger, J; Harmsen, D

    2015-04-01

    When using next-generation whole genome sequencing (WGS), extraction of spa types from WGS data is essential for backwards compatibility with Sanger sequencing-based spa typing of methicillin-resistant Staphylococcus aureus (MRSA). We evaluated WGS-based spa typing with a 2×250 bp protocol in a diverse collection of 423 MRSA isolates using two pipelines that executed sequence quality-trimming and de novo assembly before spa typing. The SeqSphere+ pipeline correctly typed 419 isolates (99.1%) whereas the CLCbio pipeline succeeded in 249 isolates (58.9%). In summary, WGS combined with an optimized de novo assembly enables nearly full compatibility with Sanger sequencing-based spa typing data. PMID:25658529
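
    The two success rates quoted above follow directly from the isolate counts. As a quick arithmetic check, sketched with a hypothetical helper name, the percentages can be recomputed from the 423 isolates in the collection:

```python
def percent_typed(correct, total):
    """Share of isolates typed correctly, as a percentage rounded to one decimal."""
    return round(100 * correct / total, 1)

TOTAL_ISOLATES = 423  # collection size reported in the abstract

first_pipeline = percent_typed(419, TOTAL_ISOLATES)   # 99.1
second_pipeline = percent_typed(249, TOTAL_ISOLATES)  # 58.9
```

    Both values match the figures reported in the abstract.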

  14. Integration of whole-genome sequencing into infection control practices: the potential and the hurdles.

    PubMed

    Robilotti, Elizabeth; Kamboj, Mini

    2015-04-01

    Microbial whole-genome sequencing (WGS) is poised to transform many of the currently used approaches in medical microbiology. Recent reports on the application of WGS to understand genetic evolution and reconstruct transmission pathways have provided valuable information that will influence infection control practices. While this technology holds great promise, obstacles to full implementation remain. Two articles in this issue of the Journal of Clinical Microbiology (S. Octavia, Q. Wang, M. M. Tanaka, S. Kaur, V. Sintchenko, and R. Lan, J Clin Microbiol 53:1063-1071, 2015, doi:10.1128/JCM.03235-14, and S. J. Salipante, D. J. SenGupta, L. A. Cummings, T. A. Land, D. R. Hoogestraat, and B. T. Cookson, J Clin Microbiol 53:1072-1079, 2015, doi:10.1128/JCM.03385-14) describe the breadth of application of WGS to the field of clinical epidemiology. PMID:25673795

  15. Genomic Epidemiology: Whole-Genome-Sequencing-Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens.

    PubMed

    Deng, Xiangyu; den Bakker, Henk C; Hendriksen, Rene S

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon. Technological, operational, and policy challenges are still present and being addressed by an international and multidisciplinary community of researchers, public health practitioners, and other stakeholders. PMID:26772415

  16. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

    PubMed Central

    Alioto, Tyler S.; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D.; Hovig, Eivind; Heisler, Lawrence E.; Beck, Timothy A.; Simpson, Jared T.; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S.; Butler, Adam P.; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W.; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C.; Gut, Marta; Denroche, Robert E.; Harding, Nicholas J.; Yamaguchi, Takafumi N.; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G.; Anderson, Charlotte L.; Waddell, Nicola; Pearson, John V.; Grimmond, Sean M.; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A.; López-Otín, Carlos; Campo, Elías; Campbell, Peter J.; Boutros, Paul C.; Puente, Xose S.; Gerhard, Daniela S.; Pfister, Stefan M.; McPherson, John D.; Hudson, Thomas J.; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T. W.; Gut, Ivo G.

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970

  18. The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates

    PubMed Central

    Berthelot, Camille; Brunet, Frédéric; Chalopin, Domitille; Juanchich, Amélie; Bernard, Maria; Noël, Benjamin; Bento, Pascal; Da Silva, Corinne; Labadie, Karine; Alberti, Adriana; Aury, Jean-Marc; Louis, Alexandra; Dehais, Patrice; Bardou, Philippe; Montfort, Jérôme; Klopp, Christophe; Cabau, Cédric; Gaspin, Christine; Thorgaard, Gary H.; Boussaha, Mekki; Quillet, Edwige; Guyomard, René; Galiana, Delphine; Bobe, Julien; Volff, Jean-Nicolas; Genêt, Carine; Wincker, Patrick; Jaillon, Olivier; Crollius, Hugues Roest; Guiguen, Yann

    2014-01-01

    Vertebrate evolution has been shaped by several rounds of whole-genome duplications (WGDs) that are often suggested to be associated with adaptive radiations and evolutionary innovations. Due to an additional round of WGD, the rainbow trout genome offers a unique opportunity to investigate the early evolutionary fate of a duplicated vertebrate genome. Here we show that after 100 million years of evolution the two ancestral subgenomes have remained extremely collinear, despite the loss of half of the duplicated protein-coding genes, mostly through pseudogenization. In striking contrast is the fate of miRNA genes that have almost all been retained as duplicated copies. The slow and stepwise rediploidization process characterized here challenges the current hypothesis that WGD is followed by massive and rapid genomic reorganizations and gene deletions. PMID:24755649

  19. Use of Whole Genome Sequencing and Patient Interviews To Link a Case of Sporadic Listeriosis to Consumption of Prepackaged Lettuce.

    PubMed

    Jackson, K A; Stroika, S; Katz, L S; Beal, J; Brandt, E; Nadon, C; Reimer, A; Major, B; Conrad, A; Tarr, C; Jackson, B R; Mody, R K

    2016-05-01

    We report on a case of listeriosis in a patient who probably consumed a prepackaged romaine lettuce-containing product recalled for Listeria monocytogenes contamination. Although definitive epidemiological information demonstrating exposure to the specific recalled product was lacking, the patient reported consumption of a prepackaged romaine lettuce-containing product of either the recalled brand or a different brand. A multinational investigation found that patient and food isolates from the recalled product were indistinguishable by pulsed-field gel electrophoresis and were highly related by whole genome sequencing, differing by four alleles by whole genome multilocus sequence typing and by five high-quality single nucleotide polymorphisms, suggesting a common source. To our knowledge, this is the first time prepackaged lettuce has been identified as a likely source for listeriosis. This investigation highlights the power of whole genome sequencing, as well as the continued need for timely and thorough epidemiological exposure data to identify sources of foodborne infections. PMID:27296429

  1. A method for constructing radiation hybrid maps of whole genomes: Application to physically mapping chromosome 14

    SciTech Connect

    Walter, M.A.; Mirzavans, F.; Tsuji, S.

    1994-09-01

    By reverting to the original protocols of Goss and Harris, we have created a panel of whole genome radiation hybrids (WG-RHs) using a diploid human fibroblast as the chromosome donor, rather than the usual monochromosomal human/rodent somatic cell hybrid. We have analyzed markers from chromosome 14 to test the feasibility of using WG-RH cell lines to generate physical maps of human chromosomes. As WG-RH mapping exploits rodent/human differences, loci need not be polymorphic to be informative. Sixty-one chromosome 14 markers, including 24 STSs and ESTs, were used to create a high-resolution radiation hybrid map of human chromosome 14. The average marker retention was found to be 22.4%, very similar to the marker retention frequencies of conventional radiation hybrids. Two-point and multipoint statistical analyses of the patterns of chromosome 14 marker retention were used to create a WG-RH map of human chromosome 14 with 4 gaps, corresponding to regions of low marker density. We are currently testing additional markers to close the map. Conventional radiation hybrid mapping requires between 100 and 200 hybrids to map each chromosome. The large number of hybrids (up to 4,000) required to map the whole genome is a major drawback of this method. In contrast, a single panel of only 100 to 200 WG-RH cell lines is sufficient to construct a high-resolution map of the whole human genome. Our results demonstrate that chromosome fragmentation by WG-RH can be used to map one chromosome, and by extension, entire genomes.

  2. Genome structure analysis of molluscs revealed whole genome duplication and lineage specific repeat variation.

    PubMed

    Yoshida, Masa-aki; Ishikura, Yukiko; Moritaki, Takeya; Shoguchi, Eiichi; Shimizu, Kentaro K; Sese, Jun; Ogura, Atsushi

    2011-09-01

    Comparative genome structure analysis allows us to identify novel genes, repetitive sequences and gene duplications. To explore lineage-specific genomic changes in molluscs, which are a good model for the development of the nervous system in invertebrates, we conducted comparative genome structure analyses of three molluscs (pygmy squid, nautilus and scallop) using partial genome shotgun sequencing. The elements with the greatest effect on genome structure are repetitive elements (REs), which expand genome size, and whole genome duplication, which produces large numbers of novel functional genes. We therefore investigated the variation and proportion of REs and whole genome duplication. We first identified RE variation in the three molluscan genomes by homology-based and de novo RE detection. The proportions of REs were 9.2%, 4.0%, and 3.8% in the pygmy squid, nautilus and scallop, respectively. We then estimated the genome sizes of the species as 2.1, 4.2 and 1.8 Gb, respectively, using 2× coverage frequency and DNA sequencing theory. We also performed a gene duplication assay based on coding genes, and found that large-scale duplication events occurred after divergence from the limpet Lottia, an out-group of the three molluscan species. Comparison of all the results suggested that RE expansion was not related to the increase in genome size of the nautilus. Although closely related to the nautilus, the squid has the largest proportion of REs but a smaller genome than the nautilus. We also identified lineage-specific RE and gene-family expansions, possibly related to acquisition of the most complicated eye and brain systems in the three species.

  3. Whole-Genome Array CGH Evaluation for Replacing Prenatal Karyotyping in Hong Kong

    PubMed Central

    Kan, Anita S. Y.; Lau, Elizabeth T.; Tang, W. F.; Chan, Sario S. Y.; Ding, Simon C. K.; Chan, Kelvin Y. K.; Lee, C. P.; Hui, Pui Wah; Chung, Brian H. Y.; Leung, K. Y.; Ma, Teresa; Leung, Wing C.; Tang, Mary H. Y.

    2014-01-01

    Objective: To evaluate the effectiveness of whole-genome array comparative genomic hybridization (aCGH) in prenatal diagnosis in Hong Kong. Methods: Array CGH was performed on 220 samples recruited prospectively as the first-tier test study. In addition, 150 prenatal samples with abnormal fetal ultrasound findings found to have normal karyotypes were analyzed as a ‘further-test’ study using NimbleGen CGX-135K oligonucleotide arrays. Results: Array CGH findings were concordant with conventional cytogenetic results with the exception of one case of triploidy. In the first-tier test study, aCGH detected clinically significant copy number variants (CNVs) in 20% (44/220) of samples, of which 21 were common aneuploidies and 23 had other chromosomal imbalances. In 3.2% (7/220) of samples, CNVs were detected by aCGH but not by conventional cytogenetics. In the ‘further-test’ study, the additional diagnostic yield of detecting chromosome imbalance was 6% (9/150). The overall detection rate for CNVs of unclear clinical significance was 2.7% (10/370), with 0.9% found to be de novo. Eleven loci of common CNVs were found in the local population. Conclusion: Whole-genome aCGH offered a higher resolution diagnostic capacity than conventional karyotyping for prenatal diagnosis either as a first-tier test or as a ‘further-test’ for pregnancies with fetal ultrasound anomalies. We propose replacing conventional cytogenetics with aCGH for all pregnancies undergoing invasive diagnostic procedures after excluding common aneuploidies and triploidies by quantitative fluorescent PCR. Conventional cytogenetics can be reserved for visualization of clinically significant CNVs. PMID:24505343
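
    The percentages quoted in this abstract follow directly from the reported counts. The short Python sketch below recomputes them as a reading aid; the dictionary labels and the helper name `pct` are illustrative choices, not terms from the paper, and only the counts themselves come from the abstract.

```python
# Recompute the detection rates quoted in the abstract from the raw counts.
# Labels are illustrative; only the (hits, total) pairs come from the abstract.
counts = {
    "first_tier_significant_cnv": (44, 220),
    "first_tier_acgh_only": (7, 220),
    "further_test_additional_yield": (9, 150),
    "unclear_significance_overall": (10, 370),
}


def pct(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


rates = {label: pct(hits, total) for label, (hits, total) in counts.items()}

if __name__ == "__main__":
    for label, rate in rates.items():
        print(f"{label}: {rate}%")
```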

  4. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer

    PubMed Central

    Bova, G. Steven; Kallio, Heini M.L.; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B.; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-01-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merits further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  5. Estimating genome-wide significance for whole-genome sequencing studies.

    PubMed

    Xu, ChangJiang; Tachmazidou, Ioanna; Walter, Klaudia; Ciampi, Antonio; Zeggini, Eleftheria; Greenwood, Celia M T

    2014-05-01

    Although a standard genome-wide significance level has been accepted for the testing of association between common genetic variants and disease, the era of whole-genome sequencing (WGS) requires a new threshold. The allele frequency spectrum of sequence-identified variants is very different from common variants, and the identified rare genetic variation is usually jointly analyzed in a series of genomic windows or regions. In nearby or overlapping windows, these test statistics will be correlated, and the degree of correlation is likely to depend on the choice of window size, overlap, and the test statistic. Furthermore, multiple analyses may be performed using different windows or test statistics. Here we propose an empirical approach for estimating genome-wide significance thresholds for data arising from WGS studies, and we demonstrate that the empirical threshold can be efficiently estimated by extrapolating from calculations performed on a small genomic region. Because analysis of WGS may need to be repeated with different choices of test statistics or windows, this prediction approach makes it computationally feasible to estimate genome-wide significance thresholds for different analysis choices. Based on UK10K whole-genome sequence data, we derive genome-wide significance thresholds ranging between 2.5 × 10^-8 and 8 × 10^-8 for our analytic choices in window-based testing, and thresholds of 0.6 × 10^-8 to 1.5 × 10^-8 for a combined analytic strategy of testing common variants using single-SNP tests together with rare variants analyzed with our sliding-window test strategy.
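
    One way to read the thresholds above is to ask how many independent tests each would correspond to under a plain Bonferroni correction. The sketch below does only that conversion. It is not the empirical extrapolation method the authors propose, and the family-wise error rate of 0.05 is an assumption made here for illustration (the abstract does not state one); the function name `effective_tests` is likewise hypothetical.

```python
# Interpretive aid only: convert a significance threshold into the number of
# independent tests M that a Bonferroni correction (alpha / M) would imply.
# ASSUMPTION: alpha = 0.05; the abstract does not state the error rate used.

def effective_tests(threshold: float, alpha: float = 0.05) -> float:
    """Return M such that alpha / M equals the given threshold."""
    if threshold <= 0 or alpha <= 0:
        raise ValueError("threshold and alpha must be positive")
    return alpha / threshold


# Threshold values quoted in the abstract.
quoted_thresholds = {
    "window-based, lower bound": 2.5e-8,
    "window-based, upper bound": 8e-8,
    "combined strategy, lower bound": 0.6e-8,
    "combined strategy, upper bound": 1.5e-8,
}

if __name__ == "__main__":
    for label, value in quoted_thresholds.items():
        print(f"{label}: {value:.1e} ~ {effective_tests(value):,.0f} tests")
```

    Under that assumption, a stricter (smaller) threshold simply corresponds to a larger effective number of tests, which is consistent with the combined strategy covering more hypotheses than window-based testing alone.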

  6. Rapid Whole-Genome Sequencing of Mycobacterium tuberculosis Isolates Directly from Clinical Samples

    PubMed Central

    Brown, Amanda C.; Einer-Jensen, Katja; Holdstock, Jolyon; Houniet, Darren T.; Chan, Jacqueline Z. M.; Depledge, Daniel P.; Nikolayevskyy, Vladyslav; Broda, Agnieszka; Stone, Madeline J.; Christiansen, Mette T.; Williams, Rachel; McAndrew, Michael B.; Tutill, Helena; Brown, Julianne; Melzer, Mark; Rosmarin, Caryn; McHugh, Timothy D.; Shorten, Robert J.; Drobniewski, Francis; Speight, Graham; Breuer, Judith

    2015-01-01

    The rapid identification of antimicrobial resistance is essential for effective treatment of highly resistant Mycobacterium tuberculosis. Whole-genome sequencing provides comprehensive data on resistance mutations and strain typing for monitoring transmission, but unlike for conventional molecular tests, this has previously been achievable only from cultures of M. tuberculosis. Here we describe a method utilizing biotinylated RNA baits designed specifically for M. tuberculosis DNA to capture full M. tuberculosis genomes directly from infected sputum samples, allowing whole-genome sequencing without the requirement of culture. This was carried out on 24 smear-positive sputum samples, collected from the United Kingdom and Lithuania where a matched culture sample was available, and 2 samples that had failed to grow in culture. M. tuberculosis sequencing data were obtained directly from all 24 smear-positive culture-positive sputa, of which 20 were of high quality (>20× depth and >90% of the genome covered). Results were compared with those of conventional molecular and culture-based methods, and high levels of concordance between phenotypical resistance and predicted resistance based on genotype were observed. High-quality sequence data were obtained from one smear-positive culture-negative case. This study demonstrated for the first time the successful and accurate sequencing of M. tuberculosis genomes directly from uncultured sputa. Identification of known resistance mutations within a week of sample receipt offers the prospect for personalized rather than empirical treatment of drug-resistant tuberculosis, including the use of antimicrobial-sparing regimens, leading to improved outcomes. PMID:25972414
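
    The "high quality" criterion stated above (>20× depth and >90% of the genome covered) can be written as a simple filter. The following is a minimal sketch under stated assumptions: the `SampleQC` record, its field names, and the example values are hypothetical, and the abstract does not say how depth was summarized, so mean depth is assumed here.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample summary; field names are illustrative."""
    name: str
    mean_depth: float        # assumed: mean read depth across the genome (x-fold)
    fraction_covered: float  # fraction of reference positions covered, 0 to 1


def is_high_quality(sample: SampleQC,
                    min_depth: float = 20.0,
                    min_covered: float = 0.90) -> bool:
    """Apply the abstract's criterion: depth > 20x AND coverage > 90%."""
    return sample.mean_depth > min_depth and sample.fraction_covered > min_covered


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    examples = [
        SampleQC("sputum_A", 35.0, 0.97),
        SampleQC("sputum_B", 12.5, 0.95),
        SampleQC("sputum_C", 48.0, 0.81),
    ]
    for s in examples:
        print(s.name, "high quality" if is_high_quality(s) else "below threshold")
```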

  7. Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing.

    PubMed

    Ronholm, J; Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco

    2016-10-01

    The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques. PMID:27559074

  8. Comprehensive red blood cell and platelet antigen prediction from whole genome sequencing: proof of principle

    PubMed Central

    Westhoff, Connie M.; Uy, Jon Michael; Aguad, Maria; Smeland‐Wagman, Robin; Kaufman, Richard M.; Rehm, Heidi L.; Green, Robert C.; Silberstein, Leslie E.

    2015-01-01

    BACKGROUND: There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next‐generation sequencing data challenging, since it uses genomic coordinates. STUDY DESIGN AND METHODS: The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation. RESULTS: Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS‐based antigen predictions, providing proof of principle for this approach. CONCLUSION: Detailed mapping of conventional cDNA annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genomic sequencing data. PMID:26634332
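
    The conversion of cDNA positions to genomic coordinates described above can be illustrated with a small exon-walking routine. This is a simplified sketch, not the authors' alignment pipeline: it assumes gap-free exon blocks, counts cDNA positions from the first base of the first listed exon (rather than from the start codon, as conventional coding-sequence numbering does), and uses a hypothetical function name and made-up coordinates.

```python
from typing import List, Tuple

# One exon as 1-based, inclusive genomic (start, end) with start <= end.
Exon = Tuple[int, int]


def cdna_to_genomic(pos: int, exons: List[Exon], strand: str = "+") -> int:
    """Map a 1-based cDNA position to a 1-based genomic coordinate.

    `exons` must be listed in transcript order (5' to 3'), so for a
    minus-strand gene the exon with the highest coordinates comes first.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if pos < 1:
        raise ValueError("cDNA positions are 1-based")
    remaining = pos
    for start, end in exons:
        length = end - start + 1
        if remaining <= length:
            # The position falls inside this exon.
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length  # skip this exon and continue with the next
    raise ValueError("position lies beyond the end of the transcript")


if __name__ == "__main__":
    # Made-up two-exon transcript on the plus strand.
    plus_exons = [(100, 109), (200, 204)]
    print(cdna_to_genomic(11, plus_exons))  # first base of the second exon
```

    Real alignments also have to handle mismatches and indels between the cDNA reference and the genome, which is where the abstract reports difficulties for genes such as ABO and RHD; the sketch deliberately ignores those cases.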

  9. Whole-Genome Sequencing for the Investigation of a Hospital Outbreak of MRSA in China

    PubMed Central

    Kong, Zhenzhen; Zhao, Peipei; Liu, Haibing; Yu, Xiang; Qin, Yanyan; Su, Zhaoliang; Wang, Shengjun; Xu, Huaxi; Chen, Jianguo

    2016-01-01

    Staphylococcus aureus is a globally disseminated drug-resistant bacterial species. It remains a leading cause of hospital-acquired infection, primarily among immunocompromised patients. In 2012, the Affiliated People’s Hospital of Jiangsu University experienced a putative outbreak of methicillin-resistant S. aureus (MRSA) that affected 12 patients in the Neurosurgery Department. In this study, whole-genome sequencing (WGS) was used to gain insight into the epidemiology of the outbreak caused by MRSA, and traditional bacterial genotyping approaches were also applied to provide supportive evidence for WGS. We sequenced the DNA from 6 isolates associated with the outbreak. A phylogeny was constructed by comparing single-nucleotide polymorphisms (SNPs) in the core genome of the 6 isolates in the present study and another 3 reference isolates from GenBank. Of the 6 MRSA sequences in the current study, 5 belonged to the same group, clustering with T0131, while the remaining one clustered closely with TW20. All of the isolates were identified as ST239-SCCmecIII clones. Whole-genome analysis revealed that four of the isolates clustered tightly into one group, whereas SA13002 and SA13009 were distinct from this group and were considered non-outbreak strains. Based on the sequencing results, the antibiotic-resistance gene status (present or absent) was almost perfectly concordant with the results of phenotypic susceptibility testing. Various toxin genes were also analyzed successfully. Our analysis demonstrates that using traditional molecular methods and WGS can facilitate the identification of outbreaks and help to control nosocomial transmission. PMID:26950298

  10. [Pathological Diagnoses and Whole-genome Sequence Analyses of the Jaagsiekte Sheep Retrovirus in Xinjiang, China].

    PubMed

    Yang, Sufang; Liang, Tian; Zhao, Qingliang; Zhang, Dianqing; Si, Junqiang; Zhang, Jing; Yang, Xia; Sheng, Jinliang

    2015-05-01

    To carry out pathologic diagnoses and whole-genome sequence analyses of the Jaagsiekte sheep retrovirus (JSRV) in Xinjiang, China, we first examined sheep suspected of JSRV infection. Then, the extracted virus suspension was observed by transmission electron microscopy (TEM). Total RNAs from lungs of JSRV-infected sheep were extracted and reverse-transcribed using a cDNA synthesis kit. Six pairs of primers were designed according to the exogenous reference virus strain (AF105220). Reverse transcription-polymerase chain reaction was carried out on JSRV-infected tissue, and the whole genome of the JSRV was sequenced. Our results showed: flow of nasal fluid ("wheelbarrow test"); different sizes of adenoma lesions in the lungs; papillary hyperplasia of alveolar epithelial cells; alveolar cavity filled with macrophages; dissolution of nuclei in central lesions. TEM revealed JSRV particles with a diameter of 88 nm to 125.4 nm. The full length of the viral genome sequence was 7456 bp. BLAST analyses showed nucleotide homology of 96% and 95% compared with that of the representative strains from the USA (AF105220) and UK (AF357971), respectively. Nucleotide homology was 89.8% and 89.9% compared with the endogenous Jaagsiekte sheep retrovirus Inner Mongolia strain (DQ838493) and USA strain (EF680300), respectively. The specific pathogenic amino-acid sequence "YXXM" was found in the TM region, similar to the exogenous JSRV: this motif has been reported to be oncogenic. This is the first report of the complete genomic sequence of the exogenous JSRV from Xinjiang, and could lay the foundation for study of the biological characteristics and pathogenic mechanisms of the pulmonary adenomatosis virus in sheep. PMID:26470525

  11. Comparison of whole genome sequences from human and non-human Escherichia coli O26 strains.

    PubMed

    Norman, Keri N; Clawson, Michael L; Strockbine, Nancy A; Mandrell, Robert E; Johnson, Roger; Ziebell, Kim; Zhao, Shaohua; Fratamico, Pina M; Stones, Robert; Allard, Marc W; Bono, James L

    2015-01-01

    Shiga toxin-producing Escherichia coli (STEC) O26 is the second leading E. coli serogroup responsible for human illness outbreaks behind E. coli O157:H7. Recent outbreaks have been linked to emerging pathogenic O26:H11 strains harboring stx2 only. Cattle have been recognized as an important reservoir of O26 strains harboring stx1; however the reservoir of these emerging stx2 strains is unknown. The objective of this study was to identify nucleotide polymorphisms in human and cattle-derived strains in order to compare differences in polymorphism derived genotypes and virulence gene profiles between the two host species. Whole genome sequencing was performed on 182 epidemiologically unrelated O26 strains, including 109 human-derived strains and 73 non-human-derived strains. A panel of 289 O26 strains (241 STEC and 48 non-STEC) was subsequently genotyped using a set of 283 polymorphisms identified by whole genome sequencing, resulting in 64 unique genotypes. Phylogenetic analyses identified seven clusters within the O26 strains. The seven clusters did not distinguish between isolates originating from humans or cattle; however, clusters did correspond with particular virulence gene profiles. Human and non-human-derived strains harboring stx1 clustered separately from strains harboring stx2, strains harboring eae, and non-STEC strains. Strains harboring stx2 were more closely related to non-STEC strains and strains harboring eae than to strains harboring stx1. The finding of human and cattle-derived strains with the same polymorphism derived genotypes and similar virulence gene profiles, provides evidence that similar strains are found in cattle and humans and transmission between the two species may occur.

  12. Environmental Whole-Genome Amplification to Access Microbial Diversity in Contaminated Sediments

    SciTech Connect

    Abulencia, C.B.; Wyborski, D.L.; Garcia, J.; Podar, M.; Chen, W.; Chang, S.H.; Chang, H.W.; Watson, D.; Brodie, E.L.; Hazen, T.C.; Keller, M.

    2005-12-10

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2 percent genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9 percent of the sequences had significant similarities to known proteins, and "clusters of orthologous groups" (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

  13. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation

    PubMed Central

    Technow, Frank; Messina, Carlos D.; Totir, L. Radu; Cooper, Mark

    2015-01-01

    Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E), continues to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics. PMID:26121133
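
    The abstract above describes approximate Bayesian computation (ABC) only in general terms. As a rough illustration of the underlying idea, the following is a minimal sketch of ABC rejection sampling on a toy problem. It is not the authors' method: the `simulate` function is a hypothetical stand-in for a crop growth model, and the single parameter `theta`, the uniform prior, the summary statistic (the sample mean) and the tolerance are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical stand-in for a simulator such as a crop growth model.

    Returns n noisy "yield" values centred on the parameter theta.
    """
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, prior_sampler, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is close to the observed one."""
    obs_mean = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)                    # draw from the prior
        simulated = simulate(theta, len(observed), rng)
        # Summary statistic: the sample mean (an assumption of this sketch).
        if abs(statistics.fmean(simulated) - obs_mean) <= tolerance:
            accepted.append(theta)                    # approximate posterior draw
    return accepted


rng = random.Random(0)
observed = simulate(5.0, 50, rng)                     # synthetic "field" data
posterior = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.0, 10.0),     # assumed flat prior
    n_draws=5000,
    tolerance=0.2,
    rng=rng,
)
print(len(posterior), statistics.fmean(posterior))
```

    The accepted draws approximate the posterior of `theta` without ever evaluating a likelihood, which is what makes it possible to plug a simulator into the estimation step.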

  14. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation.

    PubMed

    Technow, Frank; Messina, Carlos D; Totir, L Radu; Cooper, Mark

    2015-01-01

    Genomic selection, enabled by whole genome prediction (WGP) methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E), continues to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs) attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC), a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.

  15. Tumor Touch Imprints as Source for Whole Genome Analysis of Neuroblastoma Tumors

    PubMed Central

    Brunner, Clemens; Brunner-Herglotz, Bettina; Ziegler, Andrea; Frech, Christian; Amann, Gabriele; Ladenstein, Ruth; Ambros, Inge M.; Ambros, Peter F.

    2016-01-01

    Introduction: Tumor touch imprints (TTIs) are routinely used for the molecular diagnosis of neuroblastomas by interphase fluorescence in-situ hybridization (I-FISH). However, in order to facilitate a comprehensive, up-to-date molecular diagnosis of neuroblastomas and to identify new markers to refine risk and therapy stratification methods, whole genome approaches are needed. We examined the applicability of an ultra-high density SNP array platform that identifies copy number changes of varying sizes down to a few exons for the detection of genomic changes in tumor DNA extracted from TTIs. Material and Methods: DNAs were extracted from TTIs of 46 neuroblastoma and 4 other pediatric tumors. The DNAs were analyzed on the Cytoscan HD SNP array platform to evaluate numerical and structural genomic aberrations. The quality of the data obtained from TTIs was compared to that from randomly chosen fresh or fresh frozen solid tumors (n = 212) and I-FISH validation was performed. Results: SNP array profiles were obtained from 48 (out of 50) TTI DNAs of which 47 showed genomic aberrations. The high marker density allowed for single gene analysis, e.g. loss of nine exons in the ATRX gene and the visualization of chromothripsis. Data quality was comparable to fresh or fresh frozen tumor SNP profiles. SNP array results were confirmed by I-FISH. Conclusion: TTIs are an excellent source for SNP array processing with the advantage of simple handling, distribution and storage of tumor tissue on glass slides. The minimal amount of tumor tissue needed to analyze whole genomes makes TTIs an economic surrogate source in the molecular diagnostic work up of tumor samples. PMID:27560999

  16. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    PubMed

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  17. Whole-genome analysis of multienvironment or multitrait QTL in MAGIC.

    PubMed

    Verbyla, Arūnas P; Cavanagh, Colin R; Verbyla, Klara L

    2014-09-18

    Multiparent Advanced Generation Inter-Cross (MAGIC) populations are now being utilized to more accurately identify the underlying genetic basis of quantitative traits through quantitative trait loci (QTL) analyses and subsequent gene discovery. The expanded genetic diversity present in such populations and the amplified number of recombination events mean that QTL can be identified at a higher resolution. Most QTL analyses are conducted separately for each trait within a single environment. Separate analysis does not take advantage of the underlying correlation structure found in multienvironment or multitrait data. By using this information in a joint analysis, be it multienvironment or multitrait, it is possible to gain a greater understanding of genotype- or QTL-by-environment interactions or of pleiotropic effects across traits. Furthermore, this can result in improvements in accuracy for a range of traits or in a specific target environment and can influence selection decisions. Data derived from MAGIC populations allow for founder probabilities of all founder alleles to be calculated for each individual within the population. This presents an additional layer of complexity and information that can be utilized to identify QTL. A whole-genome approach is proposed for multienvironment and multitrait QTL analysis in MAGIC. The whole-genome approach simultaneously incorporates all founder probabilities at each marker for all individuals in the analysis, rather than using a genome scan. A dimension reduction technique is implemented, which allows for high-dimensional genetic data. For each QTL identified, sizes of effects for each founder allele, the percentage of genetic variance explained, and a score to reflect the strength of the QTL are found. The approach was demonstrated to perform well in a small simulation study and for two experiments, using a wheat MAGIC population.

  18. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

    PubMed Central

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark

    2016-01-01

    Multi isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  19. Comparison of whole genome sequences from human and non-human Escherichia coli O26 strains

    PubMed Central

    Norman, Keri N.; Clawson, Michael L.; Strockbine, Nancy A.; Mandrell, Robert E.; Johnson, Roger; Ziebell, Kim; Zhao, Shaohua; Fratamico, Pina M.; Stones, Robert; Allard, Marc W.; Bono, James L.

    2015-01-01

    Shiga toxin-producing Escherichia coli (STEC) O26 is the second leading E. coli serogroup responsible for human illness outbreaks behind E. coli O157:H7. Recent outbreaks have been linked to emerging pathogenic O26:H11 strains harboring stx2 only. Cattle have been recognized as an important reservoir of O26 strains harboring stx1; however the reservoir of these emerging stx2 strains is unknown. The objective of this study was to identify nucleotide polymorphisms in human and cattle-derived strains in order to compare differences in polymorphism derived genotypes and virulence gene profiles between the two host species. Whole genome sequencing was performed on 182 epidemiologically unrelated O26 strains, including 109 human-derived strains and 73 non-human-derived strains. A panel of 289 O26 strains (241 STEC and 48 non-STEC) was subsequently genotyped using a set of 283 polymorphisms identified by whole genome sequencing, resulting in 64 unique genotypes. Phylogenetic analyses identified seven clusters within the O26 strains. The seven clusters did not distinguish between isolates originating from humans or cattle; however, clusters did correspond with particular virulence gene profiles. Human and non-human-derived strains harboring stx1 clustered separately from strains harboring stx2, strains harboring eae, and non-STEC strains. Strains harboring stx2 were more closely related to non-STEC strains and strains harboring eae than to strains harboring stx1. The finding of human and cattle-derived strains with the same polymorphism derived genotypes and similar virulence gene profiles, provides evidence that similar strains are found in cattle and humans and transmission between the two species may occur. PMID:25815275

  20. Multivariate whole genome average interval mapping: QTL analysis for multiple traits and/or environments.

    PubMed

    Verbyla, Arūnas P; Cullis, Brian R

    2012-09-01

    A major aim in some plant-based studies is the determination of quantitative trait loci (QTL) for multiple traits or across multiple environments. Understanding these QTL by trait or QTL by environment interactions can be of great value to the plant breeder. A whole genome approach for the analysis of QTL is presented for such multivariate applications. The approach is an extension of whole genome average interval mapping in which all intervals on a linkage map are included in the analysis simultaneously. A random effects working model is proposed for the multivariate (trait or environment) QTL effects for each interval, with a variance-covariance matrix linking the variates in a particular interval. The significance of the variance-covariance matrix for the QTL effects is tested and if significant, an outlier detection technique is used to select a putative QTL. This QTL by variate interaction is transferred to the fixed effects. The process is repeated until the variance-covariance matrix for QTL random effects is not significant; at this point all putative QTL have been selected. Unlinked markers can also be included in the analysis. A simulation study was conducted to examine the performance of the approach and demonstrated the multivariate approach results in increased power for detecting QTL in comparison to univariate methods. The approach is illustrated for data arising from experiments involving two doubled haploid populations. The first involves analysis of two wheat traits, α-amylase activity and height, while the second is concerned with a multi-environment trial for extensibility of flour dough. The method provides an approach for multi-trait and multi-environment QTL analysis in the presence of non-genetic sources of variation. PMID:22692445

  1. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer.

    PubMed

    Bova, G Steven; Kallio, Heini M L; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-05-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG and present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merit further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  2. High-throughput genomics in sorghum: from whole-genome resequencing to a SNP screening array.

    PubMed

    Bekele, Wubishet A; Wieckhorst, Silke; Friedt, Wolfgang; Snowdon, Rod J

    2013-12-01

    With its small, diploid and completely sequenced genome, sorghum (Sorghum bicolor L. Moench) is highly amenable to genomics-based breeding approaches. Here, we describe the development and testing of a robust single-nucleotide polymorphism (SNP) array platform that enables polymorphism screening for genome-wide and trait-linked polymorphisms in genetically diverse S. bicolor populations. Whole-genome sequences with 6× to 12× coverage from five genetically diverse S. bicolor genotypes, including three sweet sorghums and two grain sorghums, were aligned to the sorghum reference genome. From over 1 million high-quality SNPs, we selected 2124 Infinium Type II SNPs that were informative in all six source genomes, gave an optimal Assay Design Tool (ADT) score, had allele frequencies of 50% in the six genotypes and were evenly spaced throughout the S. bicolor genome. Furthermore, by phenotype-based pool sequencing, we selected an additional 876 SNPs with a phenotypic association to early-stage chilling tolerance, a key trait for European sorghum breeding. The 3000 attempted bead types were used to populate half of a dual-species Illumina iSelect SNP array. The array was tested using 564 Sorghum spp. genotypes, including offspring from four unrelated recombinant inbred line (RIL) and F2 populations and a genetic diversity collection. A high call rate of over 80% enabled validation of 2620 robust and polymorphic sorghum SNPs, underlining the efficiency of the array development scheme for whole-genome SNP selection and screening, with diverse applications including genetic mapping, genome-wide association studies and genomic selection.

  3. Whole genomic constellation of the first human G8 rotavirus strain detected in Japan.

    PubMed

    Agbemabiese, Chantal Ama; Nakagomi, Toyoko; Doan, Yen Hai; Nakagomi, Osamu

    2015-10-01

    Human G8 Rotavirus A (RVA) strains are commonly detected in Africa but are rarely detected in Japan and elsewhere in the world. In this study, the whole genome sequence of the first human G8 RVA strain designated AU109 isolated in a child with acute gastroenteritis in 1994 was determined in order to understand how the strain was generated including the host species origin of its genes. The genotype constellation of AU109 was G8-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2. Phylogenetic analyses of the 11 genome segments revealed that its VP7 and VP1 genes were closely related to those of a Hungarian human G8P[14] RVA strain and these genes shared the most recent common ancestors in 1988 and 1982, respectively. AU109 possessed an NSP2 gene closely related to those of Chinese sheep and goat RVA strains. The remaining eight genome segments were closely related to Japanese human G2P[4] strains which circulated around 1985-1990. Bayesian evolutionary analyses revealed that the NSP2 gene of AU109 and those of the Chinese sheep and goat RVA strains diverged from a common ancestor around 1937. In conclusion, AU109 was generated through a genetic reassortment event where Japanese DS-1-like G2P[4] strains circulating around 1985-1990 obtained the VP7, VP1 and NSP2 genes from unknown ruminant G8 RVA strains. These observations highlight the need for comprehensive examination of the whole genomes of RVA strains of less explored host species.

  4. Environmental Whole-Genome Amplification To Access Microbial Populations in Contaminated Sediments

    PubMed Central

    Abulencia, Carl B.; Wyborski, Denise L.; Garcia, Joseph A.; Podar, Mircea; Chen, Wenqiong; Chang, Sherman H.; Chang, Hwai W.; Watson, David; Brodie, Eoin L.; Hazen, Terry C.; Keller, Martin

    2006-01-01

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and “clusters of orthologous groups” (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible. PMID:16672469

  5. Whole-genome sequencing to understand the genetic architecture of common gene expression and biomarker phenotypes

    PubMed Central

    Wood, Andrew R.; Tuke, Marcus A.; Nalls, Mike; Hernandez, Dena; Gibbs, J. Raphael; Lin, Haoxiang; Xu, Christopher S.; Li, Qibin; Shen, Juan; Jun, Goo; Almeida, Marcio; Tanaka, Toshiko; Perry, John R. B.; Gaulton, Kyle; Rivas, Manny; Pearson, Richard; Curran, Joanne E.; Johnson, Matthew P.; Göring, Harald H. H.; Duggirala, Ravindranath; Blangero, John; McCarthy, Mark I.; Bandinelli, Stefania; Murray, Anna; Weedon, Michael N.; Singleton, Andrew; Melzer, David; Ferrucci, Luigi; Frayling, Timothy M.

    2015-01-01

    Initial results from sequencing studies suggest that there are relatively few low-frequency (<5%) variants associated with large effects on common phenotypes. We performed low-pass whole-genome sequencing in 680 individuals from the InCHIANTI study to test two primary hypotheses: (i) that sequencing would detect single low-frequency–large effect variants that explained similar amounts of phenotypic variance as single common variants, and (ii) that some common variant associations could be explained by low-frequency variants. We tested two sets of disease-related common phenotypes for which we had statistical power to detect large numbers of common variant–common phenotype associations: 11 132 cis-gene expression traits in 450 individuals and 93 circulating biomarkers in all 680 individuals. From a total of 11 657 229 high-quality variants of which 6 129 221 and 5 528 008 were common and low frequency (<5%), respectively, low frequency–large effect associations comprised 7% of detectable cis-gene expression traits [89 of 1314 cis-eQTLs at P < 1 × 10^−6 (false discovery rate ∼5%)] and one of eight biomarker associations at P < 8 × 10^−10. Very few (30 of 1232; 2%) common variant associations were fully explained by low-frequency variants. Our data show that whole-genome sequencing can identify low-frequency variants undetected by genotyping based approaches when sample sizes are sufficiently large to detect substantial numbers of common variant associations, and that common variant associations are rarely explained by single low-frequency variants of large effect. PMID:25378555

  6. Whole-Genome Sequencing for the Investigation of a Hospital Outbreak of MRSA in China.

    PubMed

    Kong, Zhenzhen; Zhao, Peipei; Liu, Haibing; Yu, Xiang; Qin, Yanyan; Su, Zhaoliang; Wang, Shengjun; Xu, Huaxi; Chen, Jianguo

    2016-01-01

    Staphylococcus aureus is a globally disseminated drug-resistant bacterial species. It remains a leading cause of hospital-acquired infection, primarily among immunocompromised patients. In 2012, the Affiliated People's Hospital of Jiangsu University experienced a putative outbreak of methicillin-resistant S. aureus (MRSA) that affected 12 patients in the Neurosurgery Department. In this study, whole-genome sequencing (WGS) was used to gain insight into the epidemiology of the outbreak caused by MRSA, and traditional bacterial genotyping approaches were also applied to provide supportive evidence for WGS. We sequenced the DNA from 6 isolates associated with the outbreak. Phylogenetic analysis was constructed by comparing single-nucleotide polymorphisms (SNPs) in the core genome of 6 isolates in the present study and another 3 referenced isolates from GenBank. Of the 6 MRSA sequences in the current study, 5 belonged to the same group, clustering with T0131, while the other one clustered closely with TW20. All of the isolates were identified as ST239-SCCmecIII clones. Whole-genome analysis revealed that four of the outbreak isolates were more tightly clustered into a group and SA13002 together with SA13009 were distinct from the outbreak strains, which were considered non-outbreak strains. Based on the sequencing results, the antibiotic-resistance gene status (present or absent) was almost perfectly concordant with the results of phenotypic susceptibility testing. Various toxin genes were also analyzed successfully. Our analysis demonstrates that using traditional molecular methods and WGS can facilitate the identification of outbreaks and help to control nosocomial transmission. PMID:26950298
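
    The abstract above compares isolates by single-nucleotide polymorphisms (SNPs) in the core genome but does not describe the pipeline used. As a hedged illustration of the basic quantity involved, the sketch below counts pairwise SNP differences between pre-aligned sequences of equal length. The isolate names and the short sequences are invented for the example, and skipping positions where either base is not A, C, G or T is an assumption of this sketch, not a detail taken from the study.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions with an ambiguous or missing base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment):
    """Return {(name_1, name_2): distance} for every pair of isolates."""
    return {
        (p, q): snp_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Hypothetical toy core-genome alignment (names and bases are made up).
alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "TCGAACGNAC",
}
distances = pairwise_snp_distances(alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

    A matrix of such distances is the usual input to clustering or tree building; isolates separated by very few SNPs are candidates for belonging to the same transmission cluster.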

  7. Whole genomic constellation of the first human G8 rotavirus strain detected in Japan.

    PubMed

    Agbemabiese, Chantal Ama; Nakagomi, Toyoko; Doan, Yen Hai; Nakagomi, Osamu

    2015-10-01

    Human G8 Rotavirus A (RVA) strains are commonly detected in Africa but are rarely detected in Japan and elsewhere in the world. In this study, the whole genome sequence of the first human G8 RVA strain, designated AU109 and isolated from a child with acute gastroenteritis in 1994, was determined in order to understand how the strain was generated, including the host species origin of its genes. The genotype constellation of AU109 was G8-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2. Phylogenetic analyses of the 11 genome segments revealed that its VP7 and VP1 genes were closely related to those of a Hungarian human G8P[14] RVA strain and these genes shared the most recent common ancestors in 1988 and 1982, respectively. AU109 possessed an NSP2 gene closely related to those of Chinese sheep and goat RVA strains. The remaining eight genome segments were closely related to Japanese human G2P[4] strains which circulated around 1985-1990. Bayesian evolutionary analyses revealed that the NSP2 gene of AU109 and those of the Chinese sheep and goat RVA strains diverged from a common ancestor around 1937. In conclusion, AU109 was generated through a genetic reassortment event in which Japanese DS-1-like G2P[4] strains circulating around 1985-1990 obtained the VP7, VP1 and NSP2 genes from unknown ruminant G8 RVA strains. These observations highlight the need for comprehensive examination of the whole genomes of RVA strains of less explored host species. PMID:26275468

  8. Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

    PubMed Central

    2014-01-01

    Background: Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of “living fossils.” As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Results: Here we apply a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab, Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers grouped into 1,876 distinct genetic intervals and 5,775 candidate conserved protein coding genes. Conclusions: Comparison with other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications 300 million years ago, followed by extensive chromosome fusion. These results provide a counter-example to the often noted correlation between whole genome duplication and evolutionary radiations. The new, low-cost genetic mapping method for obtaining a chromosome-scale view of non-model organism genomes that we demonstrate here does not require laboratory culture, and is potentially applicable to a broad range of other species. PMID:24987520

  9. Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing.

    PubMed

    Ronholm, J; Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco

    2016-10-01

    The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques.

  10. Identification of Genome-Wide Mutations in Ciprofloxacin-Resistant F. tularensis LVS Using Whole Genome Tiling Arrays and Next Generation Sequencing

    PubMed Central

    Jaing, Crystal J.; McLoughlin, Kevin S.; Thissen, James B.; Zemla, Adam; Vergez, Lisa M.; Bourguet, Feliza; Mabery, Shalini; Fofanov, Viacheslav Y.; Koshinsky, Heather; Jackson, Paul J.

    2016-01-01

    Francisella tularensis is classified as a Class A bioterrorism agent by the U.S. government due to its high virulence and the ease with which it can be spread as an aerosol. It is a facultative intracellular pathogen and the causative agent of tularemia. Ciprofloxacin (Cipro) is a broad spectrum antibiotic effective against Gram-positive and Gram-negative bacteria. Increased Cipro resistance in pathogenic microbes is of serious concern when considering options for medical treatment of bacterial infections. Identification of genes and loci that are associated with Ciprofloxacin resistance will help advance the understanding of resistance mechanisms and may, in the future, provide better treatment options for patients. It may also provide information for development of assays that can rapidly identify Cipro-resistant isolates of this pathogen. In this study, we selected a large number of F. tularensis live vaccine strain (LVS) isolates that survived in progressively higher Ciprofloxacin concentrations, screened the isolates using a whole genome F. tularensis LVS tiling microarray and Illumina sequencing, and identified both known and novel mutations associated with resistance. Genes containing mutations encode DNA gyrase subunit A, a hypothetical protein, an asparagine synthase, a sugar transamine/perosamine synthetase and others. Structural modeling performed on these proteins provides insights into the potential function of these proteins and how they might contribute to Cipro resistance mechanisms. PMID:27668749

  11. Functional Whole-genome Analysis Identifies Polo-like Kinase 2 and Poliovirus Receptor as Essential for Neuronal Differentiation Upstream of the Negative Regulator αB-crystallin

    PubMed Central

    Draghetti, Cristina; Salvat, Catherine; Zanoguera, Francisca; Curchod, Marie-Laure; Vignaud, Chloé; Peixoto, Helene; Di Cara, Alessandro; Fischer, David; Dhanabal, Mohanraj; Andreas, Goutopoulos; Abderrahim, Hadi; Rommel, Christian; Camps, Montserrat

    2009-01-01

    This study aimed to identify transcriptional changes associated with neuronal differentiation induced by six distinct stimuli using whole-genome microarray hybridization analysis. Bioinformatics analyses revealed the clustering of these six stimuli into two categories, suggesting separate gene/pathway dependence. Treatment with specific inhibitors demonstrated the requirement of both Janus kinase and microtubule-associated protein kinase activation to trigger differentiation with nerve growth factor (NGF) and dibutyryl cAMP. Conversely, activation of protein kinase A, phosphatidylinositol-3-kinase α, and mammalian target of rapamycin, although required for dibutyryl cAMP-induced differentiation, exerted a negative feedback on NGF-induced differentiation. We identified Polo-like kinase 2 (Plk2) and poliovirus receptor (PVR) as indispensable for NGF-driven neuronal differentiation and αB-crystallin (Cryab) as an inhibitor of this process. Silencing of Plk2 or PVR blocked NGF-triggered differentiation and Cryab down-regulation, while silencing of Cryab enhanced NGF-induced differentiation. Our results position both Plk2 and PVR upstream of the negative regulator Cryab in the pathway(s) leading to neuronal differentiation triggered by NGF. PMID:19700763

  12. Functional whole-genome analysis identifies Polo-like kinase 2 and poliovirus receptor as essential for neuronal differentiation upstream of the negative regulator alphaB-crystallin.

    PubMed

    Draghetti, Cristina; Salvat, Catherine; Zanoguera, Francisca; Curchod, Marie-Laure; Vignaud, Chloé; Peixoto, Helene; Di Cara, Alessandro; Fischer, David; Dhanabal, Mohanraj; Andreas, Goutopoulos; Abderrahim, Hadi; Rommel, Christian; Camps, Montserrat

    2009-11-13

    This study aimed to identify transcriptional changes associated with neuronal differentiation induced by six distinct stimuli using whole-genome microarray hybridization analysis. Bioinformatics analyses revealed the clustering of these six stimuli into two categories, suggesting separate gene/pathway dependence. Treatment with specific inhibitors demonstrated the requirement of both Janus kinase and microtubule-associated protein kinase activation to trigger differentiation with nerve growth factor (NGF) and dibutyryl cAMP. Conversely, activation of protein kinase A, phosphatidylinositol-3-kinase alpha, and mammalian target of rapamycin, although required for dibutyryl cAMP-induced differentiation, exerted a negative feedback on NGF-induced differentiation. We identified Polo-like kinase 2 (Plk2) and poliovirus receptor (PVR) as indispensable for NGF-driven neuronal differentiation and alphaB-crystallin (Cryab) as an inhibitor of this process. Silencing of Plk2 or PVR blocked NGF-triggered differentiation and Cryab down-regulation, while silencing of Cryab enhanced NGF-induced differentiation. Our results position both Plk2 and PVR upstream of the negative regulator Cryab in the pathway(s) leading to neuronal differentiation triggered by NGF.

  13. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity

    PubMed Central

    2013-01-01

    Background: Hanwoo (Korean cattle), which originated from natural crossbreeding between taurine and zebu cattle, migrated to the Korean peninsula through North China. Hanwoo were raised as draft animals until the 1970s without the introduction of foreign germplasm. Since 1979, Hanwoo has been bred as beef cattle. Genetic variation was analyzed by whole-genome deep resequencing of a Hanwoo bull. The Hanwoo genome was compared to that of two other breeds, Black Angus and Holstein, and genes within regions of homozygosity were investigated to elucidate the genetic and genomic characteristics of Hanwoo. Results: The Hanwoo bull genome was sequenced to 45.6-fold coverage using the ABI SOLiD system. In total, 4.7 million single-nucleotide polymorphisms and 0.4 million small indels were identified by comparison with the Btau4.0 reference assembly. Of the total number of SNPs and indels, 58% and 87%, respectively, were novel. The overall genotype concordance between the SNPs and BovineSNP50 BeadChip data was 96.4%. Of 1.6 million genetic differences in Hanwoo, approximately 25,000 non-synonymous SNPs, splice-site variants, and coding indels (NS/SS/Is) were detected in 8,360 genes. Among 1,045 genes containing reliable specific NS/SS/Is in Hanwoo, 109 genes contained more than one novel damaging NS/SS/I. Of the genes containing NS/SS/Is, 610 genes were assigned as trait-associated genes. Moreover, 16, 78, and 51 regions of homozygosity (ROHs) were detected in Hanwoo, Black Angus, and Holstein, respectively. ‘Regulation of actin filament length’ was revealed as a significant gene ontology term and 25 trait-associated genes for meat quality and disease resistance were found in 753 genes that resided in the ROHs of Hanwoo. In Hanwoo, 43 genes were located in common ROHs between whole-genome resequencing and SNP chips in BTA2, 10, and 13 coincided with quantitative trait loci for meat fat traits. In addition, the common ROHs in BTA2 and 16 were in agreement between Hanwoo and

  15. Use of Whole-Genome Sequencing to Link Burkholderia pseudomallei from Air Sampling to Mediastinal Melioidosis, Australia

    PubMed Central

    Price, Erin P.; Mayo, Mark; Kaestli, Mirjam; Theobald, Vanessa; Harrington, Ian; Harrington, Glenda; Sarovich, Derek S.

    2015-01-01

    The frequency with which melioidosis results from inhalation rather than percutaneous inoculation or ingestion is unknown. We recovered Burkholderia pseudomallei from air samples at the residence of a patient with presumptive inhalational melioidosis and used whole-genome sequencing to link the environmental bacteria to B. pseudomallei recovered from the patient. PMID:26488732

  16. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    PubMed

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T; Rosenqvist Lund, Birthe S; Ameh, James A; Ambali, Abdul G; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M; Hendriksen, Rene S

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  17. Detection and Whole-Genome Sequencing of Carbapenemase-Producing Aeromonas hydrophila Isolates from Routine Perirectal Surveillance Culture

    PubMed Central

    Hughes, Heather Y.; Lau, Anna F.; Dekker, John P.; Michelin, Angela V.; Youn, Jung-Ho; Henderson, David K.; Frank, Karen M.; Segre, Julia A.

    2016-01-01

    Perirectal surveillance cultures and a stool culture grew Aeromonas species from three patients over a 6-week period and were without epidemiological links. Detection of the blaKPC-2 gene in one isolate prompted inclusion of non-Enterobacteriaceae in our surveillance culture workup. Whole-genome sequencing confirmed that the isolates were unrelated and provided data for Aeromonas reference genomes. PMID:26888898

  18. Draft Whole-Genome Sequence of a Haemophilus quentini Strain Isolated from an Infant in the United Kingdom

    PubMed Central

    Baxter, Laura; Thompson, Sarah; Collery, Mark M.; Hand, Daniel C.; Fink, Colin G.

    2016-01-01

    Haemophilus quentini is a rare and distinct genospecies of Haemophilus that has been suggested as a cause of neonatal bacteremia and urinary tract infections in men. We present the draft whole-genome sequence of H. quentini MP1 isolated from an infant in the United Kingdom, aiding future identification and detection of this pathogen.

  19. Whole genome sequencing of Candidatus Liberibacter asiaticus strain A4 from Guangdong, China, and strain HHCA from California

    Technology Transfer Automated Retrieval System (TEKTRAN)

    “Candidatus Liberibacter asiaticus” is associated with citrus Huanglongbing (HLB) in both China and the United States. While HLB has been known for over a century in Guangdong, China, the disease was first discovered in California in 2012. To better study the “old” and “new” HLBs, whole genomes of “...

  20. Whole-Genome Sequences of Two Campylobacter coli Isolates from the Antimicrobial Resistance Monitoring Program in Colombia.

    PubMed

    Bernal, Johan F; Donado-Godoy, Pilar; Valencia, María Fernanda; León, Maribel; Gómez, Yolanda; Rodríguez, Fernando; Agarwala, Richa; Landsman, David; Mariño-Ramírez, Leonardo

    2016-03-17

    Campylobacter coli, along with Campylobacter jejuni, is a major agent of gastroenteritis and acute enterocolitis in humans. We report the whole-genome sequences of two multidrug-resistant C. coli strains, isolated from the Colombian poultry chain. The isolates contain a variety of antimicrobial resistance genes for aminoglycosides, lincosamides, fluoroquinolones, and tetracycline.

  1. Whole-Genome Sequence of Mesorhizobium hungaricum sp. nov. Strain UASWS1009, a Potential Resource for Agricultural and Environmental Uses

    PubMed Central

    Crovadore, Julien; Cochard, Bastien; Calmin, Gautier; Chablais, Romain; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequences of the strain UASWS1009 of the species Mesorhizobium hungaricum sp. nov., which are different from any other known Mesorhizobium species. This is the first genome registered for this new species, which could be considered as a potential resource for agriculture and environmental uses. PMID:27738050

  2. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle to using this technology is the limited availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools, we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome and identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided, and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparison with the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  3. Whole-Genome Sequence of Pseudomonas xanthomarina Strain UASWS0955, a Potential Biological Agent for Agricultural and Environmental Uses

    PubMed Central

    Crovadore, Julien; Cochard, Bastien; Calmin, Gautier; Chablais, Romain; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequence of the strain UASWS0955 of the species Pseudomonas xanthomarina, isolated from sewage sludge. This genome was obtained with an Illumina MiniSeq and is the second genome registered for this species, which is considered as a promising resource for agriculture and bioremediation of contaminated soils. PMID:27738044

  4. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    PubMed Central

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T.; Rosenqvist Lund, Birthe S.; Ameh, James A.; Ambali, Abdul G.; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M.; Hendriksen, Rene S.

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  5. Whole-Genome Sequencing Reveals a New Genospecies of Methylobacterium sp. GXS13, Isolated from Vitis vinifera L. Xylem Sap

    PubMed Central

    Lai, Wan Xin; Gan, Han Ming; Hudson, André O.

    2016-01-01

    The whole-genome sequence of a new genospecies of Methylobacterium sp., named GXS13 and isolated from grapevine xylem sap, is reported and demonstrates potential for methylotrophy, cytokinin synthesis, and cell wall modification. In addition, biosynthetic gene clusters were identified for cupriachelin, carotenoid, and acyl-homoserine lactone using the antiSMASH server. PMID:26847900

  6. Whole-Genome Sequence of Pseudomonas graminis Strain UASWS1507, a Potential Biological Control Agent and Biofertilizer Isolated in Switzerland

    PubMed Central

    Crovadore, Julien; Calmin, Gautier; Chablais, Romain; Cochard, Bastien; Schulz, Torsten

    2016-01-01

    We report here the whole-genome shotgun sequence of the strain UASWS1507 of the species Pseudomonas graminis, isolated in Switzerland from an apple tree. This is the first genome registered for this species, which is considered as a potential and valuable resource of biological control agents and biofertilizers for agriculture. PMID:27795260

  7. Direct DNA Extraction from Mycobacterium tuberculosis Frozen Stocks as a Reculture-Independent Approach to Whole-Genome Sequencing.

    PubMed

    Bjorn-Mortensen, K; Zallet, J; Lillebaek, T; Andersen, A B; Niemann, S; Rasmussen, E M; Kohl, T A

    2015-08-01

    Culturing before DNA extraction represents a major time-consuming step in whole-genome sequencing of slow-growing bacteria, such as Mycobacterium tuberculosis. We report a workflow to extract DNA from frozen isolates without reculturing. Prepared libraries and sequence data were comparable with results from recultured aliquots of the same stocks.

  8. A whole genome sequence of ‘Candidatus Liberibacter asiaticus’ from Guangdong, China, where HLB was first described

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Citrus Huanglongbing (HLB, yellow shoot disease) has been endemic in Guangdong Province, China, for >100 years. “Candidatus Liberibacter asiaticus” (CLas) is a putative pathogen of HLB and currently unculturable. Here, a draft whole genome sequence of CLas strain A4 from Guangdong is presented. Stra...

  9. Whole-Genome Sequence of Fish-Pathogenic Mycobacterium sp. Strain 012931, Isolated from Yellowtail (Seriola quinqueradiata).

    PubMed

    Kurokawa, Satoru; Kabayama, Jun; Nho, Seong Won; Hwang, Seong Don; Hikima, Jun-Ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Aoki, Takashi

    2013-01-01

    The genus Mycobacterium comprises a large number of well-characterized species, several of which are human and animal pathogens. Here, we report the whole-genome sequence of Mycobacterium sp. strain 012931, a fish pathogen responsible for huge losses in aquaculture farms in Japan. The strain was isolated from a marine fish, yellowtail (Seriola quinqueradiata). PMID:23929466

  10. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    PubMed

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea. The genome was sequenced on the HiSeq platform and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  11. Whole-Genome Draft Sequences of Six Commensal Fecal and Six Mastitis-Associated Escherichia coli Strains of Bovine Origin

    PubMed Central

    Leimbach, Andreas; Witten, Anika; Wellnitz, Olga; Shpigel, Nahum; Petzl, Wolfram; Zerbe, Holm; Daniel, Rolf

    2016-01-01

    The bovine gastrointestinal tract is a natural reservoir for commensal and pathogenic Escherichia coli strains with the ability to cause mastitis. Here, we report the whole-genome sequences of six E. coli isolates from acute mastitis cases and six E. coli isolates from the feces of udder-healthy cows. PMID:27469942

  12. Whole-Genome Draft Sequences of Six Commensal Fecal and Six Mastitis-Associated Escherichia coli Strains of Bovine Origin.

    PubMed

    Leimbach, Andreas; Poehlein, Anja; Witten, Anika; Wellnitz, Olga; Shpigel, Nahum; Petzl, Wolfram; Zerbe, Holm; Daniel, Rolf; Dobrindt, Ulrich

    2016-01-01

    The bovine gastrointestinal tract is a natural reservoir for commensal and pathogenic Escherichia coli strains with the ability to cause mastitis. Here, we report the whole-genome sequences of six E. coli isolates from acute mastitis cases and six E. coli isolates from the feces of udder-healthy cows. PMID:27469942

  13. Whole-Genome Shotgun Sequencing of an Indian-Origin Lactobacillus helveticus Strain, MTCC 5463, with Probiotic Potential

    PubMed Central

    Prajapati, J. B.; Khedkar, C. D.; Chitra, J.; Suja, Senan; Mishra, V.; Sreeja, V.; Patel, R. K.; Ahir, V. B.; Bhatt, V. D.; Sajnani, M. R.; Jakhesara, S. J.; Koringa, P. G.; Joshi, C. G.

    2011-01-01

    Lactobacillus helveticus MTCC 5463 was isolated from a vaginal swab from a healthy adult female. The strain exhibited potential probiotic properties, including a beneficial role in the gastrointestinal tract and the ability to reduce cholesterol and stimulate immunity. We sequenced the whole genome and compared it with the published genome sequence of Lactobacillus helveticus DPC4571. PMID:21705605

  14. Whole-genome shotgun sequencing of an Indian-origin Lactobacillus helveticus strain, MTCC 5463, with probiotic potential.

    PubMed

    Prajapati, J B; Khedkar, C D; Chitra, J; Suja, Senan; Mishra, V; Sreeja, V; Patel, R K; Ahir, V B; Bhatt, V D; Sajnani, M R; Jakhesara, S J; Koringa, P G; Joshi, C G

    2011-08-01

    Lactobacillus helveticus MTCC 5463 was isolated from a vaginal swab from a healthy adult female. The strain exhibited potential probiotic properties, including a beneficial role in the gastrointestinal tract and the ability to reduce cholesterol and stimulate immunity. We sequenced the whole genome and compared it with the published genome sequence of Lactobacillus helveticus DPC4571. PMID:21705605

  15. Identification of Source of Brucella suis Infection in Human by Using Whole-Genome Sequencing, United States and Tonga.

    PubMed

    Quance, Christine; Robbe-Austerman, Suelee; Stuber, Tod; Brignole, Tom; DeBess, Emilio E; Boyd, Laurel; LeaMaster, Brad; Tiller, Rebekah; Draper, Jenny; Humphrey, Sharon; Erdman, Matthew M

    2016-01-01

    Brucella suis infection was diagnosed in a man from Tonga, Polynesia, who had butchered swine in Oregon, USA. Although the US commercial swine herd is designated brucellosis-free, exposure history suggested infection from commercial pigs. We used whole-genome sequencing to determine that the man was infected in Tonga, averting a field investigation.

  16. Identification of Source of Brucella suis Infection in Human by Whole-Genome Sequencing, United States and Tonga

    PubMed Central

    Quance, Christine; Stuber, Tod; Brignole, Tom; DeBess, Emilio E.; Boyd, Laurel; LeaMaster, Brad; Tiller, Rebekah; Draper, Jenny; Humphrey, Sharon; Erdman, Matthew M.

    2016-01-01

    Brucella suis infection was diagnosed in a man from Tonga, Polynesia, who had butchered swine in Oregon, USA. Although the US commercial swine herd is designated brucellosis-free, exposure history suggested infection from commercial pigs. We used whole-genome sequencing to determine that the man was infected in Tonga, averting a field investigation. PMID:26689610

  17. Whole-Genome Sequences of Two Campylobacter coli Isolates from the Antimicrobial Resistance Monitoring Program in Colombia

    PubMed Central

    Bernal, Johan F.; Donado-Godoy, Pilar; Valencia, María Fernanda; León, Maribel; Gómez, Yolanda; Rodríguez, Fernando; Agarwala, Richa; Landsman, David

    2016-01-01

    Campylobacter coli, along with Campylobacter jejuni, is a major agent of gastroenteritis and acute enterocolitis in humans. We report the whole-genome sequences of two multidrug-resistant C. coli strains, isolated from the Colombian poultry chain. The isolates contain a variety of antimicrobial resistance genes for aminoglycosides, lincosamides, fluoroquinolones, and tetracycline. PMID:26988048

  18. Understanding the Quorum-Sensing Bacterium Pantoea stewartii Strain M009 with Whole-Genome Sequencing Analysis.

    PubMed

    Tan, Wen-Si; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Pantoea stewartii is known to be the causative agent of Stewart's wilt, which usually affects sweet corn (Zea mays) with the corn flea beetle as the transmission vector. In this work, we present the whole-genome sequence of Pantoea stewartii strain M009, isolated from a Malaysian tropical rainforest waterfall. PMID:25635007

  19. A searchable, whole genome resource designed for protein variant analysis in diverse lineages of U.S. beef cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A key feature of a gene's function is the variety of protein isoforms it encodes in a population. However, the genetic diversity in bovine whole genome databases tends to be underrepresented because these databases contain an abundance of sequence from the most influential sires. Our first aim was ...

  20. Whole genome identification of Mycobacterium tuberculosis vaccine candidates by comprehensive data mining and bioinformatic analyses

    PubMed Central

    Zvi, Anat; Ariel, Naomi; Fulkerson, John; Sadoff, Jerald C; Shafferman, Avigdor

    2008-01-01

    Background: Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), infects ~8 million people annually, culminating in ~2 million deaths. Moreover, about one third of the population is latently infected, 10% of whom develop disease during their lifetime. Current approved prophylactic TB vaccines (BCG and derivatives thereof) are of variable efficiency in adult protection against pulmonary TB (0%–80%), and directed essentially against early phase infection. Methods: A genome-scale dataset was constructed by analyzing published data of: (1) global gene expression studies under conditions which simulate intra-macrophage stress, dormancy, persistence and/or reactivation; (2) cellular and humoral immunity, and vaccine potential. This information was compiled along with revised annotation/bioinformatic characterization of selected gene products and in silico mapping of T-cell epitopes. Protocols for scoring, ranking and prioritization of the antigens were developed and applied. Results: Cross-matching of literature and in silico-derived data, in conjunction with the prioritization scheme and biological rationale, allowed for selection of 189 putative vaccine candidates from the entire genome. Within the 189 set, the relative distribution of antigens in 3 functional categories differs significantly from their distribution in the whole genome, with reduction in the Conserved hypothetical category (due to improved annotation) and enrichment in Lipid and in Virulence categories. Other prominent representatives in the 189 set are the PE/PPE proteins; iron sequestration, nitroreductases and proteases, all within the Intermediary metabolism and respiration category; ESX secretion systems, resuscitation promoting factors and lipoproteins, all within the Cell wall category. Application of a ranking scheme based on qualitative and quantitative scores resulted in a list of 45 best-scoring antigens, of which: 74% belong to the dormancy/reactivation/resuscitation classes; 30% belong

  1. Ethical and legal implications of whole genome and whole exome sequencing in African populations

    PubMed Central

    2013-01-01

    Background Rapid advances in high throughput genomic technologies and next generation sequencing are making medical genomic research more readily accessible and affordable, including the sequencing of patient and control whole genomes and exomes in order to elucidate genetic factors underlying disease. Over the next five years, the Human Heredity and Health in Africa (H3Africa) Initiative, funded by the Wellcome Trust (United Kingdom) and the National Institutes of Health (United States of America), will contribute greatly towards sequencing of numerous African samples for biomedical research. Discussion Funding agencies and journals often require submission of genomic data from research participants to databases that allow open or controlled data access for all investigators. Access to such genotype-phenotype and pedigree data, however, needs careful control in order to prevent identification of individuals or families. This is particularly the case in Africa, where many researchers and their patients are inexperienced in the ethical issues accompanying whole genome and exome research; and where an historical unidirectional flow of samples and data out of Africa has created a sense of exploitation and distrust. In the current study, we analysed the implications of the anticipated surge of next generation sequencing data in Africa and the subsequent data sharing concepts on the protection of privacy of research subjects. We performed a retrospective analysis of the informed consent process for the continent and the rest-of-the-world and examined relevant legislation, both current and proposed. 
We investigated the following issues: (i) informed consent, including guidelines for performing culturally-sensitive next generation sequencing research in Africa and availability of suitable informed consent documents; (ii) data security and subject privacy whilst practicing data sharing; (iii) conveying the implications of such concepts to research participants in resource

  2. Whole-genome sequencing to delineate Mycobacterium tuberculosis outbreaks: a retrospective observational study

    PubMed Central

    Walker, Timothy M; Ip, Camilla LC; Harrell, Ruth H; Evans, Jason T; Kapatai, Georgia; Dedicoat, Martin J; Eyre, David W; Wilson, Daniel J; Hawkey, Peter M; Crook, Derrick W; Parkhill, Julian; Harris, David; Walker, A Sarah; Bowden, Rory; Monk, Philip; Smith, E Grace; Peto, Tim EA

    2013-01-01

    Summary Background Tuberculosis incidence in the UK has risen in the past decade. Disease control depends on epidemiological data, which can be difficult to obtain. Whole-genome sequencing can detect microevolution within Mycobacterium tuberculosis strains. We aimed to estimate the genetic diversity of related M tuberculosis strains in the UK Midlands and to investigate how this measurement might be used to investigate community outbreaks. Methods In a retrospective observational study, we used Illumina technology to sequence M tuberculosis genomes from an archive of frozen cultures. We characterised isolates into four groups: cross-sectional, longitudinal, household, and community. We measured pairwise nucleotide differences within hosts and between hosts in household outbreaks and estimated the rate of change in DNA sequences. We used the findings to interpret network diagrams constructed from 11 community clusters derived from mycobacterial interspersed repetitive-unit–variable-number tandem-repeat data. Findings We sequenced 390 separate isolates from 254 patients, including representatives from all five major lineages of M tuberculosis. The estimated rate of change in DNA sequences was 0·5 single nucleotide polymorphisms (SNPs) per genome per year (95% CI 0·3–0·7) in longitudinal isolates from 30 individuals and 25 families. Divergence is rarely higher than five SNPs in 3 years. 109 (96%) of 114 paired isolates from individuals and households differed by five or fewer SNPs. More than five SNPs separated isolates from none of 69 epidemiologically linked patients, two (15%) of 13 possibly linked patients, and 13 (17%) of 75 epidemiologically unlinked patients (three-way comparison exact p<0·0001). Genetic trees and clinical and epidemiological data suggest that super-spreaders were present in two community clusters. Interpretation Whole-genome sequencing can delineate outbreaks of tuberculosis and allows inference about direction of transmission between
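
    The five-SNP cutoff described in this abstract can be illustrated with a minimal sketch. It assumes the isolates are already available as equal-length aligned sequences; the function names and the toy data are hypothetical and are not taken from the study's pipeline.

```python
from itertools import combinations

# Cutoff reported in the abstract: linked isolates rarely differ by more than five SNPs.
SNP_THRESHOLD = 5


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ambiguous base (N) or a gap (-)
    are skipped. This is an illustrative choice, not the study's rule.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


def plausible_links(isolates, threshold=SNP_THRESHOLD):
    """Return (id_1, id_2, distance) for every pair at or below the threshold."""
    links = []
    for (id_1, seq_1), (id_2, seq_2) in combinations(sorted(isolates.items()), 2):
        distance = snp_distance(seq_1, seq_2)
        if distance <= threshold:
            links.append((id_1, id_2, distance))
    return links


# Toy data: A and B differ at one site, C is unrelated.
toy_isolates = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TTTTTTTTTT",
}
print(plausible_links(toy_isolates))
```

    A pair that exceeds the threshold is not proof of no link; the abstract reports such pairs among possibly linked and unlinked patients only.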

  3. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    PubMed

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome

  4. Whole-Genome Duplication and the Functional Diversification of Teleost Fish Hemoglobins

    PubMed Central

    Opazo, Juan C.; Butts, G. Tyler; Nery, Mariana F.; Storz, Jay F.; Hoffmann, Federico G.

    2013-01-01

    Subsequent to the two rounds of whole-genome duplication that occurred in the common ancestor of vertebrates, a third genome duplication occurred in the stem lineage of teleost fishes. This teleost-specific genome duplication (TGD) is thought to have provided genetic raw materials for the physiological, morphological, and behavioral diversification of this highly speciose group. The extreme physiological versatility of teleost fish is manifest in their diversity of blood–gas transport traits, which reflects the myriad solutions that have evolved to maintain tissue O2 delivery in the face of changing metabolic demands and environmental O2 availability during different ontogenetic stages. During the course of development, regulatory changes in blood–O2 transport are mediated by the expression of multiple, functionally distinct hemoglobin (Hb) isoforms that meet the particular O2-transport challenges encountered by the developing embryo or fetus (in viviparous or oviparous species) and in free-swimming larvae and adults. The main objective of the present study was to assess the relative contributions of whole-genome duplication, large-scale segmental duplication, and small-scale gene duplication in producing the extraordinary functional diversity of teleost Hbs. To accomplish this, we integrated phylogenetic reconstructions with analyses of conserved synteny to characterize the genomic organization and evolutionary history of the globin gene clusters of teleosts. These results were then integrated with available experimental data on functional properties and developmental patterns of stage-specific gene expression. Our results indicate that multiple α- and β-globin genes were present in the common ancestor of gars (order Lepisosteiformes) and teleosts. The comparative genomic analysis revealed that teleosts possess a dual set of TGD-derived globin gene clusters, each of which has undergone lineage-specific changes in gene content via repeated duplication and

  5. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    PubMed Central

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; 
Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2016-01-01

    SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants (refs. 1–8), and rare, population-specific, coding variants (ref. 9). Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD (ref. 8) (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2×10^−14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10^−11; n_cases = 98,742 and n_controls = 409,511). Using an En1^Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1×10^−11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of

  6. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture.

    PubMed

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C P G M; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Amin, Najaf; van Duijn, Cornelia M; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, 
Gaurav; Olmos, Jose M; Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce B J; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia M T; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2015-10-01

    The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (ntotal = 53,236) and fracture (ntotal = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., Pmeta = 2 × 10(-14)), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10(-11); ncases = 98,742 and ncontrols = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., Pmeta = 1 × 10(-11)). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture

  7. The implications of whole-genome sequencing in the control of tuberculosis

    PubMed Central

    Lee, Robyn S.

    2015-01-01

    The availability of whole-genome sequencing (WGS) as a tool for the diagnosis and clinical management of tuberculosis (TB) offers considerable promise in the fight against this stubborn epidemic. However, like other new technologies, the best application of WGS remains to be determined, for both conceptual and technical reasons. In this review, we consider the potential value of WGS in the clinical laboratory for the detection of Mycobacterium tuberculosis and the prediction of antibiotic resistance. We also discuss issues pertaining to data generation, interpretation and dissemination, given that WGS has to date been generally performed in research labs where results are not necessarily packaged in a clinician-friendly format. Although WGS is far more accessible now than it was in the past, the transition from a research tool to study TB into a clinical test to manage this disease may require further fine-tuning. Improvements will likely come through iterative efforts that involve both the laboratories ready to move TB into the genomic era and the front-line clinical/public health staff who will be interpreting the results to inform management decisions. PMID:27034776

  8. Sequence variants from whole genome sequencing a large group of Icelanders.

    PubMed

    Gudbjartsson, Daniel F; Sulem, Patrick; Helgason, Hannes; Gylfason, Arnaldur; Gudjonsson, Sigurjon A; Zink, Florian; Oddson, Asmundur; Magnusson, Gisli; Halldorsson, Bjarni V; Hjartarson, Eirikur; Sigurdsson, Gunnar Th; Kong, Augustine; Helgason, Agnar; Masson, Gisli; Magnusson, Olafur Th; Thorsteinsdottir, Unnur; Stefansson, Kari

    2015-01-01

    We have accumulated considerable data on the genetic makeup of the Icelandic population by sequencing the whole genomes of 2,636 Icelanders to a depth of at least 10X and by chip genotyping 101,584 more. The sequencing was done with Illumina technology. The median sequencing depth was 20X and 909 individuals were sequenced to a depth of at least 30X. We found 20 million single nucleotide polymorphisms (SNPs) and 1.5 million insertions/deletions (indels) that passed stringent quality control. Almost all the common SNPs (derived allele frequency (DAF) over 2%) that we identified in Iceland have been observed by either dbSNP (build 137) or the Exome Sequencing Project (ESP), while only 60 and 20% of rare (DAF<0.5%) SNPs and indels in coding regions, the most heavily studied parts of the genome, have been observed in the public databases. Features of our variant data, such as the transition/transversion ratio and the length distribution of indels, are similar to published reports. PMID:25977816
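
    The transition/transversion ratio mentioned in this abstract is a standard quality check on SNP calls. The sketch below shows how it is commonly computed from (reference, alternate) base pairs; the function names and the toy input are hypothetical, and nothing here reflects the authors' actual tooling.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID_BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Compute transitions / transversions for an iterable of (ref, alt) pairs.

    Entries that are not single-base substitutions between A, C, G and T
    are ignored.
    """
    transitions = 0
    transversions = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt or ref not in VALID_BASES or alt not in VALID_BASES:
            continue
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Toy input: three transitions (A>G, C>T, G>A) and two transversions (A>C, T>G).
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "G")]
print(ts_tv_ratio(toy_snps))
```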

  9. Power analysis of artificial selection experiments using efficient whole genome simulation of quantitative traits.

    PubMed

    Kessner, Darren; Novembre, John

    2015-04-01

    Evolve and resequence studies combine artificial selection experiments with massively parallel sequencing technology to study the genetic basis for complex traits. In these experiments, individuals are selected for extreme values of a trait, causing alleles at quantitative trait loci (QTL) to increase or decrease in frequency in the experimental population. We present a new analysis of the power of artificial selection experiments to detect and localize quantitative trait loci. This analysis uses a simulation framework that explicitly models whole genomes of individuals, quantitative traits, and selection based on individual trait values. We find that explicitly modeling QTL provides qualitatively different insights than considering independent loci with constant selection coefficients. Specifically, we observe how interference between QTL under selection affects the trajectories and lengthens the fixation times of selected alleles. We also show that a substantial portion of the genetic variance of the trait (50-100%) can be explained by detected QTL in as little as 20 generations of selection, depending on the trait architecture and experimental design. Furthermore, we show that power depends crucially on the opportunity for recombination during the experiment. Finally, we show that an increase in power is obtained by leveraging founder haplotype information to obtain allele frequency estimates.

  10. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    PubMed Central

    Katoh, Hiroshi; Miyata, Shin-ichi; Inoue, Hiromitsu; Iwanami, Toru

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, ‘Candidatus Liberibacter asiaticus’, ‘Ca. L. americanus’, and ‘Ca. L. africanus’. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative ‘Ca. L. asiaticus’ Japanese isolate Ishi-1 was determined by metagenomic analysis of DNA extracted from ‘Ca. L. asiaticus’-infected psyllids and leaf midribs. The 1.19-Mb genome has an average 36.32% GC content. Annotation revealed 13 operons encoding rRNA and 44 tRNA genes, but no typical bacterial pathogenesis-related genes were located within the genome, similar to the Floridian psy62 and Chinese gxpsy. In contrast to other ‘Ca. L. asiaticus’ strains, the genome of the Japanese Ishi-1 strain lacks a prophage-related region. PMID:25180586
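
    The GC content figure quoted in this abstract is a simple base-composition statistic. As a minimal sketch, assuming the assembled genome is available as a plain string of bases, it could be computed as follows (the function name is hypothetical):

```python
def gc_content(sequence):
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence with 4 G/C bases out of 10.
print(gc_content("ATGCATGCAT"))
```

    Ambiguous bases such as N are excluded from the denominator here; other tools may count them, which changes the result slightly.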

  11. Preferential retention of circadian clock genes during diploidization following whole genome triplication in Brassica rapa.

    PubMed

    Lou, Ping; Wu, Jian; Cheng, Feng; Cressman, Laura G; Wang, Xiaowu; McClung, C Robertson

    2012-06-01

    Much has been learned about the architecture and function of the circadian clock of Arabidopsis thaliana, a model for plant circadian rhythms. Circadian rhythms contribute to evolutionary fitness, suggesting that circadian rhythmicity may also contribute to agricultural productivity. Therefore, we extend our study of the plant circadian clock to Brassica rapa, an agricultural crop. Since its separation from Arabidopsis, B. rapa has undergone whole genome triplication and subsequent diploidization that has involved considerable gene loss. We find that circadian clock genes are preferentially retained relative to comparison groups of their neighboring genes, a set of randomly chosen genes, and a set of housekeeping genes broadly conserved in eukaryotes. The preferential retention of clock genes is consistent with the gene dosage hypothesis, which predicts preferential retention of highly networked or dose-sensitive genes. Two gene families encoding transcription factors that play important roles in the plant core oscillator--the PSEUDO-RESPONSE REGULATORS, including TIMING OF CAB EXPRESSION1, and the REVEILLE family, including CIRCADIAN CLOCK ASSOCIATED1 and LATE ELONGATED HYPOCOTYL--exhibit preferential retention consistent with the gene dosage hypothesis, but a third gene family, including ZEITLUPE, that encodes F-Box proteins that regulate posttranslational protein stability offers an exception.

  12. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association

    PubMed Central

    Strachan, Norval J. C.; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J.; Hanson, Mary F.; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H. A. M.; French, Nigel P.; George, Tessy; Biggs, Patrick J.; Forbes, Ken J.

    2015-01-01

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei’s genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei’s genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world. PMID:26442781
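
    The abstract does not say which of Nei's distance measures was used. As an illustration only, the sketch below assumes Nei's standard genetic distance, D = -ln(I), where I is the normalized identity of allele frequencies between two populations summed over loci. The input layout (one allele-to-frequency mapping per locus) and the function name are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between populations X and Y.

    freqs_x and freqs_y are lists with one dict per locus, mapping each
    allele to its frequency in that population.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same non-empty set of loci")
    j_x = j_y = j_xy = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        alleles = set(locus_x) | set(locus_y)
        j_x += sum(locus_x.get(a, 0.0) ** 2 for a in alleles)
        j_y += sum(locus_y.get(a, 0.0) ** 2 for a in alleles)
        j_xy += sum(locus_x.get(a, 0.0) * locus_y.get(a, 0.0) for a in alleles)
    if j_xy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Toy example: one SNP locus, fixed for A in X and at 50/50 A/G in Y.
pop_x = [{"A": 1.0}]
pop_y = [{"A": 0.5, "G": 0.5}]
print(nei_standard_distance(pop_x, pop_y))
```

    Identical populations give a distance of zero, and the value grows as allele frequencies diverge.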

  13. Use of whole genome shotgun metagenomics: a practical guide for the microbiome-minded physician scientist.

    PubMed

    Ma, Jun; Prince, Amanda; Aagaard, Kjersti M

    2014-01-01

    Whole genome shotgun sequencing (WGS) has been increasingly recognized as the most comprehensive and robust approach for metagenomics research. When compared with 16S-based metagenomics, it offers the advantage of identification of species level taxonomy and the estimation of metabolic pathway activities from human and environmental samples. Several large-scale metagenomic projects have been recently conducted or are currently underway utilizing WGS. With the generation of vast amounts of data, the bioinformatics and computational analysis of WGS results become vital for the success of a metagenomics study. However, each step in the WGS data analysis, including metagenome assembly, gene prediction, taxonomy identification, function annotation, and pathway analysis, is complicated by the sheer amount of data. Algorithms and tools have been developed specifically to handle WGS-generated metagenomics data with the hope of reducing the requirement on computational time and storage space. Here, we present an overview of the current state of metagenomics through WGS sequencing, challenges frequently encountered, and up-to-date solutions. Several applications that are uniquely applicable to microbiome studies in reproductive and perinatal medicine are also discussed. PMID:24390915

  14. Bonus Organisms in High-Throughput Eukaryotic Whole-Genome Shotgun Assembly

    SciTech Connect

    Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank; Platt, Darren

    2006-02-06

    The DOE Joint Genome Institute has sequenced over 50 eukaryotic genomes, ranging in size from 15 Mb to 1.6 Gb, over a wide range of organism types. In the course of doing so, it has become clear that a substantial fraction of these data sets contains bonus organisms, usually prokaryotes, in addition to the desired genome. While some of these additional organisms are extraneous contamination, they are sometimes symbionts, and so can be of biological interest. Therefore, it is desirable to assemble the bonus organisms along with the main genome. This transforms the problem into one of metagenomic assembly, which is considerably more challenging than traditional whole-genome shotgun (WGS) assembly. The different organisms will usually be present at different sequence depths, which is difficult to handle in most WGS assemblers. In addition, with multiple distinct genomes present, chimerism can produce cross-organism combinations. Finally, there is no guarantee that only a single bonus organism will be present. For example, one JGI project contained at least two different prokaryotic contaminants, plus a 145 kb plasmid of unknown origin. We have developed techniques to routinely identify and handle such bonus organisms in a high-throughput sequencing environment. Approaches include screening and partitioning the unassembled data, and iterative subassemblies. These methods are applicable not only to bonus organisms, but also to desired components such as organelles. These procedures have the additional benefit of identifying, and allowing for the removal of, cloning artifacts such as E. coli and spurious vector inclusions.

  15. Exome and whole genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity

    PubMed Central

    Dulak, Austin M.; Stojanov, Petar; Peng, Shouyong; Lawrence, Michael S.; Fox, Cameron; Stewart, Chip; Bandla, Santhoshi; Imamura, Yu; Schumacher, Steven E.; Shefler, Erica; McKenna, Aaron; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Saksena, Gordon; Voet, Douglas; Ramos, Alex H.; Auclair, Daniel; Thompson, Kristin; Sougnez, Carrie; Onofrio, Robert C.; Guiducci, Candace; Beroukhim, Rameen; Zhou, David; Lin, Lin; Lin, Jules; Reddy, Rishindra; Chang, Andrew; Luketich, James D.; Pennathur, Arjun; Ogino, Shuji; Golub, Todd R.; Gabriel, Stacey B.; Lander, Eric S.; Beer, David G.; Godfrey, Tony E.; Getz, Gad; Bass, Adam J.

    2013-01-01

    The incidence of esophageal adenocarcinoma (EAC) has risen 600% over the last 30 years. With a five-year survival rate of 15%, identification of new therapeutic targets for EAC is greatly important. We analyze the mutation spectra from whole exome sequencing of 149 EAC tumor/normal pairs, 15 of which have also been subjected to whole genome sequencing. We identify a mutational signature defined by a high prevalence of A to C transversions at AA dinucleotides. Statistical analysis of exome data identified 26 significantly mutated genes. Of these genes, four (TP53, CDKN2A, SMAD4, and PIK3CA) have been previously implicated in EAC. The novel significantly mutated genes include chromatin modifying factors and candidate contributors: SPG20, TLR4, ELMO1, and DOCK2. Functional analyses of EAC-derived mutations in ELMO1 reveal increased cellular invasion. Therefore, we suggest a new hypothesis about the potential activation of the RAC1 pathway as a contributor to EAC tumorigenesis. PMID:23525077

  16. Allele-specific copy-number discovery from whole-genome and whole-exome sequencing.

    PubMed

    Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J; Szatkiewicz, Jin P

    2015-08-18

    Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/. PMID:25883151

  17. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    PubMed Central

    Lee, Lap-Kei; Cheung, Jeanno; Liu, Chi-Man

    2014-01-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of the GPU and intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA’s speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the development of applications for downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa. PMID:24949238
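
    The fold-coverage figure quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not part of BALSA. It assumes that "750 million 100 bp paired-end reads" means 750 million read pairs (two 100 bp reads each) and that the human genome is about 3.1 Gb; neither assumption is stated in the record, and the function name is hypothetical.

```python
def fold_coverage(n_pairs, read_len_bp, genome_size_bp, reads_per_pair=2):
    """Average sequencing depth = total sequenced bases / genome size."""
    total_bases = n_pairs * reads_per_pair * read_len_bp
    return total_bases / genome_size_bp


# Assumed inputs (not given explicitly in the abstract):
N_PAIRS = 750_000_000          # read pairs, assuming "reads" counts pairs
READ_LEN = 100                 # bp per read, from the abstract
GENOME_SIZE = 3_100_000_000    # approximate human genome size in bp (assumption)

wgs_coverage = fold_coverage(N_PAIRS, READ_LEN, GENOME_SIZE)

# Throughput implied by the reported 5.5 h run time, in bases per second.
total_bases = N_PAIRS * 2 * READ_LEN
bases_per_second = total_bases / (5.5 * 3600)

print(f"approximate coverage: {wgs_coverage:.1f}x")
print(f"implied throughput: {bases_per_second:,.0f} bases/s")
```

    Under these assumptions the depth comes out slightly below 50-fold, which is consistent with the rounded figure in the abstract. If the 750 million figure counted individual reads rather than pairs, the same formula with reads_per_pair=1 would give about half that depth.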

  18. Genome management and mismanagement—cell-level opportunities and challenges of whole-genome duplication

    PubMed Central

    Yant, Levi; Bomblies, Kirsten

    2015-01-01

    Whole-genome duplication (WGD) doubles the DNA content in the nucleus and leads to polyploidy. In whole-organism polyploids, WGD has been implicated in adaptability and the evolution of increased genome complexity, but polyploidy can also arise in somatic cells of otherwise diploid plants and animals, where it plays important roles in development and likely environmental responses. As with whole organisms, WGD can also promote adaptability and diversity in proliferating cell lineages, although whether WGD is beneficial is clearly context-dependent. WGD is also sometimes associated with aging and disease and may be a facilitator of dangerous genetic and karyotypic diversity in tumorigenesis. Scaling changes can affect cell physiology, but problems associated with WGD in large part seem to arise from problems with chromosome segregation in polyploid cells. Here we discuss both the adaptive potential and problems associated with WGD, focusing primarily on cellular effects. We see value in recognizing polyploidy as a key player in generating diversity in development and cell lineage evolution, with intriguing parallels across kingdoms. PMID:26637526

  19. Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis

    PubMed Central

    Xu, Yinghua; Liu, Bin; Gröndahl-Yli-Hannuksila, Kirsi; Tan, Yajun; Feng, Lu; Kallonen, Teemu; Wang, Lichan; Peng, Ding; He, Qiushui; Wang, Lei; Zhang, Shumin

    2015-01-01

    Herd immunity can potentially induce a change of circulating viruses. However, it remains largely unknown how bacterial pathogens adapt to vaccination. In this study, Bordetella pertussis, the causative agent of whooping cough, was selected as an example to explore the possible effect of vaccination on a bacterial pathogen. We sequenced and analysed the complete genomes of 40 B. pertussis strains from Finland and China, as well as 11 previously sequenced strains from the Netherlands, where different vaccination strategies have been used over the past 50 years. The results showed that the molecular clock moved at different rates in these countries and in distinct periods, which suggested that evolution of the B. pertussis population was closely associated with each country's vaccination coverage. Comparative whole-genome analyses indicated that evolution in this human-restricted pathogen was mainly characterised by ongoing genetic shift and gene loss. Furthermore, 116 SNPs were specifically detected in currently circulating ptxP3-containing strains. The finding might explain the successful emergence of this lineage and its spread worldwide. Collectively, our results suggest that the immune pressure of vaccination is one major driving force for the evolution of B. pertussis, which facilitates further exploration of the pathogenicity of B. pertussis. PMID:26283022

  20. Inference of gorilla demographic and selective history from whole-genome sequence data.

    PubMed

    McManus, Kimberly F; Kelley, Joanna L; Song, Shiya; Veeramah, Krishna R; Woerner, August E; Stevison, Laurie S; Ryder, Oliver A; Great Ape Genome Project; Kidd, Jeffrey M; Wall, Jeffrey D; Bustamante, Carlos D; Hammer, Michael F

    2015-03-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection.

  1. Whole-Genome Sequencing and Intraspecific Analysis of the Yeast Species Lachancea quebecensis

    PubMed Central

    Freel, Kelle C.; Friedrich, Anne; Sarilar, Véronique; Devillers, Hugo; Neuvéglise, Cécile; Schacherer, Joseph

    2016-01-01

    The gold standard in yeast population genomics has been the model organism Saccharomyces cerevisiae. However, the exploration of yeast species outside the Saccharomyces genus is essential to broaden the understanding of genome evolution. Here, we report the analyses of whole-genome sequences of nine isolates from the recently described yeast species Lachancea quebecensis. The genome of one isolate was assembled and annotated, and the intraspecific variability within L. quebecensis was surveyed by comparing the sequences from the eight other isolates to this reference sequence. Our study revealed that these strains harbor genomes with an average nucleotide diversity of π = 2 × 10⁻³, which is slightly lower than, although on the same order of magnitude as, that previously determined for S. cerevisiae (π = 4 × 10⁻³). Our results show that even though these isolates were all obtained from a relatively isolated geographic location and the same ecological source, and represent a smaller sample size than is available for S. cerevisiae, the levels of divergence are similar to those observed in this model species. This divergence is essentially linked to the presence of two distinct clusters delineated according to geographic location. However, even with relatively similar ranges of genome divergence, L. quebecensis has an extremely low global phenotypic variance of 0.062 compared with 0.59 previously determined in S. cerevisiae. PMID:26733577
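
    For readers unfamiliar with the π statistic reported in this abstract, nucleotide diversity is commonly defined as the average number of pairwise differences per site among a set of aligned sequences. The sketch below illustrates that textbook definition on a toy alignment. It is not the authors' pipeline: the sequences are invented, the function name is hypothetical, and real analyses must also handle gaps, missing calls and unequal sampling.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and of equal length")

    total_differences = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count the sites at which this pair of sequences differs.
        total_differences += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return total_differences / (n_pairs * length)


# Toy alignment (invented for illustration): one variable site in 10 bp.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

    On this scale, the reported values of 2 × 10⁻³ and 4 × 10⁻³ correspond to roughly two and four differences per kilobase between a random pair of genomes.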

  2. Whole-genome analyses resolve early branches in the tree of life of modern birds.

    PubMed

    Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y W; Faircloth, Brant C; Nabholz, Benoit; Howard, Jason T; Suh, Alexander; Weber, Claudia C; da Fonseca, Rute R; Li, Jianwen; Zhang, Fang; Li, Hui; Zhou, Long; Narula, Nitish; Liu, Liang; Ganapathy, Ganesh; Boussau, Bastien; Bayzid, Md Shamsuzzoha; Zavidovych, Volodymyr; Subramanian, Sankar; Gabaldón, Toni; Capella-Gutiérrez, Salvador; Huerta-Cepas, Jaime; Rekepalli, Bhanu; Munch, Kasper; Schierup, Mikkel; Lindow, Bent; Warren, Wesley C; Ray, David; Green, Richard E; Bruford, Michael W; Zhan, Xiangjiang; Dixon, Andrew; Li, Shengbin; Li, Ning; Huang, Yinhua; Derryberry, Elizabeth P; Bertelsen, Mads Frost; Sheldon, Frederick H; Brumfield, Robb T; Mello, Claudio V; Lovell, Peter V; Wirthlin, Morgan; Schneider, Maria Paula Cruz; Prosdocimi, Francisco; Samaniego, José Alfredo; Vargas Velazquez, Amhed Missael; Alfaro-Núñez, Alonzo; Campos, Paula F; Petersen, Bent; Sicheritz-Ponten, Thomas; Pas, An; Bailey, Tom; Scofield, Paul; Bunce, Michael; Lambert, David M; Zhou, Qi; Perelman, Polina; Driskell, Amy C; Shapiro, Beth; Xiong, Zijun; Zeng, Yongli; Liu, Shiping; Li, Zhenyu; Liu, Binghang; Wu, Kui; Xiao, Jin; Yinqi, Xiong; Zheng, Qiuemei; Zhang, Yong; Yang, Huanming; Wang, Jian; Smeds, Linnea; Rheindt, Frank E; Braun, Michael; Fjeldsa, Jon; Orlando, Ludovic; Barker, F Keith; Jønsson, Knud Andreas; Johnson, Warren; Koepfli, Klaus-Peter; O'Brien, Stephen; Haussler, David; Ryder, Oliver A; Rahbek, Carsten; Willerslev, Eske; Graves, Gary R; Glenn, Travis C; McCormack, John; Burt, Dave; Ellegren, Hans; Alström, Per; Edwards, Scott V; Stamatakis, Alexandros; Mindell, David P; Cracraft, Joel; Braun, Edward L; Warnow, Tandy; Jun, Wang; Gilbert, M Thomas P; Zhang, Guojie

    2014-12-12

    To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago. PMID:25504713

  3. Whole genome and transcriptome sequencing of matched primary and peritoneal metastatic gastric carcinoma

    PubMed Central

    Zhang, J.; Huang, J. Y.; Chen, Y. N.; Yuan, F.; Zhang, H.; Yan, F. H.; Wang, M. J.; Wang, G.; Su, M.; Lu, G.; Huang, Y.; Dai, H.; Ji, J.; Zhang, J.; Zhang, J. N.; Jiang, Y. N.; Chen, S. J.; Zhu, Z. G.; Yu, Y. Y.

    2015-01-01

    Gastric cancer is one of the most aggressive cancers and is the second leading cause of cancer death worldwide. Approximately 40% of global gastric cancer cases occur in China, with peritoneal metastasis being the prevalent form of recurrence and metastasis in advanced disease. Currently, there are limited clinical approaches for predicting and treating peritoneal metastasis, resulting in a 6-month average survival time. Comprehensive genome analysis may uncover the pathogenesis of peritoneal metastasis. Here we describe a comprehensive whole-genome and transcriptome sequencing analysis of one advanced gastric cancer case, including non-cancerous mucosa, primary cancer and matched peritoneal metastatic cancer. Peripheral blood was used as the normal control. We identified 27 mutated genes, of which 19 genes are reported in the COSMIC database (ZNF208, CRNN, ATXN3, DCTN1, RP1L1, PRB4, PRB1, MUC4, HS6ST3, MUC17, JAM2, ITGAD, IREB2, IQUB, CORO1B, CCDC121, AKAP2, ACAN and ACADL), and eight genes have not previously been described in gastric cancer (CCDC178, ARMC4, TUBB6, PLIN4, PKLR, PDZD2, DMBT1 and DAB1). Additionally, a fusion of GPX4 and MPND in the 19q13.3-13.4 region is characterized as a novel fusion gene. This study disclosed novel biological markers and tumorigenic pathways that may predict peritoneal metastasis of gastric cancer. PMID:26330360

  4. ENCODE whole-genome data in the UCSC Genome Browser: update 2012.

    PubMed

    Rosenbloom, Kate R; Dreszer, Timothy R; Long, Jeffrey C; Malladi, Venkat S; Sloan, Cricket A; Raney, Brian J; Cline, Melissa S; Karolchik, Donna; Barber, Galt P; Clawson, Hiram; Diekhans, Mark; Fujita, Pauline A; Goldman, Mary; Gravell, Robert C; Harte, Rachel A; Hinrichs, Angie S; Kirkup, Vanessa M; Kuhn, Robert M; Learned, Katrina; Maddren, Morgan; Meyer, Laurence R; Pohl, Andy; Rhead, Brooke; Wong, Matthew C; Zweig, Ann S; Haussler, David; Kent, W James

    2012-01-01

    The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets.

  5. Whole-Genome Resequencing Reveals Extensive Natural Variation in the Model Green Alga Chlamydomonas reinhardtii.

    PubMed

    Flowers, Jonathan M; Hazzouri, Khaled M; Pham, Gina M; Rosas, Ulises; Bahmani, Tayebeh; Khraiwesh, Basel; Nelson, David R; Jijakli, Kenan; Abdrabu, Rasha; Harris, Elizabeth H; Lefebvre, Paul A; Hom, Erik F Y; Salehi-Ashtiani, Kourosh; Purugganan, Michael D

    2015-09-01

    We performed whole-genome resequencing of 12 field isolates and eight commonly studied laboratory strains of the model organism Chlamydomonas reinhardtii to characterize genomic diversity and provide a resource for studies of natural variation. Our data support previous observations that Chlamydomonas is among the most diverse eukaryotic species. Nucleotide diversity is ∼3% and is geographically structured in North America with some evidence of admixture among sampling locales. Examination of predicted loss-of-function mutations in field isolates indicates conservation of genes associated with core cellular functions, while genes in large gene families and poorly characterized genes show a greater incidence of major effect mutations. De novo assembly of unmapped reads recovered genes in the field isolates that are absent from the CC-503 assembly. The laboratory reference strains show a genomic pattern of polymorphism consistent with their origin as the recombinant progeny of a diploid zygospore. Large duplications or amplifications are a prominent feature of laboratory strains and appear to have originated under laboratory culture. Extensive natural variation offers a new source of genetic diversity for studies of Chlamydomonas, including naturally occurring alleles that may prove useful in studies of gene function and the dissection of quantitative genetic traits.

  6. Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas.

    PubMed

    Zhang, Jinghui; Wu, Gang; Miller, Claudia P; Tatevossian, Ruth G; Dalton, James D; Tang, Bo; Orisme, Wilda; Punchihewa, Chandanamali; Parker, Matthew; Qaddoumi, Ibrahim; Boop, Fredrick A; Lu, Charles; Kandoth, Cyriac; Ding, Li; Lee, Ryan; Huether, Robert; Chen, Xiang; Hedlund, Erin; Nagahawatte, Panduka; Rusch, Michael; Boggs, Kristy; Cheng, Jinjun; Becksfort, Jared; Ma, Jing; Song, Guangchun; Li, Yongjin; Wei, Lei; Wang, Jianmin; Shurtleff, Sheila; Easton, John; Zhao, David; Fulton, Robert S; Fulton, Lucinda L; Dooling, David J; Vadodaria, Bhavin; Mulder, Heather L; Tang, Chunlao; Ochoa, Kerri; Mullighan, Charles G; Gajjar, Amar; Kriwacki, Richard; Sheer, Denise; Gilbertson, Richard J; Mardis, Elaine R; Wilson, Richard K; Downing, James R; Baker, Suzanne J; Ellison, David W

    2013-06-01

    The most common pediatric brain tumors are low-grade gliomas (LGGs). We used whole-genome sequencing to identify multiple new genetic alterations involving BRAF, RAF1, FGFR1, MYB, MYBL1 and genes with histone-related functions, including H3F3A and ATRX, in 39 LGGs and low-grade glioneuronal tumors (LGGNTs). Only a single non-silent somatic alteration was detected in 24 of 39 (62%) tumors. Intragenic duplications of the portion of FGFR1 encoding the tyrosine kinase domain (TKD) and rearrangements of MYB were recurrent and mutually exclusive in 53% of grade II diffuse LGGs. Transplantation of Trp53-null neonatal astrocytes expressing FGFR1 with the duplication involving the TKD into the brains of nude mice generated high-grade astrocytomas with short latency and 100% penetrance. FGFR1 with the duplication induced FGFR1 autophosphorylation and upregulation of the MAPK/ERK and PI3K pathways, which could be blocked by specific inhibitors. Focusing on the therapeutically challenging diffuse LGGs, our study of 151 tumors has discovered genetic alterations and potential therapeutic targets across the entire range of pediatric LGGs and LGGNTs.

  7. Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection.

    PubMed

    Kuroda, Makoto; Yamashita, Atsushi; Hirakawa, Hideki; Kumano, Miyuki; Morikawa, Kazuya; Higashide, Masato; Maruyama, Atsushi; Inose, Yumiko; Matoba, Kimio; Toh, Hidehiro; Kuhara, Satoru; Hattori, Masahira; Ohta, Toshiko

    2005-09-13

    Staphylococcus saprophyticus is a uropathogenic Staphylococcus frequently isolated from young female outpatients presenting with uncomplicated urinary tract infections. We sequenced the whole genome of S. saprophyticus type strain ATCC 15305, which harbors a circular chromosome of 2,516,575 bp with 2,446 ORFs and two plasmids. Comparative genomic analyses with the strains of two other species, Staphylococcus aureus and Staphylococcus epidermidis, as well as experimental data, revealed the following characteristics of the S. saprophyticus genome. S. saprophyticus does not possess any virulence factors found in S. aureus, such as coagulase, enterotoxins, exoenzymes, and extracellular matrix-binding proteins, although it does have a remarkable paralog expansion of transport systems related to highly variable ion contents in the urinary environment. A further unique feature is that only a single ORF is predictable as a cell wall-anchored protein, and it shows positive hemagglutination and adherence to human bladder cells, which is associated with initial colonization in the urinary tract. S. saprophyticus also shows significantly high urease activity. The uropathogenicity of S. saprophyticus can be attributed to genomic features needed for its survival in the human urinary tract: a novel cell wall-anchored adhesin and redundant uro-adaptive transport systems, together with urease.

  8. Isolation and whole genome sequencing of a Ruminococcus-like bacterium, associated with irritable bowel syndrome.

    PubMed

    Hynönen, Ulla; Rasinkangas, Pia; Satokari, Reetta; Paulin, Lars; de Vos, Willem M; Pietilä, Taija E; Kant, Ravi; Palva, Airi

    2016-06-01

    In our previous studies on the intestinal microbiota in irritable bowel syndrome (IBS), we identified a bacterial phylotype with higher abundance in patients suffering from diarrhea than in healthy controls. In the present work, we have isolated in pure culture strain RT94, belonging to this phylotype, determined its whole genome sequence and performed an extensive genomic analysis and phenotypical testing. This revealed strain RT94 to be a strict anaerobe apparently belonging to a novel species with only 94% similarity in the 16S rRNA gene sequence to the closest relatives Ruminococcus torques and Ruminococcus lactaris. The G + C content of strain RT94 is 45.2 mol% and the major long-chain cellular fatty acids are C16:0, C18:0 and C14:0. The isolate is metabolically versatile but not a mucus or cellulose utilizer. It produces acetate, ethanol, succinate, lactate and formate, but very little butyrate, as end products of glucose metabolism. The mechanisms underlying the association of strain RT94 with diarrhea-type IBS are discussed. PMID:26946362

  9. A Proposed Clinical Decision Support Architecture Capable of Supporting Whole Genome Sequence Information

    PubMed Central

    Welch, Brandon M.; Rodriguez Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine. PMID:25411644

  10. Nested radiations and the pulse of angiosperm diversification: increased diversification rates often follow whole genome duplications.

    PubMed

    Tank, David C; Eastman, Jonathan M; Pennell, Matthew W; Soltis, Pamela S; Soltis, Douglas E; Hinchliff, Cody E; Brown, Joseph W; Sessa, Emily B; Harmon, Luke J

    2015-07-01

    Our growing understanding of the plant tree of life provides a novel opportunity to uncover the major drivers of angiosperm diversity. Using a time-calibrated phylogeny, we characterized hot and cold spots of lineage diversification across the angiosperm tree of life by modeling evolutionary diversification using stepwise AIC (MEDUSA). We also tested the whole-genome duplication (WGD) radiation lag-time model, which postulates that increases in diversification tend to lag behind established WGD events. Diversification rates have been incredibly heterogeneous throughout the evolutionary history of angiosperms and reveal a pattern of 'nested radiations' - increases in net diversification nested within other radiations. This pattern in turn generates a negative relationship between clade age and diversity across both families and orders. We suggest that stochastically changing diversification rates across the phylogeny explain these patterns. Finally, we demonstrate significant statistical support for the WGD radiation lag-time model. Across angiosperms, nested shifts in diversification led to an overall increasing rate of net diversification and declining relative extinction rates through time. These diversification shifts are only rarely perfectly associated with WGD events, but commonly follow them after a lag period.
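
    The "stepwise AIC" idea named in this abstract can be illustrated in a few lines. The Akaike information criterion is AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; a stepwise procedure keeps adding complexity (here, rate shifts) only while AIC improves. The sketch below is a generic illustration and not the MEDUSA implementation: the log-likelihoods and parameter counts are invented, and the function names and the min_improvement threshold are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def stepwise_select(models, min_improvement=0.0):
    """Pick a model by forward stepwise AIC.

    models: list of (n_params, log_likelihood) tuples ordered by increasing
    complexity. Stops at the first step that fails to improve AIC by more
    than min_improvement. Returns (index of chosen model, its AIC).
    """
    best_index = 0
    best_aic = aic(models[0][1], models[0][0])
    for i in range(1, len(models)):
        n_params, log_likelihood = models[i]
        candidate_aic = aic(log_likelihood, n_params)
        if best_aic - candidate_aic > min_improvement:
            best_index, best_aic = i, candidate_aic
        else:
            break  # extra parameters no longer pay for themselves
    return best_index, best_aic


# Invented values: a base model, then models with one and two rate shifts.
candidate_models = [(2, -120.0), (5, -110.0), (8, -109.0)]
chosen_index, chosen_aic = stepwise_select(candidate_models)
print(chosen_index, chosen_aic)
```

    With these made-up numbers the first rate shift lowers AIC from 244 to 230 and is accepted, while the second raises it to 234 and is rejected, so the procedure stops at the one-shift model.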

  11. Whole-Genome Scans Provide Evidence of Adaptive Evolution in Malawian Plasmodium falciparum Isolates

    PubMed Central

    Ocholla, Harold; Preston, Mark D.; Mipando, Mwapatsa; Jensen, Anja T. R.; Campino, Susana; MacInnis, Bronwyn; Alcock, Daniel; Terlouw, Anja; Zongo, Issaka; Oudraogo, Jean-Bosco; Djimde, Abdoulaye A.; Assefa, Samuel; Doumbo, Ogobara K.; Borrmann, Steffen; Nzila, Alexis; Marsh, Kevin; Fairhurst, Rick M.; Nosten, Francois; Anderson, Tim J. C.; Kwiatkowski, Dominic P.; Craig, Alister; Clark, Taane G.; Montgomery, Jacqui

    2014-01-01

    Background. Selection by host immunity and antimalarial drugs has driven extensive adaptive evolution in Plasmodium falciparum and continues to produce ever-changing landscapes of genetic variation. Methods. We performed whole-genome sequencing of 69 P. falciparum isolates from Malawi and used population genetics approaches to investigate genetic diversity and population structure and identify loci under selection. Results. High genetic diversity (π = 2.4 × 10⁻⁴), moderately high multiplicity of infection (2.7), and low linkage disequilibrium (500-bp) were observed in Chikhwawa District, Malawi, an area of high malaria transmission. Allele frequency–based tests provided evidence of recent population growth in Malawi and detected potential targets of host immunity and candidate vaccine antigens. Comparison of the sequence variation between isolates from Malawi and those from 5 geographically dispersed countries (Kenya, Burkina Faso, Mali, Cambodia, and Thailand) detected population genetic differences between Africa and Asia, within Southeast Asia, and within Africa. Haplotype-based tests of selection applied to sequence data from all 6 populations identified signals of directional selection at known drug-resistance loci, including pfcrt, pfdhps, pfmdr1, and pfgch1. Conclusions. The sequence variations observed at drug-resistance loci reflect differences in each country's historical use of antimalarial drugs and may be useful in formulating local malaria treatment guidelines. PMID:24948693

  12. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs.

    PubMed

    Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L

    2016-02-01

    Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa. PMID:26419336

  13. Whole-genome copy number variation analysis in anophthalmia and microphthalmia.

    PubMed

    Schilter, K F; Reis, L M; Schneider, A; Bardakjian, T M; Abdul-Rahman, O; Kozel, B A; Zimmerman, H H; Broeckel, U; Semina, E V

    2013-11-01

    Anophthalmia/microphthalmia (A/M) represent severe developmental ocular malformations. Currently, mutations in known genes explain less than 40% of A/M cases. We performed whole-genome copy number variation analysis in 60 patients affected with isolated or syndromic A/M. Pathogenic deletions of 3q26 (SOX2) were identified in four independent patients with syndromic microphthalmia. Other variants of interest included regions with a known role in human disease (likely pathogenic) as well as novel rearrangements (uncertain significance). A 2.2-Mb duplication of 3q29 in a patient with non-syndromic anophthalmia and an 877-kb duplication of 11p13 (PAX6) and a 1.4-Mb deletion of 17q11.2 (NF1) in two independent probands with syndromic microphthalmia and other ocular defects were identified; while ocular anomalies have been previously associated with 3q29 duplications, PAX6 duplications, and NF1 mutations in some cases, the ocular phenotypes observed here are more severe than previously reported. Three novel regions of possible interest included a 2q14.2 duplication which cosegregated with microphthalmia/microcornea and congenital cataracts in one family, and 2q21 and 15q26 duplications in two additional cases; each of these regions contains genes that are active during vertebrate ocular development. Overall, this study identified causative copy number mutations and regions with a possible role in ocular disease in 17% of A/M cases.

  14. Kuwaiti population subgroup of nomadic Bedouin ancestry-Whole genome sequence and analysis.

    PubMed

    John, Sumi Elsa; Thareja, Gaurav; Hebbar, Prashantha; Behbehani, Kazem; Thanaraj, Thangavel Alphonse; Alsmadi, Osama

    2015-03-01

    The Kuwaiti native population comprises three distinct genetic subgroups of Persian, "city-dwelling" Saudi Arabian tribe, and nomadic "tent-dwelling" Bedouin ancestry. The Bedouin subgroup is characterized by the presence of 17% African ancestry; it owes its origin to nomadic tribes of the deserts of the Arabian Peninsula and North Africa. By sequencing the whole genome of a Kuwaiti male from this subgroup at 41X coverage, we report 3,752,878 SNPs, 411,839 indels, and 8451 structural variations. A neighbor-joining tree, based on shared variant positions carrying disease-risk alleles between the Bedouin and other continental genomes, places the Bedouin genome at the nexus of African, Asian, and European genomes, in concordance with the geographical location of Kuwait and the Peninsula. In congruence with the participant's medical history of morbid obesity and bronchial asthma, risk alleles are seen at deleterious SNPs associated with obesity and asthma. Many of the observed deleterious 'novel' variants lie in genes associated with autosomal recessive disorders characteristic of the region. PMID:26484159

  15. Mechanisms of Linezolid Resistance among Coagulase-Negative Staphylococci Determined by Whole-Genome Sequencing

    PubMed Central

    Tewhey, Ryan; Gu, Bing; Kelesidis, Theodoros; Charlton, Carmen; Bobenchik, April; Hindler, Janet; Schork, Nicholas J.

    2014-01-01

    Linezolid resistance is uncommon among staphylococci, but approximately 2% of clinical isolates of coagulase-negative staphylococci (CoNS) may exhibit resistance to linezolid (MIC, ≥8 µg/ml). We performed whole-genome sequencing (WGS) to characterize the resistance mechanisms and genetic backgrounds of 28 linezolid-resistant CoNS (21 Staphylococcus epidermidis isolates and 7 Staphylococcus haemolyticus isolates) obtained from blood cultures at a large teaching health system in California between 2007 and 2012. The following well-characterized mutations associated with linezolid resistance were identified in the 23S rRNA: G2576U, G2447U, and U2504A, along with the mutation C2534U. Mutations in the L3 and L4 riboproteins, at sites previously associated with linezolid resistance, were also identified in 20 isolates. The majority of isolates harbored more than one mutation in the 23S rRNA and L3 and L4 genes. In addition, the cfr methylase gene was found in almost half (48%) of S. epidermidis isolates. cfr had been only rarely identified in staphylococci in the United States prior to this study. Isolates of the same sequence type were identified with unique mutations associated with linezolid resistance, suggesting independent acquisition of linezolid resistance in each isolate. PMID:24915435

  16. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia

    PubMed Central

    Winter, David J.; Pacheco, M. Andreína; Vallejo, Andres F.; Schwartz, Rachel S.; Arevalo-Herrera, Myriam; Herrera, Socrates

    2015-01-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy to score microsatellite loci that can be used in epidemiological investigations in South America. PMID:26709695

  17. Sensitive and specific KRAS somatic mutation analysis on whole-genome amplified DNA from archival tissues.

    PubMed

    van Eijk, Ronald; van Puijenbroek, Marjo; Chhatta, Amiet R; Gupta, Nisha; Vossen, Rolf H A M; Lips, Esther H; Cleton-Jansen, Anne-Marie; Morreau, Hans; van Wezel, Tom

    2010-01-01

    Kirsten RAS (KRAS) is a small GTPase that plays a key role in Ras/mitogen-activated protein kinase signaling; somatic mutations in KRAS are frequently found in many cancers. The most common KRAS mutations result in a constitutively active protein. Accurate detection of KRAS mutations is pivotal to the molecular diagnosis of cancer and may guide proper treatment selection. Here, we describe a two-step KRAS mutation screening protocol that combines whole-genome amplification (WGA), high-resolution melting analysis (HRM) as a prescreen method for mutation carrying samples, and direct Sanger sequencing of DNA from formalin-fixed, paraffin-embedded (FFPE) tissue, from which limited amounts of DNA are available. We developed target-specific primers, thereby avoiding amplification of homologous KRAS sequences. The addition of herring sperm DNA facilitated WGA in DNA samples isolated from as few as 100 cells. KRAS mutation screening using high-resolution melting analysis on wgaDNA from formalin-fixed, paraffin-embedded tissue is highly sensitive and specific; additionally, this method is feasible for screening of clinical specimens, as illustrated by our analysis of pancreatic cancers. Furthermore, PCR on wgaDNA does not introduce genotypic changes, as opposed to unamplified genomic DNA. This method can, after validation, be applied to virtually any potentially mutated region in the genome.

  18. Summarizing polygenic risks for complex diseases in a clinical whole genome report

    PubMed Central

    Kong, Sek Won; Lee, In-Hee; Leschiner, Ignaty; Krier, Joel; Kraft, Peter; Rehm, Heidi L.; Green, Robert C.; Kohane, Isaac S.; MacRae, Calum A.

    2015-01-01

    Purpose: Disease-causing mutations and pharmacogenomic variants are of primary interest for clinical whole-genome sequencing. However, estimating genetic liability for common complex diseases using established risk alleles might one day prove clinically useful. Methods: We compared polygenic scoring methods using a case-control data set with independently discovered risk alleles in the MedSeq Project. For eight traits of clinical relevance in both the primary-care and cardiomyopathy study cohorts, we estimated multiplicative polygenic risk scores using 161 published risk alleles and then normalized using the population median estimated from the 1000 Genomes Project. Results: Our polygenic score approach identified the overrepresentation of independently discovered risk alleles in cases as compared with controls using a large-scale genome-wide association study data set. In addition to normalized multiplicative polygenic risk scores and rank in a population, the disease prevalence and proportion of heritability explained by known common risk variants provide important context in the interpretation of modern multilocus disease risk models. Conclusion: Our approach in the MedSeq Project demonstrates how complex trait risk variants from an individual genome can be summarized and reported for the general clinician and also highlights the need for definitive clinical studies to obtain reference data for such estimates and to establish clinical utility. PMID:25341114
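
    The scoring step described above can be sketched as follows. This is only an illustration of a multiplicative polygenic risk score normalized by a reference-population median, not the MedSeq Project's actual pipeline: the function names, the per-allele odds ratios, and the locus identifiers are all hypothetical.

```python
import math
from statistics import median

def multiplicative_prs(genotype, odds_ratios):
    """Multiply per-allele odds ratios over loci (computed in log space).

    genotype:    dict locus -> risk-allele count (0, 1 or 2); missing loci count as 0
    odds_ratios: dict locus -> per-allele odds ratio (hypothetical values)
    """
    log_score = sum(
        genotype.get(locus, 0) * math.log(odds_ratio)
        for locus, odds_ratio in odds_ratios.items()
    )
    return math.exp(log_score)

def normalized_prs(genotype, odds_ratios, reference_genotypes):
    """Divide an individual's score by the median score of a reference panel."""
    reference_scores = [multiplicative_prs(g, odds_ratios) for g in reference_genotypes]
    return multiplicative_prs(genotype, odds_ratios) / median(reference_scores)

# Hypothetical loci and odds ratios, for illustration only.
odds_ratios = {"locus_A": 1.2, "locus_B": 1.5}
person = {"locus_A": 2, "locus_B": 1}
reference_panel = [
    {"locus_A": 0, "locus_B": 0},
    {"locus_A": 1, "locus_B": 0},
    {"locus_A": 2, "locus_B": 1},
]
raw_score = multiplicative_prs(person, odds_ratios)
relative_score = normalized_prs(person, odds_ratios, reference_panel)
```

    A normalized score above 1 means the individual's score exceeds the reference median; as the abstract notes, interpreting it still requires disease prevalence and the heritability explained by the included variants.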

  19. Wide-cross whole-genome radiation hybrid mapping of cotton (Gossypium hirsutum L.).

    PubMed Central

    Gao, Wenxiang; Chen, Z Jeffrey; Yu, John Z; Raska, Dwaine; Kohel, Russell J; Womack, James E; Stelly, David M

    2004-01-01

    We report the development and characterization of a "wide-cross whole-genome radiation hybrid" (WWRH) panel from cotton (Gossypium hirsutum L.). Chromosomes were segmented by gamma-irradiation of G. hirsutum (n = 26) pollen, and segmented chromosomes were rescued after in vivo fertilization of G. barbadense egg cells (n = 26). A 5-krad gamma-ray WWRH mapping panel (N = 93) was constructed and genotyped at 102 SSR loci. SSR marker retention frequencies were higher than those for animal systems and marker retention patterns were informative. Using the program RHMAP, 52 of 102 SSR markers were mapped into 16 syntenic groups. Linkage group 9 (LG 9) SSR markers BNL0625 and BNL2805 had been colocalized by linkage analysis, but their order was resolved by differential retention among WWRH plants. Two linkage groups, LG 13 and LG 9, were combined into one syntenic group, and the chromosome 1 linkage group marker BNL4053 was reassigned to chromosome 9. Analyses of cytogenetic stocks supported synteny of LG 9 and LG 13 and localized them to the short arm of chromosome 17. They also supported reassignment of marker BNL4053 to the long arm of chromosome 9. A WWRH map of the syntenic group composed of linkage groups 9 and 13 was constructed by maximum-likelihood analysis under the general retention model. The results demonstrate not only the feasibility of WWRH panel construction and mapping, but also complementarity to traditional linkage mapping and cytogenetic methods. PMID:15280245

  20. Genome management and mismanagement--cell-level opportunities and challenges of whole-genome duplication.

    PubMed

    Yant, Levi; Bomblies, Kirsten

    2015-12-01

    Whole-genome duplication (WGD) doubles the DNA content in the nucleus and leads to polyploidy. In whole-organism polyploids, WGD has been implicated in adaptability and the evolution of increased genome complexity, but polyploidy can also arise in somatic cells of otherwise diploid plants and animals, where it plays important roles in development and likely environmental responses. As with whole organisms, WGD can also promote adaptability and diversity in proliferating cell lineages, although whether WGD is beneficial is clearly context-dependent. WGD is also sometimes associated with aging and disease and may be a facilitator of dangerous genetic and karyotypic diversity in tumorigenesis. Scaling changes can affect cell physiology, but problems associated with WGD in large part seem to arise from problems with chromosome segregation in polyploid cells. Here we discuss both the adaptive potential and problems associated with WGD, focusing primarily on cellular effects. We see value in recognizing polyploidy as a key player in generating diversity in development and cell lineage evolution, with intriguing parallels across kingdoms.

  1. Whole Genome Amplification of Plasma-Circulating DNA Enables Expanded Screening for Allelic Imbalance in Plasma

    PubMed Central

    Li, Jin; Harris, Lyndsay; Mamon, Harvey; Kulke, Matthew H.; Liu, Wei-Hua; Zhu, Penny; Mike Makrigiorgos, G.

    2006-01-01

    Apoptotic and necrotic tumor cells release DNA into plasma, providing an accessible tumor biomarker. Tumor-released plasma-circulating DNA can be screened for tumor-specific genetic changes, including mutation, methylation, or allelic imbalance. However, technical problems relating to the quantity and quality of DNA collected from plasma hinder downstream genetic screening and reduce biomarker detection sensitivity. Here, we present a new methodology, blunt-end ligation-mediated whole genome amplification (BL-WGA), that efficiently amplifies small apoptotic fragments (<200 bp) as well as intermediate and large necrotic fragments (>5 kb) and enables reliable high-throughput analysis of plasma-circulating DNA. In a single-tube reaction, purified double-stranded DNA was blunted with T4 DNA polymerase, self-ligated or cross-ligated with T4 DNA ligase and amplified via random primer-initiated multiple displacement amplification. Using plasma DNA from breast cancer patients and normal controls, we demonstrate that BL-WGA amplified the plasma-circulating genome by ∼1000-fold. Of 25 informative polymorphic sites screened via polymerase chain reaction-denaturing high-performance liquid chromatography, 24 (95%) were correctly determined by BL-WGA to be allelic retention or imbalance compared to 44% by multiple displacement amplification. By enabling target magnification and application of high-throughput genome analysis, BL-WGA improves sensitivity for detection of circulating tumor-specific biomarkers from bodily fluids or for recovery of nucleic acids from suboptimally stored specimens. PMID:16436631

  2. Whole genome sequencing provides insights into the genetic determinants of invasiveness in Salmonella Dublin.

    PubMed

    Mohammed, M; Cormican, M

    2016-08-01

    Salmonella enterica subsp. enterica serovar Dublin (S. Dublin) is one of the non-typhoidal Salmonella (NTS); however, a relatively high proportion of human infections are associated with invasive disease. We applied whole genome sequencing to representative invasive and non-invasive clinical isolates of S. Dublin to determine the genomic variations among them and to investigate the underlying genetic determinants associated with invasiveness in S. Dublin. Although no particular genomic variation was found to differentiate invasive from non-invasive isolates, four virulence factors were detected within the genome of all isolates: two different type VI secretion systems (T6SS) encoded on two Salmonella pathogenicity islands (SPI), SPI-6 (T6SS SPI-6) and SPI-19 (T6SS SPI-19); an intact lambdoid prophage (Gifsy-2-like prophage) that contributes significantly to the virulence and pathogenesis of Salmonella serotypes; and a virulence plasmid. These four virulence factors may all contribute to the potential of S. Dublin to cause invasive disease in humans.

  3. Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis.

    PubMed

    Xu, Yinghua; Liu, Bin; Gröndahl-Yli-Hannuksila, Kirsi; Tan, Yajun; Feng, Lu; Kallonen, Teemu; Wang, Lichan; Peng, Ding; He, Qiushui; Wang, Lei; Zhang, Shumin

    2015-08-18

    Herd immunity can potentially induce a change of circulating viruses. However, it remains largely unknown how bacterial pathogens adapt to vaccination. In this study, Bordetella pertussis, the causative agent of whooping cough, was selected as an example to explore the possible effect of vaccination on the bacterial pathogen. We sequenced and analysed the complete genomes of 40 B. pertussis strains from Finland and China, as well as 11 previously sequenced strains from the Netherlands, where different vaccination strategies have been used over the past 50 years. The results showed that the molecular clock moved at different rates in these countries and in distinct periods, which suggested that evolution of the B. pertussis population was closely associated with the country vaccination coverage. Comparative whole-genome analyses indicated that evolution in this human-restricted pathogen was mainly characterised by ongoing genetic shift and gene loss. Furthermore, 116 SNPs were specifically detected in currently circulating ptxP3-containing strains. The finding might explain the successful emergence of this lineage and its spread worldwide. Collectively, our results suggest that the immune pressure of vaccination is one major driving force for the evolution of B. pertussis, which facilitates further exploration of the pathogenicity of B. pertussis.

  4. Prospective Whole-Genome Sequencing Enhances National Surveillance of Listeria monocytogenes

    PubMed Central

    Kwong, Jason C.; Mercoulia, Karolina; Tomita, Takehiro; Easton, Marion; Li, Hua Y.; Bulach, Dieter M.; Stinear, Timothy P.; Seemann, Torsten

    2015-01-01

    Whole-genome sequencing (WGS) has emerged as a powerful tool for comparing bacterial isolates in outbreak detection and investigation. Here we demonstrate that WGS performed prospectively for national epidemiologic surveillance of Listeria monocytogenes has the capacity to be superior to our current approaches using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable-number tandem-repeat analysis (MLVA), binary typing, and serotyping. Initially 423 L. monocytogenes isolates underwent WGS, and comparisons uncovered a diverse genetic population structure derived from three distinct lineages. MLST, binary typing, and serotyping results inferred in silico from the WGS data were highly concordant (>99%) with laboratory typing performed in parallel. However, WGS was able to identify distinct nested clusters within groups of isolates that were otherwise indistinguishable using our current typing methods. Routine WGS was then used for prospective epidemiologic surveillance on a further 97 L. monocytogenes isolates over a 12-month period, which provided a greater level of discrimination than that of conventional typing for inferring linkage to point source outbreaks. A risk-based alert system based on WGS similarity was used to inform epidemiologists required to act on the data. Our experience shows that WGS can be adopted for prospective L. monocytogenes surveillance and investigated for other pathogens relevant to public health. PMID:26607978

  5. Mycobacterial DNA extraction for whole-genome sequencing from early positive liquid (MGIT) cultures.

    PubMed

    Votintseva, Antonina A; Pankhurst, Louise J; Anson, Luke W; Morgan, Marcus R; Gascoyne-Binzi, Deborah; Walker, Timothy M; Quan, T Phuong; Wyllie, David H; Del Ojo Elias, Carlos; Wilcox, Mark; Walker, A Sarah; Peto, Tim E A; Crook, Derrick W

    2015-04-01

    We developed a low-cost and reliable method of DNA extraction from as little as 1 ml of early positive mycobacterial growth indicator tube (MGIT) cultures that is suitable for whole-genome sequencing to identify mycobacterial species and predict antibiotic resistance in clinical samples. The DNA extraction method is based on ethanol precipitation supplemented by pretreatment steps with a MolYsis kit or saline wash for the removal of human DNA and a final DNA cleanup step with solid-phase reversible immobilization beads. The protocol yielded ≥0.2 ng/μl of DNA for 90% (MolYsis kit) and 83% (saline wash) of positive MGIT cultures. A total of 144 (94%) of the 154 samples sequenced on the MiSeq platform (Illumina) achieved the target of 1 million reads, with <5% of reads derived from human or nasopharyngeal flora for 88% and 91% of samples, respectively. A total of 59 (98%) of 60 samples that were identified by the national mycobacterial reference laboratory (NMRL) as Mycobacterium tuberculosis were successfully mapped to the H37Rv reference, with >90% coverage achieved. The DNA extraction protocol, therefore, will facilitate fast and accurate identification of mycobacterial species and resistance using a range of bioinformatics tools.

  6. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    PubMed Central

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  7. New Perspectives on Microbial Community Distortion after Whole-Genome Amplification

    PubMed Central

    DeSantis, Todd Z.; Santo Domingo, Jorge W.; Ashbolt, Nicholas

    2015-01-01

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass; however, potential selective biases during the amplification processes are poorly understood. Here, we describe the effects of WGA on 31 different microbial communities from five biotopes that also included low-biomass samples from drinking water and groundwater. Our findings provide evidence that microbiome segregation by biotope was possible despite WGA treatment. Nevertheless, samples from different biotopes revealed different levels of distortion, with genomic GC content significantly correlated with WGA perturbation. Certain phylogenetic clades revealed a homogenous trend across various sample types, for instance Alpha- and Betaproteobacteria showed a decrease in their abundance after WGA treatment. On the other hand, Enterobacteriaceae, an important biomarker group for fecal contamination in groundwater and drinking water, were strongly affected by WGA treatment without a predictable pattern. These novel results describe the impact of WGA on low-biomass samples and may highlight issues to be aware of when designing future metagenomic studies that necessitate preceding WGA treatment. PMID:26010362

  8. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing.

    PubMed

    Staton, Margaret; Best, Teodora; Khodwekar, Sudhir; Owusu, Sandra; Xu, Tao; Xu, Yi; Jennings, Tara; Cronn, Richard; Arumuganathan, A Kathiravetpilla; Coggeshall, Mark; Gailing, Oliver; Liang, Haiying; Romero-Severson, Jeanne; Schlarbaum, Scott; Carlson, John E

    2015-01-01

    Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence.
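
    The in silico SSR identification mentioned above can be illustrated with a minimal sketch. It is not the authors' tool: it assumes perfect (uninterrupted) repeats only, and the function name, the default motif length of 2, and the minimum of 6 repeat units are hypothetical choices made for the example.

```python
import re

def find_ssrs(sequence, motif_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    motif_len and min_repeats are illustrative thresholds, not values
    taken from the study.
    """
    seq = sequence.upper()
    # One motif captured, then the same motif repeated at least (min_repeats - 1) more times.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

# Toy sequence: six AT units flanked by unrelated bases.
toy_hits = find_ssrs("GG" + "AT" * 6 + "CC")
```

    Primer design for flanking regions, as done in the study, would then operate on the sequence on either side of each reported start position.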

  9. Determinants of spontaneous mutation in the bacterium Escherichia coli as revealed by whole-genome sequencing.

    PubMed

    Foster, Patricia L; Lee, Heewook; Popodi, Ellen; Townes, Jesse P; Tang, Haixu

    2015-11-01

    A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1-2 × 10^-3 mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3' base can affect the mutability of a purine by oxidative damage by as much as eightfold.
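
    As a rough illustration of how a per-genome rate is derived from a mutation-accumulation experiment, the sketch below divides the total number of mutations by the total number of generations across all lines. The function name and the input numbers are hypothetical and are not taken from the study.

```python
def mutation_rate_per_genome(total_mutations, n_lines, generations_per_line):
    """Mutations per generation per genome, pooled over accumulation lines."""
    if n_lines <= 0 or generations_per_line <= 0:
        raise ValueError("n_lines and generations_per_line must be positive")
    return total_mutations / (n_lines * generations_per_line)

# Hypothetical experiment: 300 mutations found across 50 lines of 6000 generations each.
example_rate = mutation_rate_per_genome(300, 50, 6000)
```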

  10. Whole Genome Sequencing for the Retrospective Investigation of an Outbreak of Salmonella Typhimurium DT 8

    PubMed Central

    Ashton, Philip M; Peters, Tansy; Ameh, Linda; McAleer, Ralph; Petrie, Stewart; Nair, Satheesh; Muscat, Ivan; de Pinna, Elizabeth; Dallman, Tim

    2015-01-01

    Background: Salmonella enterica serovar Typhimurium DT8 is uncommon within the European Union. An increase in this phage type was reported in the summer of 2013 in the States of Jersey. Methods: A total of 21 human cases with this phage type were microbiologically confirmed. Salmonella isolates from mayonnaise made using raw eggs were also confirmed as being Salmonella Typhimurium DT8. The epidemiological investigations strongly supported a link between mayonnaise consumption and illness. Whole genome sequencing (WGS) was used to retrospectively investigate this outbreak with a view to assess the similarity between the suspect food and the human isolates and to characterise a known point source outbreak to assist in development of algorithms for outbreak detection. Results: Sequence data showed that the outbreak associated isolates, including the food isolates, formed a tightly clustered monophyletic group, with a maximum pairwise distance of 3 single nucleotide polymorphisms. Conclusions: WGS data is useful in confirming the causative agent of outbreaks where food and clinical isolates are available. This dataset, comprising a known outbreak, will be useful in the development of automatic algorithms for outbreak detection. PMID:25713745
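
    The "maximum pairwise distance" reported above can be computed from aligned core-genome sequences roughly as follows. This is a sketch under simplifying assumptions (equal-length aligned sequences; positions with ambiguous or missing bases ignored), and the isolate names and sequences are invented for illustration.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"
    )

def max_pairwise_distance(alignment):
    """Largest SNP distance between any two isolates in a dict name -> sequence."""
    return max(
        snp_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    )

# Invented toy alignment: one food isolate and three human isolates.
toy_alignment = {
    "food_1": "ACGTACGT",
    "case_1": "ACGTACGA",
    "case_2": "ACCTACGA",
    "case_3": "ACGTNCGT",
}
toy_max = max_pairwise_distance(toy_alignment)
```

    A small maximum distance across food and clinical isolates is the kind of signal the abstract describes as a tightly clustered monophyletic group; any threshold for calling an outbreak would need to be set per organism.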

  11. Whole Genome Sequencing Reveals a De Novo SHANK3 Mutation in Familial Autism Spectrum Disorder

    PubMed Central

    Nemirovsky, Sergio I.; Córdoba, Marta; Zaiat, Jonathan J.; Completa, Sabrina P.; Vega, Patricia A.; González-Morón, Dolores; Medina, Nancy M.; Fabbro, Mónica; Romero, Soledad; Brun, Bianca; Revale, Santiago; Ogara, María Florencia; Pecci, Adali; Marti, Marcelo; Vazquez, Martin; Turjanski, Adrián; Kauffman, Marcelo A.

    2015-01-01

    Introduction: Clinical genomics promises to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD). Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS) for the diagnostic approach to ASD. Methods: We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents. Results: Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6). Conclusions: We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder. PMID:25646853

  12. Whole genome analyses of marine fish pathogenic isolate, Mycobacterium sp. 012931.

    PubMed

    Kurokawa, Satoru; Kabayama, Jun; Hwang, Seong Don; Nho, Seong Won; Hikima, Jun-ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Mori, Tetsushi; Aoki, Takashi

    2014-10-01

    Mycobacterium is a genus within the order Actinomycetales that comprises a large number of well-characterized species, several of which include pathogens known to cause serious disease in humans and animals. Here, we report the whole genome sequence of Mycobacterium sp. strain 012931 isolated from the marine fish, yellowtail (Seriola quinqueradiata). Mycobacterium sp. 012931 is a fish pathogen causing serious damage to aquaculture farms in Japan. DNA dot plot analysis showed that Mycobacterium sp. 012931 was more closely related to Mycobacterium marinum when compared across several Mycobacterium species. However, little conservation of the gene order was observed between the Mycobacterium sp. 012931 and M. marinum genomes. The 5,464 annotated genes of Mycobacterium sp. 012931 were classified into 26 subsystems. The insertion/deletion gene analysis showed that Mycobacterium sp. 012931 had 643 unique genes that were not found in the M. marinum strains. In the virulence, disease, and defense subsystem, both insertion and deletion genes of Mycobacterium sp. 012931 were associated with the PPE gene cluster of Mycobacteria. Of seven plcB genes in Mycobacterium sp. 012931, plcB_2 and plcB_3 showed low identities with those of M. marinum strains. Therefore, Mycobacterium sp. 012931 differs from M. marinum genetically and in virulence, and may induce different interaction mechanisms between host and pathogen. PMID:24879010

  13. Rapid Whole-Genome Sequencing for Genetic Disease Diagnosis in Neonatal Intensive Care Units

    PubMed Central

    Saunders, Carol Jean; Miller, Neil Andrew; Soden, Sarah Elizabeth; Dinwiddie, Darrell Lee; Noll, Aaron; Alnadi, Noor Abu; Andraws, Nevene; Patterson, Melanie LeAnn; Krivohlavek, Lisa Ann; Fellis, Joel; Humphray, Sean; Saffrey, Peter; Kingsbury, Zoya; Weir, Jacqueline Claire; Betley, Jason; Grocock, Russell James; Margulies, Elliott Harrison; Farrow, Emily Gwendolyn; Artman, Michael; Safina, Nicole Pauline; Petrikin, Joshua Erin; Hall, Kevin Peter; Kingsmore, Stephen Francis

    2014-01-01

    Monogenic diseases are frequent causes of neonatal morbidity and mortality, and disease presentations are often undifferentiated at birth. More than 3500 monogenic diseases have been characterized, but clinical testing is available for only some of them and many feature clinical and genetic heterogeneity. Hence, an immense unmet need exists for improved molecular diagnosis in infants. Because disease progression is extremely rapid, albeit heterogeneous, in newborns, molecular diagnoses must occur quickly to be relevant for clinical decision-making. We describe 50-hour differential diagnosis of genetic disorders by whole-genome sequencing (WGS) that features automated bioinformatic analysis and is intended to be a prototype for use in neonatal intensive care units. Retrospective 50-hour WGS identified known molecular diagnoses in two children. Prospective WGS disclosed potential molecular diagnosis of a severe GJB2-related skin disease in one neonate; BRAT1-related lethal neonatal rigidity and multifocal seizure syndrome in another infant; identified BCL9L as a novel, recessive visceral heterotaxy gene (HTX6) in a pedigree; and ruled out known candidate genes in one infant. Sequencing of parents or affected siblings expedited the identification of disease genes in prospective cases. Thus, rapid WGS can potentially broaden and foreshorten differential diagnosis, resulting in fewer empirical treatments and faster progression to genetic and prognostic counseling. PMID:23035047

  14. Whole-genome plasma sequencing reveals focal amplifications as a driving force in metastatic prostate cancer

    PubMed Central

    Ulz, Peter; Belic, Jelena; Graf, Ricarda; Auer, Martina; Lafer, Ingrid; Fischereder, Katja; Webersinke, Gerald; Pummer, Karl; Augustin, Herbert; Pichler, Martin; Hoefler, Gerald; Bauernhofer, Thomas; Geigl, Jochen B.; Heitzer, Ellen; Speicher, Michael R.

    2016-01-01

    Genomic alterations in metastatic prostate cancer remain incompletely characterized. Here we analyse 493 prostate cancer cases from the TCGA database and perform whole-genome plasma sequencing on 95 plasma samples derived from 43 patients with metastatic prostate cancer. From these samples, we identify established driver aberrations in a cancer-related gene in nearly all cases (97.7%), including driver gene fusions (TMPRSS2:ERG), driver focal deletions (PTEN, RYBP and SHQ1) and driver amplifications (AR and MYC). In serial plasma analyses, we observe changes in focal amplifications in 40% of cases. The mean time interval between new amplifications was 26.4 weeks (range: 5–52 weeks), suggesting that they represent rapid adaptations to selection pressure. An increase in neuron-specific enolase is accompanied by clonal pattern changes in the tumour genome, most consistent with subclonal diversification of the tumour. Our findings suggest a high plasticity of prostate cancer genomes with newly occurring focal amplifications as a driving force in progression. PMID:27328849

  15. RepARK--de novo creation of repeat libraries from whole-genome NGS reads.

    PubMed

    Koch, Philipp; Platzer, Matthias; Downie, Bryan R

    2014-05-01

    Generation of repeat libraries is a critical step for analysis of complex genomes. In the era of next-generation sequencing (NGS), such libraries are usually produced using a whole-genome shotgun (WGS) derived reference sequence whose completeness greatly influences the quality of derived repeat libraries. We describe here a de novo repeat assembly method--RepARK (Repetitive motif detection by Assembly of Repetitive K-mers)--which avoids potential biases by using abundant k-mers of NGS WGS reads without requiring a reference genome. For validation, repeat consensuses derived from simulated and real Drosophila melanogaster NGS WGS reads were compared to repeat libraries generated by four established methods. RepARK is orders of magnitude faster than the other methods and generates libraries that are: (i) composed almost entirely of repetitive motifs, (ii) more comprehensive and (iii) almost completely annotated by TEclass. Additionally, we show that the RepARK method is applicable to complex genomes like human and can even serve as a diagnostic tool to identify repetitive sequences contaminating NGS datasets.
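
    The idea of selecting abundant k-mers directly from reads can be illustrated with a minimal Python sketch. It is not the RepARK implementation: the function name, the toy reads, and the count threshold are assumptions made for illustration, reverse complements are ignored, and the assembly of the selected k-mers into repeat consensuses is not shown.

```python
from collections import Counter

def abundant_kmers(reads, k, min_count):
    """Return the k-mers that occur at least min_count times across all reads."""
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read and tally each k-mer.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer: n for kmer, n in counts.items() if n >= min_count}

# Toy reads (hypothetical): the motif ACGTAC recurs, the flanking sequence does not.
reads = ["TTACGTACGG", "CCACGTACTA", "GAACGTACAT"]
repeats = abundant_kmers(reads, k=6, min_count=3)
```

    With these toy reads only the shared 6-mer passes the threshold; on real data the threshold would have to be chosen relative to sequencing depth.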

  16. Kuwaiti population subgroup of nomadic Bedouin ancestry—Whole genome sequence and analysis

    PubMed Central

    John, Sumi Elsa; Thareja, Gaurav; Hebbar, Prashantha; Behbehani, Kazem; Thanaraj, Thangavel Alphonse; Alsmadi, Osama

    2014-01-01

    The Kuwaiti native population comprises three distinct genetic subgroups of Persian, “city-dwelling” Saudi Arabian tribe, and nomadic “tent-dwelling” Bedouin ancestry. The Bedouin subgroup is characterized by the presence of 17% African ancestry; it owes its origin to nomadic tribes of the deserts of the Arabian Peninsula and North Africa. By sequencing the whole genome of a Kuwaiti male from this subgroup at 41X coverage, we report 3,752,878 SNPs, 411,839 indels, and 8451 structural variations. A neighbor-joining tree, based on shared variant positions carrying disease-risk alleles between the Bedouin and other continental genomes, places the Bedouin genome at the nexus of African, Asian, and European genomes, in concordance with the geographical location of Kuwait and the Peninsula. In congruence with the participant's medical history of morbid obesity and bronchial asthma, risk alleles are seen at deleterious SNPs associated with obesity and asthma. Many of the observed deleterious ‘novel’ variants lie in genes associated with autosomal recessive disorders characteristic of the region. PMID:26484159

  17. Whole-genome analyses of Korean native and Holstein cattle breeds by massively parallel sequencing.

    PubMed

    Choi, Jung-Woo; Liao, Xiaoping; Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea--Hanwoo, Jeju Heugu, and Korean Holstein--using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions-deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding.
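
    The abstract does not say how regions of homozygosity were called. As a rough illustration of the concept only, the following Python sketch scans a vector of genotype calls for uninterrupted homozygous stretches. The function name, the 0/1/2 genotype encoding, and the marker-count threshold are assumptions; practical ROH callers typically also consider physical distance, marker density, and genotyping error.

```python
def homozygous_runs(genotypes, min_len):
    """Return (start, end) index pairs, end exclusive, of homozygous runs >= min_len.

    genotypes: alternate-allele counts per marker (0, 1 or 2); 1 means heterozygous.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:
            # Homozygous call: open a run if none is in progress.
            if start is None:
                start = i
        else:
            # Heterozygous call closes the current run, if any.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes)))
    return runs

calls = [0, 0, 2, 1, 0, 2, 2, 2, 0, 1, 2]  # hypothetical genotype calls
roh = homozygous_runs(calls, min_len=3)
```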

  18. Whole-Genome Sequencing Uncovers the Genetic Basis of Chronic Mountain Sickness in Andean Highlanders

    PubMed Central

    Zhou, Dan; Udpa, Nitin; Ronen, Roy; Stobdan, Tsering; Liang, Junbin; Appenzeller, Otto; Zhao, Huiwen W.; Yin, Yi; Du, Yuanping; Guo, Lixia; Cao, Rui; Wang, Yu; Jin, Xin; Huang, Chen; Jia, Wenlong; Cao, Dandan; Guo, Guangwu; Gamboa, Jorge L.; Villafuerte, Francisco; Callacondo, David; Xue, Jin; Liu, Siqi; Frazer, Kelly A.; Li, Yingrui; Bafna, Vineet; Haddad, Gabriel G.

    2013-01-01

    The hypoxic conditions at high altitudes present a challenge for survival, causing pressure for adaptation. Interestingly, many high-altitude denizens (particularly in the Andes) are maladapted, with a condition known as chronic mountain sickness (CMS) or Monge disease. To decode the genetic basis of this disease, we sequenced and compared the whole genomes of 20 Andean subjects (10 with CMS and 10 without). We discovered 11 regions genome-wide with significant differences in haplotype frequencies consistent with selective sweeps. In these regions, two genes (an erythropoiesis regulator, SENP1, and an oncogene, ANP32D) had a higher transcriptional response to hypoxia in individuals with CMS relative to those without. We further found that downregulating the orthologs of these genes in flies dramatically enhanced survival rates under hypoxia, demonstrating that suppression of SENP1 and ANP32D plays an essential role in hypoxia tolerance. Our study provides an unbiased framework to identify and validate the genetic basis of adaptation to high altitudes and identifies potentially targetable mechanisms for CMS treatment. PMID:23954164

  19. Automated whole-genome multiple alignment of rat, mouse, and human

    SciTech Connect

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  20. Digital droplet multiple displacement amplification (ddMDA) for whole genome sequencing of limited DNA samples

    DOE PAGESBeta

    Rhee, Minsoung; Light, Yooli K.; Meagher, Robert J.; Singh, Anup K.; Kumar-Sinha, Chandan

    2016-05-04

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.
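
    The quoted equivalence between 0.1 pg/μL and about 3/1000 of a genome per droplet can be checked with a short back-of-the-envelope calculation in Python. The genome size (4.6 Mb) and the average mass per base pair (650 g/mol) are assumptions that the abstract does not state; under them, one genome copy weighs roughly 5 fg and the implied droplet volume is roughly 0.15 nL, which is consistent with the "sub-nanoliter" description.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair
GENOME_BP = 4.6e6            # assumed E. coli genome size (not given in the abstract)

# Mass of a single genome copy, in grams.
genome_mass_g = GENOME_BP * BP_MASS_G_PER_MOL / AVOGADRO

# 0.1 pg/uL expressed in grams per nanoliter.
conc_g_per_nl = 0.1e-12 / 1000.0
genomes_per_nl = conc_g_per_nl / genome_mass_g

# Droplet volume that would hold about 3/1000 of a genome on average.
droplet_nl = (3 / 1000) / genomes_per_nl

print(f"{genome_mass_g * 1e15:.1f} fg per genome, droplet of about {droplet_nl:.2f} nL")
```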

  1. Whole genome data for omics-based research on the self-fertilizing fish Kryptolebias marmoratus.

    PubMed

    Rhee, Jae-Sung; Lee, Jae-Seong

    2014-08-30

    Genome resources have advantages for understanding diverse areas such as biological patterns and functioning of organisms. Omics platforms are useful approaches for the study of organs and organisms. These approaches can be powerful screening tools for whole genome, proteome, and metabolome profiling, and can be used to understand molecular changes in response to internal and external stimuli. This methodology has been applied successfully in freshwater model fish such as the zebrafish Danio rerio and the Japanese medaka Oryzias latipes in research areas such as basic physiology, developmental biology, genetics, and environmental biology. However, information is still scarce about model fish that inhabit brackish water or seawater. To develop the self-fertilizing killifish Kryptolebias marmoratus as a potential model species with unique characteristics and research merits, we obtained genomic information about K. marmoratus. We address ways to use these data for genome-based molecular mechanistic studies. We review the current state of genome information on K. marmoratus to initiate omics approaches. We evaluate the potential applications of integrated omics platforms for future studies in environmental science, developmental biology, and biomedical research. We conclude that information about the K. marmoratus genome will provide a better understanding of the molecular functions of genes, proteins, and metabolites that are involved in the biological functions of this species. Omics platforms, particularly combined technologies that make effective use of bioinformatics, will provide powerful tools for hypothesis-driven investigations and discovery-driven discussions on diverse aspects of this species and on fish and vertebrates in general.

  2. Whole genome sequencing of Gir cattle for identifying polymorphisms and loci under selection.

    PubMed

    Liao, Xiaoping; Peng, Fred; Forni, Selma; McLaren, David; Plastow, Graham; Stothard, Paul

    2013-10-01

    Genetic variation in Gir cattle (Bos indicus) has so far not been well characterized. In this study, we used whole genome sequencing of three Gir bulls and a pooled sample from another 11 bulls to identify polymorphisms and loci under selection. A total of 9 990 733 single nucleotide polymorphisms (SNPs) and 604 308 insertion/deletions (indels) were discovered in Gir samples, of which 62.34% and 83.62%, respectively, are previously unknown. Moreover, we detected 79 putative selective sweeps using the sequence data of the pooled sample. One of the most striking sweeps harbours several genes belonging to the cathelicidin gene family, such as CAMP, CATHL1, CATHL2, and CATHL3, which are related to pathogen- and parasite-resistance. Another interesting region harbours genes encoding mitogen-activated protein kinases, which are involved in directing cellular responses to a variety of stimuli, such as osmotic stress and heat shock. These findings are particularly interesting because Gir is resistant to hot temperatures and tropical diseases. This initial selective sweep analysis of Gir cattle has revealed a number of loci that could be important for their adaptation to tropical climates.

  3. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

    SciTech Connect

    Dehal, Paramvir; Boore, Jeffrey L.

    2005-04-12

    The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.

  4. Whole genome sequence of Staphylococcus saprophyticus reveals the pathogenesis of uncomplicated urinary tract infection

    PubMed Central

    Kuroda, Makoto; Yamashita, Atsushi; Hirakawa, Hideki; Kumano, Miyuki; Morikawa, Kazuya; Higashide, Masato; Maruyama, Atsushi; Inose, Yumiko; Matoba, Kimio; Toh, Hidehiro; Kuhara, Satoru; Hattori, Masahira; Ohta, Toshiko

    2005-01-01

    Staphylococcus saprophyticus is a uropathogenic Staphylococcus frequently isolated from young female outpatients presenting with uncomplicated urinary tract infections. We sequenced the whole genome of S. saprophyticus type strain ATCC 15305, which harbors a circular chromosome of 2,516,575 bp with 2,446 ORFs and two plasmids. Comparative genomic analyses with the strains of two other species, Staphylococcus aureus and Staphylococcus epidermidis, as well as experimental data, revealed the following characteristics of the S. saprophyticus genome. S. saprophyticus does not possess any virulence factors found in S. aureus, such as coagulase, enterotoxins, exoenzymes, and extracellular matrix-binding proteins, although it does have a remarkable paralog expansion of transport systems related to highly variable ion contents in the urinary environment. A further unique feature is that only a single ORF is predictable as a cell wall-anchored protein, and it shows positive hemagglutination and adherence to human bladder cells associated with initial colonization in the urinary tract. S. saprophyticus also shows significantly high urease activity. The uropathogenicity of S. saprophyticus can be attributed to genomic features needed for its survival in the human urinary tract, namely a novel cell wall-anchored adhesin and redundant uro-adaptive transport systems, together with urease. PMID:16135568

  5. ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis

    PubMed Central

    Riaz, Tiayyba; Shehzad, Wasim; Viari, Alain; Pompanon, François; Taberlet, Pierre; Coissac, Eric

    2011-01-01

    Using non-conventional markers, DNA metabarcoding allows biodiversity assessment from complex substrates. In this article, we present ecoPrimers, software for identifying new barcode markers and their associated PCR primers. ecoPrimers scans whole genomes to find such markers without a priori knowledge. ecoPrimers optimizes two quality indices measuring taxonomical range and discrimination to select the most efficient markers from a set of reference sequences, according to specific experimental constraints such as marker length or specifically targeted taxa. The key step of the algorithm is the identification of conserved regions among reference sequences for anchoring primers. We propose an efficient algorithm based on data mining that allows the analysis of huge sets of sequences. We evaluate the efficiency of ecoPrimers by running it on three different sequence sets: mitochondrial, chloroplast and bacterial genomes. Identified barcode markers correspond either to barcode regions already in use for plants or animals, or to new potential barcodes. Results from empirical experiments carried out on a promising new barcode for analyzing vertebrate diversity fully agree with expectations based on bioinformatics analysis. These tests demonstrate the efficiency of ecoPrimers for inferring new barcodes fitting with diverse experimental contexts. ecoPrimers is available as an open source project at: http://www.grenoble.prabi.fr/trac/ecoPrimers. PMID:21930509
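
    The abstract names two quality indices, one for taxonomical range and one for discrimination, without defining them. One plausible reading, sketched below in Python, treats range as the fraction of taxa whose marker is amplified and discrimination as the fraction of amplified taxa carrying a marker sequence shared with no other taxon. Both definitions, the function name, and the toy data are assumptions for illustration and are not taken from the ecoPrimers code.

```python
from collections import Counter

def barcode_indices(barcodes):
    """Compute (coverage, specificity) for a candidate marker.

    barcodes maps taxon name -> amplified marker sequence, or None if not amplified.
    """
    amplified = {t: s for t, s in barcodes.items() if s is not None}
    # Taxonomical range: share of taxa for which the primers yield a product.
    coverage = len(amplified) / len(barcodes)
    # Discrimination: share of amplified taxa with a sequence unique to that taxon.
    seq_counts = Counter(amplified.values())
    unique = sum(1 for s in amplified.values() if seq_counts[s] == 1)
    specificity = unique / len(amplified) if amplified else 0.0
    return coverage, specificity

# Hypothetical data: taxa A and B share a sequence, C is unique, D is not amplified.
toy = {"taxonA": "ACGT", "taxonB": "ACGT", "taxonC": "ACGA", "taxonD": None}
coverage, specificity = barcode_indices(toy)
```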

  6. ENCODE whole-genome data in the UCSC Genome Browser: update 2012

    PubMed Central

    Rosenbloom, Kate R.; Dreszer, Timothy R.; Long, Jeffrey C.; Malladi, Venkat S.; Sloan, Cricket A.; Raney, Brian J.; Cline, Melissa S.; Karolchik, Donna; Barber, Galt P.; Clawson, Hiram; Diekhans, Mark; Fujita, Pauline A.; Goldman, Mary; Gravell, Robert C.; Harte, Rachel A.; Hinrichs, Angie S.; Kirkup, Vanessa M.; Kuhn, Robert M.; Learned, Katrina; Maddren, Morgan; Meyer, Laurence R.; Pohl, Andy; Rhead, Brooke; Wong, Matthew C.; Zweig, Ann S.; Haussler, David; Kent, W. James

    2012-01-01

    The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets. PMID:22075998

  7. Multidrug-resistant Escherichia coli soft tissue infection investigated with bacterial whole genome sequencing

    PubMed Central

    Buchanan, Ruaridh; Stoesser, Nicole; Crook, Derrick; Bowler, Ian C J W

    2014-01-01

    A 45-year-old man with dilated cardiomyopathy presented with acute leg pain and erythema suggestive of necrotising fasciitis. Initial surgical exploration revealed no necrosis and treatment for a soft tissue infection was started. Blood and tissue cultures unexpectedly grew a Gram-negative bacillus, subsequently identified by an automated broth microdilution phenotyping system as an extended-spectrum β-lactamase producing Escherichia coli. The patient was treated with a 3-week course of antibiotics (ertapenem followed by ciprofloxacin) and debridement for small areas of necrosis, followed by skin grafting. The presence of E. coli triggered investigation of both host and pathogen. The patient was found to have previously undiagnosed liver disease, a risk factor for E. coli soft tissue infection. Whole genome sequencing of isolates from all specimens confirmed they were clonal, of sequence type ST131 and associated with a likely plasmid-associated AmpC (CMY-2), several other resistance genes and a number of virulence factors. PMID:25331151

  8. Whole genome analyses of marine fish pathogenic isolate, Mycobacterium sp. 012931.

    PubMed

    Kurokawa, Satoru; Kabayama, Jun; Hwang, Seong Don; Nho, Seong Won; Hikima, Jun-ichi; Jung, Tae Sung; Kondo, Hidehiro; Hirono, Ikuo; Takeyama, Haruko; Mori, Tetsushi; Aoki, Takashi

    2014-10-01

    Mycobacterium is a genus within the order Actinomycetales that comprises a large number of well-characterized species, several of which include pathogens known to cause serious disease in humans and animals. Here, we report the whole genome sequence of Mycobacterium sp. strain 012931 isolated from the marine fish, yellowtail (Seriola quinqueradiata). Mycobacterium sp. 012931 is a fish pathogen causing serious damage to aquaculture farms in Japan. DNA dot plot analysis showed that Mycobacterium sp. 012931 was most closely related to Mycobacterium marinum among the several Mycobacterium species compared. However, little conservation of gene order was observed between the Mycobacterium sp. 012931 and M. marinum genomes. The 5,464 annotated genes of Mycobacterium sp. 012931 were classified into 26 subsystems. The insertion/deletion gene analysis showed that Mycobacterium sp. 012931 had 643 unique genes that were not found in the M. marinum strains. In the virulence, disease, and defense subsystem, both insertion and deletion genes of Mycobacterium sp. 012931 were associated with the PPE gene cluster of Mycobacteria. Of seven plcB genes in Mycobacterium sp. 012931, plcB_2 and plcB_3 showed low identities with those of M. marinum strains. Therefore, Mycobacterium sp. 012931 differs from M. marinum genetically and in virulence, and may induce different interaction mechanisms between host and pathogen.

  9. Determinants of spontaneous mutation in the bacterium Escherichia coli as revealed by whole-genome sequencing

    PubMed Central

    Foster, Patricia L.; Lee, Heewook; Popodi, Ellen; Townes, Jesse P.; Tang, Haixu

    2015-01-01

    A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1–2 × 10⁻³ mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3′ base can affect the mutability of a purine by oxidative damage by as much as eightfold. PMID:26460006
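
    A per-genome rate from a mutation accumulation experiment is, in its simplest form, the number of mutations recovered divided by the total number of generations summed over all lines. The Python sketch below shows that arithmetic and a conversion to a per-base-pair rate. The mutation count, line count, and generation count are hypothetical numbers chosen only to give a rate of 10⁻³ per genome per generation, and the 4.6 Mb genome size is an assumption not stated in the abstract.

```python
def ma_mutation_rate(total_mutations, n_lines, generations_per_line):
    """Mutations per genome per generation from a mutation accumulation experiment."""
    return total_mutations / (n_lines * generations_per_line)

# Hypothetical experiment: 250 mutations over 50 lines of 5000 generations each.
rate_per_genome = ma_mutation_rate(total_mutations=250, n_lines=50,
                                   generations_per_line=5000)

GENOME_BP = 4.6e6  # assumed E. coli genome size; not stated in the abstract
rate_per_bp = rate_per_genome / GENOME_BP

print(f"{rate_per_genome:.1e} per genome, {rate_per_bp:.1e} per base pair")
```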

  10. Impacts of Whole-Genome Triplication on MIRNA Evolution in Brassica rapa.

    PubMed

    Sun, Chao; Wu, Jian; Liang, Jianli; Schnable, James C; Yang, Wencai; Cheng, Feng; Wang, Xiaowu

    2015-11-01

    MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play essential roles in eukaryotes. Although the influence of whole-genome triplication (WGT) on protein-coding genes has been well documented in Brassica rapa, little is known about its impacts on MIRNAs. In this study, by generating a comprehensive annotation of 680 MIRNAs for B. rapa, we analyzed the evolutionary characteristics of these MIRNAs from different aspects. First, while MIRNAs and genes show similar patterns of biased distribution among subgenomes of B. rapa, we found that MIRNAs are much more overretained than genes following fractionation after WGT. Second, multiple-copy MIRNAs show significantly higher sequence conservation than single-copy MIRNAs, which is opposite to the pattern seen for genes. This indicates that increased purifying selection is acting upon these highly retained multiple-copy MIRNAs and points to their functional importance over singleton MIRNAs. Furthermore, we found extensive divergence between pairs of miRNAs and their target genes following the WGT in B. rapa. In summary, our study provides a valuable resource for exploring MIRNA in B. rapa and highlights the impacts of WGT on the evolution of MIRNA.

  11. Whole genome sequencing reveals a novel CRISPR system in industrial Clostridium acetobutylicum.

    PubMed

    Peng, Lixin; Pei, Jianxin; Pang, Hao; Guo, Yuan; Lin, Lihua; Huang, Ribo

    2014-11-01

    Clostridium acetobutylicum is an important organism for biobutanol production. Due to frequent exposure to bacteriophages during fermentation, industrial C. acetobutylicum strains require a strong immune response against foreign genetic invaders. In the present study, a novel CRISPR system was reported in a C. acetobutylicum GXAS18-1 strain by whole genome sequencing, and several specific characteristics of the CRISPR system were revealed as follows: (1) multiple CRISPR loci were confirmed within the whole bacterial genome, while only one cluster of CRISPR-associated genes (Cas) was found in the current strain; (2) similar leader sequences at the 5' end of the multiple CRISPR loci were identified as promoter elements by promoter prediction, suggesting that these CRISPR loci were under the control of the same transcriptional factor; (3) homology analysis indicated that the present Cas genes shared only low sequence similarity with the published Cas families; and (4) concerning gene similarity and gene cluster order, these Cas genes belonged to the csm family and originated from the euryarchaeota by horizontal gene transfer.

  12. Molecular analysis of single oocyst of Eimeria by whole genome amplification (WGA) based nested PCR.

    PubMed

    Wang, Yunzhou; Tao, Geru; Cui, Yujuan; Lv, Qiyao; Xie, Li; Li, Yuan; Suo, Xun; Qin, Yinghe; Xiao, Lihua; Liu, Xianyong

    2014-09-01

    PCR-based molecular tools are widely used for the identification and characterization of protozoa. Here we report the molecular analysis of Eimeria species using combined methods of whole genome amplification (WGA) and nested PCR. A single oocyst of Eimeria stiedai or Eimeria media was directly used for random amplification of the genomic DNA with either primer extension preamplification (PEP) or multiple displacement amplification (MDA), and then the WGA product was used as template in nested PCR with species-specific primers for ITS-1, 18S rDNA and 23S rDNA of E. stiedai and E. media. WGA-based PCR was successful for the amplification of these genes from single oocysts. For the species identification of single oocysts isolated from mixed E. stiedai or E. media, the results from WGA-based PCR were exactly in accordance with those from morphological identification, suggesting the applicability of this method to molecular analysis of eimerian parasites at the single oocyst level. The WGA-based PCR method can also be applied for the identification and genetic characterization of other protists.

  13. Whole genome amplification from single cells in preimplantation genetic diagnosis and prenatal diagnosis.

    PubMed

    Peng, Wen; Takabayashi, Haruo; Ikawa, Kazumi

    2007-03-01

    The literature on whole genome amplification (WGA) techniques and their application to preimplantation genetic diagnosis (PGD) and prenatal diagnosis is reviewed. General polymerase chain reaction (PCR) fails to provide adequate information from limited cells in PGD and non-invasive prenatal diagnosis. Therefore several WGA techniques, such as primer extension preamplification (PEP) and degenerate oligonucleotide primed PCR (DOP-PCR), have been developed and successfully applied to clinical work during the past decade, especially in PGD and prenatal diagnosis. These techniques can provide ample amplification of genetic sequences from single cells for a series of subsequent PCR analyses such as restriction fragment length polymorphisms (RFLP) and comparative genomic hybridization (CGH), thus opening up a new area for prenatal diagnosis. However, several problems have been reported in the application of these techniques. The ideal WGA technique should have high yield, faithful representation of the original template, complete coverage of the genome, and a simple procedure. In order to make good use of these techniques in future research and clinical work, an extensive understanding of the merits and pitfalls of these recently developed techniques is undoubtedly necessary.

  14. Whole-genome mutational burden analysis of three pluripotency induction methods

    PubMed Central

    Bhutani, Kunal; Nazor, Kristopher L.; Williams, Roy; Tran, Ha; Dai, Heng; Džakula, Željko; Cho, Edward H.; Pang, Andy W. C.; Rao, Mahendra; Cao, Han; Schork, Nicholas J.; Loring, Jeanne F.

    2016-01-01

    There is concern that the stresses of inducing pluripotency may lead to deleterious DNA mutations in induced pluripotent stem cell (iPSC) lines, which would compromise their use for cell therapies. Here we report comparative genomic analysis of nine isogenic iPSC lines generated using three reprogramming methods: integrating retroviral vectors, non-integrating Sendai virus and synthetic mRNAs. We used whole-genome sequencing and de novo genome mapping to identify single-nucleotide variants, insertions and deletions, and structural variants. Our results show a moderate number of variants in the iPSCs that were not evident in the parental fibroblasts, which may result from reprogramming. There were only small differences in the total numbers and types of variants among different reprogramming methods. Most importantly, a thorough genomic analysis showed that the variants were generally benign. We conclude that the process of reprogramming is unlikely to introduce variants that would make the cells inappropriate for therapy. PMID:26892726

  15. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.

    PubMed

    Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G

    2012-01-01

    MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads of various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation (including SOLiD and Illumina). It is the first metagenomic web service that is capable of processing SOLiD color-space reads, to the authors' knowledge. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies allowing their comparative analysis together with user samples, namely datasets from the Russian Metagenome project, MetaHIT and Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL, Python, with all major browsers supported.
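
    As a rough illustration of the profiling step described in this abstract, the sketch below turns per-reference alignment totals into a normalized composition vector based on mean coverage depth. The function name, the input layout and the toy numbers are assumptions made for illustration only; the abstract does not describe MALINA's own pipeline at this level of detail.

```python
def composition_vector(aligned_bases, ref_lengths):
    """Relative abundance per reference, derived from mean coverage depth.

    aligned_bases: reference name -> total read bases aligned to it
    ref_lengths:   reference name -> reference length in bp
    """
    # Mean coverage depth = aligned bases / reference length.
    depth = {name: aligned_bases.get(name, 0) / ref_lengths[name]
             for name in ref_lengths}
    total = sum(depth.values())
    if total == 0:
        return {name: 0.0 for name in depth}
    # Normalize so the vector sums to 1 and samples become comparable.
    return {name: d / total for name, d in depth.items()}


# Hypothetical toy input, not data from the MALINA study.
lengths = {"genome_A": 5_000_000, "genome_B": 2_000_000}
bases = {"genome_A": 50_000_000, "genome_B": 60_000_000}
vector = composition_vector(bases, lengths)
print(vector)
```

    Dividing by reference length before normalizing keeps a long genome from appearing more abundant simply because it attracts more reads.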

  16. New perspectives on microbial community distortion after whole-genome amplification.

    PubMed

    Probst, Alexander J; Weinmaier, Thomas; DeSantis, Todd Z; Santo Domingo, Jorge W; Ashbolt, Nicholas

    2015-01-01

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass; however, potential selective biases during the amplification processes are poorly understood. Here, we describe the effects of WGA on 31 different microbial communities from five biotopes that also included low-biomass samples from drinking water and groundwater. Our findings provide evidence that microbiome segregation by biotope was possible despite WGA treatment. Nevertheless, samples from different biotopes revealed different levels of distortion, with genomic GC content significantly correlated with WGA perturbation. Certain phylogenetic clades revealed a homogenous trend across various sample types, for instance Alpha- and Betaproteobacteria showed a decrease in their abundance after WGA treatment. On the other hand, Enterobacteriaceae, an important biomarker group for fecal contamination in groundwater and drinking water, were strongly affected by WGA treatment without a predictable pattern. These novel results describe the impact of WGA on low-biomass samples and may highlight issues to be aware of when designing future metagenomic studies that necessitate preceding WGA treatment.

  17. Deciphering the Wisent Demographic and Adaptive Histories from Individual Whole-Genome Sequences

    PubMed Central

    Gautier, Mathieu; Moazami-Goudarzi, Katayoun; Levéziel, Hubert; Parinello, Hugues; Grohs, Cécile; Rialle, Stéphanie; Kowalczyk, Rafał; Flori, Laurence

    2016-01-01

    As the largest European herbivore, the wisent (Bison bonasus) is emblematic of the continent's wildlife but has unclear origins. Here, we infer its demographic and adaptive histories from two individual whole-genome sequences via a detailed comparative analysis with bovine genomes. We estimate that the wisent and bovine species diverged from 1.7 × 10^6 to 850,000 years before present (YBP) through a speciation process involving an extended period of limited gene flow. Our data further support the occurrence of more recent secondary contacts, posterior to the Bos taurus and Bos indicus divergence (∼150,000 YBP), between the wisent and (European) taurine cattle lineages. Although the wisent and bovine population sizes experienced a similar sharp decline since the Last Glacial Maximum, we find that the wisent demography remained more fluctuating during the Pleistocene. This is in agreement with a scenario in which wisents responded to successive glaciations by habitat fragmentation rather than southward and eastward migration as for the bovine ancestors. We finally detect 423 genes under positive selection between the wisent and bovine lineages, which shed new light on the genome response to different living conditions (temperature, available food resource, and pathogen exposure) and on the key gene functions altered by the domestication process. PMID:27436010

  18. Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila

    PubMed Central

    Kontur, Cassandra; Kumar, Santosh; Lan, Xun; Pritchard, Jonathan K.; Turkewitz, Aaron P.

    2016-01-01

    Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded to a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies, in part, on ancestral lysosomal sorting machinery, but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis, and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation—a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wild-type copy of MMA1, and disrupting MMA1 in an otherwise wild-type strain phenocopies UC620. The product of MMA1, characterized as a CFP-tagged copy, encodes a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation. PMID:27317773

  19. Insights into three whole-genome duplications gleaned from the Paramecium caudatum genome sequence.

    PubMed

    McGrath, Casey L; Gout, Jean-Francois; Doak, Thomas G; Yanagi, Akira; Lynch, Michael

    2014-08-01

    Paramecium has long been a model eukaryote. The sequence of the Paramecium tetraurelia genome reveals a history of three successive whole-genome duplications (WGDs), and the sequences of P. biaurelia and P. sexaurelia suggest that these WGDs are shared by all members of the aurelia species complex. Here, we present the genome sequence of P. caudatum, a species closely related to the P. aurelia species group. P. caudatum shares only the most ancient of the three WGDs with the aurelia complex. We found that P. caudatum maintains twice as many paralogs from this early event as the P. aurelia species, suggesting that post-WGD gene retention is influenced by subsequent WGDs and supporting the importance of selection for dosage in gene retention. The availability of P. caudatum as an outgroup allows an expanded analysis of the aurelia intermediate and recent WGD events. Both the Guanine+Cytosine (GC) content and the expression level of preduplication genes are significant predictors of duplicate retention. We find widespread asymmetrical evolution among aurelia paralogs, which is likely caused by gradual pseudogenization rather than by neofunctionalization. Finally, cases of divergent resolution of intermediate WGD duplicates between aurelia species suggest that this process acts as an ongoing reinforcement mechanism of reproductive isolation long after a WGD event.

  20. Whole-genome sequencing of multidrug-resistant Mycobacterium tuberculosis isolates from Myanmar.

    PubMed

    Aung, Htin Lin; Tun, Thanda; Moradigaravand, Danesh; Köser, Claudio U; Nyunt, Wint Wint; Aung, Si Thu; Lwin, Thandar; Thinn, Kyi Kyi; Crump, John A; Parkhill, Julian; Peacock, Sharon J; Cook, Gregory M; Hill, Philip C

    2016-09-01

    Drug-resistant tuberculosis (TB) is a major health threat in Myanmar. An initial study was conducted to explore the potential utility of whole-genome sequencing (WGS) for the diagnosis and management of drug-resistant TB in Myanmar. Fourteen multidrug-resistant Mycobacterium tuberculosis isolates were sequenced. Known resistance genes for a total of nine antibiotics commonly used in the treatment of drug-susceptible and multidrug-resistant TB (MDR-TB) in Myanmar were interrogated through WGS. All 14 isolates were MDR-TB, consistent with the results of phenotypic drug susceptibility testing (DST), and the Beijing lineage predominated. Based on the results of WGS, 9 of the 14 isolates were potentially resistant to at least one of the drugs used in the standard MDR-TB regimen but for which phenotypic DST is not conducted in Myanmar. This study highlights a need for the introduction of second-line DST as part of routine TB diagnosis in Myanmar as well as new classes of TB drugs to construct effective regimens. PMID:27530852

  1. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs.

    PubMed

    Kenny, N J; Chan, K W; Nong, W; Qu, Z; Maeso, I; Yip, H Y; Chan, T F; Kwan, H S; Holland, P W H; Chu, K H; Hui, J H L

    2016-02-01

    Whole-genome duplication (WGD) results in new genomic resources that can be exploited by evolution for rewiring genetic regulatory networks in organisms. In metazoans, WGD occurred before the last common ancestor of vertebrates, and has been postulated as a major evolutionary force that contributed to their speciation and diversification of morphological structures. Here, we have sequenced genomes from three of the four extant species of horseshoe crabs: Carcinoscorpius rotundicauda, Limulus polyphemus and Tachypleus tridentatus. Phylogenetic and sequence analyses of their Hox and other homeobox genes, which encode crucial transcription factors and have been used as indicators of WGD in animals, strongly suggest that WGD happened before the last common ancestor of these marine chelicerates >135 million years ago. Signatures of subfunctionalisation of paralogues of Hox genes are revealed in the appendages of two species of horseshoe crabs. Further, residual homeobox pseudogenes are observed in the three lineages. The existence of WGD in the horseshoe crabs, noted for relative morphological stasis over geological time, suggests that genomic diversity need not always be reflected phenotypically, in contrast to the suggested situation in vertebrates. This study provides evidence of ancient WGD in the ecdysozoan lineage, and reveals new opportunities for studying genomic and regulatory evolution after WGD in the Metazoa.

  2. Deep Whole-Genome Sequencing to Detect Mixed Infection of Mycobacterium tuberculosis

    PubMed Central

    Gan, Mingyu; Liu, Qingyun; Yang, Chongguang; Gao, Qian; Luo, Tao

    2016-01-01

    Mixed infection by multiple Mycobacterium tuberculosis (MTB) strains is associated with poor treatment outcome of tuberculosis (TB). Traditional genotyping methods have been used to detect mixed infections of MTB; however, their sensitivity and resolution are limited. Deep whole-genome sequencing (WGS) has proved highly sensitive and discriminative for studying population heterogeneity of MTB. Here, we developed a phylogenetic-based method to detect MTB mixed infections using WGS data. We collected published WGS data of 782 global MTB strains from public databases. We called homogeneous and heterogeneous single nucleotide variations (SNVs) of individual strains by mapping short reads to the ancestral MTB reference genome. We constructed a phylogenomic database based on 68,639 homogeneous SNVs of 652 MTB strains. Mixed infections were determined if multiple evolutionary paths were identified by mapping the SNVs of individual samples to the phylogenomic database. By simulation, our method could specifically detect mixed infections when the sequencing depth of minor strains was as low as 1× coverage, and when the genomic distance of two mixed strains was as small as 16 SNVs. By applying our methods to all 782 samples, we detected 47 mixed infections and 45 of them were caused by locally endemic strains. The results indicate that our method is highly sensitive and discriminative for identifying mixed infections from deep WGS data of MTB isolates. PMID:27391214
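
    The "multiple evolutionary paths" criterion in this abstract can be sketched on a toy tree. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the branch names, the marker SNV positions and the helper functions are all hypothetical, and read depth and allele frequencies are ignored. Each marker SNV is assigned to the branch on which it arose; a sample whose SNVs fall on branches that do not lie on one root-to-tip path is flagged as mixed.

```python
# Hypothetical toy phylogeny: child branch -> parent branch.
PARENT = {"L2": "root", "L4": "root", "L2.1": "L2", "L2.2": "L2", "L4.1": "L4"}

# Hypothetical marker SNV positions -> branch on which each SNV arose.
SNV_BRANCH = {101: "L2", 202: "L2.1", 303: "L2.2", 404: "L4", 505: "L4.1"}


def path_to_root(branch):
    """Return the branch followed by all of its ancestors up to the root."""
    path = [branch]
    while branch in PARENT:
        branch = PARENT[branch]
        path.append(branch)
    return path


def count_paths(sample_snvs):
    """Count distinct root-to-tip paths needed to explain the sample's SNVs."""
    branches = {SNV_BRANCH[pos] for pos in sample_snvs if pos in SNV_BRANCH}
    ancestors = set()
    for branch in branches:
        ancestors.update(path_to_root(branch)[1:])
    # Branches that are not an ancestor of another hit branch end a path.
    return len(branches - ancestors)


def is_mixed(sample_snvs):
    return count_paths(sample_snvs) > 1


print(is_mixed({101, 202}))       # SNVs on L2 and L2.1: one path
print(is_mixed({101, 202, 404}))  # SNVs on L2.1 and L4: two paths
```

    A real analysis would also need to weigh sequencing depth and heterogeneous SNV calls, which this sketch leaves out.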

  3. Whole Genome Sequencing and Complete Genetic Analysis Reveals Novel Pathways to Glycopeptide Resistance in Staphylococcus aureus

    PubMed Central

    Renzoni, Adriana; Andrey, Diego O.; Jousselin, Ambre; Barras, Christine; Monod, Antoinette; Vaudaux, Pierre; Lew, Daniel; Kelley, William L.

    2011-01-01

    The precise mechanisms leading to the emergence of low-level glycopeptide resistance in Staphylococcus aureus are poorly understood. In this study, we used whole genome deep sequencing to detect differences between two isogenic strains: a parental strain and a stable derivative selected stepwise for survival on 4 µg/ml teicoplanin, but which grows at higher drug concentrations (MIC 8 µg/ml). We uncovered only three single nucleotide changes in the selected strain. Nonsense mutations occurred in stp1, encoding a serine/threonine phosphatase, and in yjbH, encoding a post-transcriptional negative regulator of the redox/thiol stress sensor and global transcriptional regulator, Spx. A missense mutation (G45R) occurred in the histidine kinase sensor of cell wall stress, VraS. Using genetic methods, all single, pairwise combinations, and a fully reconstructed triple mutant were evaluated for their contribution to low-level glycopeptide resistance. We found a synergistic cooperation between dual phospho-signalling systems and a subtle contribution from YjbH, suggesting the activation of oxidative stress defences via Spx. To our knowledge, this is the first genetic demonstration of multiple sensor and stress pathways contributing simultaneously to glycopeptide resistance development. The multifactorial nature of glycopeptide resistance in this strain suggests a complex reprogramming of cell physiology to survive in the face of drug challenge. PMID:21738716

  4. Attitudes of African Americans toward Return of Results from Exome and Whole Genome Sequencing

    PubMed Central

    Yu, Joon-Ho; Crouch, Julia; Jamal, Seema M.; Tabor, Holly K.; Bamshad, Michael J.

    2013-01-01

    Exome sequencing and whole genome sequencing (ES/WGS) present patients and research participants with the opportunity to benefit from a broad scope of genetic results of clinical and personal utility. Yet, this potential for benefit also risks disenfranchising populations such as African Americans (AAs) that are already underrepresented in genetic research and utilize genetic tests at lower rates than other populations. Understanding a diverse range of perspectives on consenting for ES/WGS and receiving ES/WGS results is necessary to ensure parity in genomic health care and research. We conducted a series of 13 focus groups (n=76) to investigate if and how attitudes toward participation in ES/WGS research and return of results from ES/WGS differ between self described AAs and non-AAs. The majority of both AAs and non-AAs were willing to participate in WGS studies and receive individual genetic results, but the fraction not interested in either was higher in AAs. This is due in part to different expectations of health benefits from ES/WGS and how results should be managed. Our results underscore the need to develop and test culturally tailored strategies for returning ES/WGS results to AAs. PMID:23610051

  5. Whole genome amplification of degraded and nondegraded DNA for forensic purposes.

    PubMed

    Maciejewska, Agnieszka; Jakubowska, Joanna; Pawłowski, Ryszard

    2013-03-01

    Degraded DNA is often analyzed in forensic genetics laboratories. Reliable analysis of degraded DNA is of great importance, since its results impact the quality and reliability of expert testimonies. Recently, a number of whole genome amplification (WGA) methods have been proposed as preamplification tools. They work on the premise of being able to generate microgram quantities of DNA from as little as the quantity of DNA from a single cell. We chose, investigated, and compared seven WGA methods to evaluate their ability to "recover" degraded and nondegraded DNA: degenerate oligonucleotide-primed PCR, primer extension preamplification PCR, GenomePlex™ WGA commercial kit (Sigma), multiple displacement amplification, GenomiPhi™ Amplification kit (Amersham Biosciences), restriction and circularization-aided rolling circle amplification, and blunt-end ligation-mediated WGA. The efficiency and reliability of those methods were analyzed and compared using SGMPlus, YFiler, mtDNA, and Y-chromosome SNP typing. The best results for nondegraded DNA were obtained with GenomiPhi and PEP methods. In the case of degraded DNA (200 bp), the best results were obtained with GenomePlex, which also successfully amplified severely degraded DNA (100 bp), thus enabling correct typing of mtDNA and Y-SNP loci. WGA may be very useful in analysis of low copy number DNA or degraded DNA in forensic genetics, especially after introduction of some improvements (sample pooling and replicate DNA typing).

  6. Whole-genome analyses resolve early branches in the tree of life of modern birds

    PubMed Central

    Jarvis, Erich D.; Mirarab, Siavash; Aberer, Andre J.; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y. W.; Faircloth, Brant C.; Nabholz, Benoit; Howard, Jason T.; Suh, Alexander; Weber, Claudia C.; da Fonseca, Rute R.; Li, Jianwen; Zhang, Fang; Li, Hui; Zhou, Long; Narula, Nitish; Liu, Liang; Ganapathy, Ganesh; Boussau, Bastien; Bayzid, Md. Shamsuzzoha; Zavidovych, Volodymyr; Subramanian, Sankar; Gabaldón, Toni; Capella-Gutiérrez, Salvador; Huerta-Cepas, Jaime; Rekepalli, Bhanu; Munch, Kasper; Schierup, Mikkel; Lindow, Bent; Warren, Wesley C.; Ray, David; Green, Richard E.; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Li, Shengbin; Li, Ning; Huang, Yinhua; Derryberry, Elizabeth P.; Bertelsen, Mads Frost; Sheldon, Frederick H.; Brumfield, Robb T.; Mello, Claudio V.; Lovell, Peter V.; Wirthlin, Morgan; Schneider, Maria Paula Cruz; Prosdocimi, Francisco; Samaniego, José Alfredo; Velazquez, Amhed Missael Vargas; Alfaro-Núñez, Alonzo; Campos, Paula F.; Petersen, Bent; Sicheritz-Ponten, Thomas; Pas, An; Bailey, Tom; Scofield, Paul; Bunce, Michael; Lambert, David M.; Zhou, Qi; Perelman, Polina; Driskell, Amy C.; Shapiro, Beth; Xiong, Zijun; Zeng, Yongli; Liu, Shiping; Li, Zhenyu; Liu, Binghang; Wu, Kui; Xiao, Jin; Yinqi, Xiong; Zheng, Qiuemei; Zhang, Yong; Yang, Huanming; Wang, Jian; Smeds, Linnea; Rheindt, Frank E.; Braun, Michael; Fjeldsa, Jon; Orlando, Ludovic; Barker, F. Keith; Jønsson, Knud Andreas; Johnson, Warren; Koepfli, Klaus-Peter; O’Brien, Stephen; Haussler, David; Ryder, Oliver A.; Rahbek, Carsten; Willerslev, Eske; Graves, Gary R.; Glenn, Travis C.; McCormack, John; Burt, Dave; Ellegren, Hans; Alström, Per; Edwards, Scott V.; Stamatakis, Alexandros; Mindell, David P.; Cracraft, Joel; Braun, Edward L.; Warnow, Tandy; Jun, Wang; Gilbert, M. Thomas P.; Zhang, Guojie

    2015-01-01

    To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergence in Neoaves, two groups we named Passerea and Columbea, representing independent lineages of diverse and convergently evolved land and water bird species. Among Passerea, we infer the common ancestor of core landbirds to have been an apex predator and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence and high levels of incomplete lineage sorting that occurred during a rapid radiation after the Cretaceous-Paleogene mass extinction event about 66 million years ago. PMID:25504713

  7. Application of whole-genome sequencing for bacterial strain typing in molecular epidemiology.

    PubMed

    Salipante, Stephen J; SenGupta, Dhruba J; Cummings, Lisa A; Land, Tyler A; Hoogestraat, Daniel R; Cookson, Brad T

    2015-04-01

    Nosocomial infections pose a significant threat to patient health; however, the gold standard laboratory method for determining bacterial relatedness (pulsed-field gel electrophoresis [PFGE]) remains essentially unchanged 20 years after its introduction. Here, we explored bacterial whole-genome sequencing (WGS) as an alternative approach for molecular strain typing. We compared WGS to PFGE for investigating presumptive outbreaks involving three important pathogens: vancomycin-resistant Enterococcus faecium (n=19), methicillin-resistant Staphylococcus aureus (n=17), and Acinetobacter baumannii (n=15). WGS was highly reproducible (average≤0.39 differences between technical replicates), which enabled a functional, quantitative definition for determining clonality. Strain relatedness data determined by PFGE and WGS roughly correlated, but the resolution of WGS was superior (P=5.6×10^-8 to 0.016). Several discordant results were noted between the methods. A total of 28.9% of isolates which were indistinguishable by PFGE were nonclonal by WGS. For A. baumannii, a species known to undergo rapid horizontal gene transfer, 16.2% of isolate pairs considered nonidentical by PFGE were clonal by WGS. Sequencing whole bacterial genomes with single-nucleotide resolution demonstrates that PFGE is prone to false-positive and false-negative results and suggests the need for a new gold standard approach for molecular epidemiological strain typing.

  8. Whole-Genome Screening of Newborns? The Constitutional Boundaries of State Newborn Screening Programs

    PubMed Central

    King, Jaime S.; Smith, Monica E.

    2016-01-01

    State newborn screening (NBS) programs routinely screen nearly all of the 4 million newborns in the United States each year for ~30 primary conditions and a number of secondary conditions. NBS could be on the cusp of an unprecedented expansion as a result of advances in whole-genome sequencing (WGS). As WGS becomes cheaper and easier and as our knowledge and understanding of human genetics expand, the question of whether WGS has a role to play in state NBS programs becomes increasingly relevant and complex. As geneticists and state public health officials begin to contemplate the technical and procedural details of whether WGS could benefit existing NBS programs, this is an opportune time to revisit the legal framework of state NBS programs. In this article, we examine the constitutional underpinnings of state-mandated NBS and explore the range of current state statutes and regulations that govern the programs. We consider the legal refinements that will be needed to keep state NBS programs within constitutional bounds, focusing on 2 areas of concern: consent procedures and the criteria used to select new conditions for NBS panels. We conclude by providing options for states to consider when contemplating the use of WGS for NBS. PMID:26729704

  9. Using whole-genome sequencing to determine appropriate streptomycin epidemiological cutoffs for Salmonella and Escherichia coli.

    PubMed

    Tyson, Gregory H; Li, Cong; Ayers, Sherry; McDermott, Patrick F; Zhao, Shaohua

    2016-02-01

    For Enterobacteriaceae such as Salmonella spp. and Escherichia coli, no unified interpretive resistance criteria exist for streptomycin, an epidemiologically important antibiotic. As part of the National Antimicrobial Resistance Monitoring System, we had previously used a minimum inhibitory concentration of ≥ 64 μg/mL as an epidemiological cutoff value (ECV) to define non-wild-type isolates. To identify whether this ECV correlated with genetic determinants of resistance, we performed whole-genome sequencing of 463 Salmonella and E. coli isolates to identify streptomycin resistance genotypes. From this analysis, we found that using a streptomycin resistance breakpoint of ≥ 64 μg/mL classified over 20% of strains possessing aadA or strA/strB resistance genes as wild-type. Therefore, to improve the concordance between genotypic and phenotypic data, we propose reducing the phenotypic cutoff values to ≥ 32 μg/mL for both Salmonella and E. coli, to be used widely as ECVs to categorize non-wild-type isolates.
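
    To make the effect of moving the cutoff concrete, the sketch below counts resistance-gene carriers that a given ECV would still label wild-type. The isolate list is entirely hypothetical and the function names are illustrative; only the two cutoffs (64 and 32 μg/mL) come from the abstract.

```python
def is_non_wild_type(mic_ug_per_ml, cutoff_ug_per_ml):
    """An isolate is non-wild-type when its MIC meets or exceeds the ECV."""
    return mic_ug_per_ml >= cutoff_ug_per_ml


def missed_gene_carriers(isolates, cutoff_ug_per_ml):
    """Count gene-positive isolates that the cutoff still calls wild-type."""
    return sum(
        1
        for mic, has_resistance_gene in isolates
        if has_resistance_gene and not is_non_wild_type(mic, cutoff_ug_per_ml)
    )


# Hypothetical isolates: (streptomycin MIC in ug/mL, carries aadA or strA/strB).
isolates = [(8, False), (16, False), (32, True), (32, True), (64, True), (128, True)]

missed_at_64 = missed_gene_carriers(isolates, 64)
missed_at_32 = missed_gene_carriers(isolates, 32)
print(missed_at_64, missed_at_32)
```

    In this toy set the two gene-positive isolates with an MIC of 32 μg/mL are missed at the higher cutoff and captured at the lower one, which is the kind of genotype-phenotype discordance the proposed change is meant to reduce.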

  10. Identification of Salmonella for public health surveillance using whole genome sequencing

    PubMed Central

    Ashton, Philip M.; Nair, Satheesh; Peters, Tansy M.; Bale, Janet A.; Powell, David G.; Painset, Anaïs; Tewolde, Rediat; Schaefer, Ulf; de Pinna, Elizabeth M.; Grant, Kathie A.

    2016-01-01

    In April 2015, Public Health England implemented whole genome sequencing (WGS) as a routine typing tool for public health surveillance of Salmonella, adopting a multilocus sequence typing (MLST) approach as a replacement for traditional serotyping. The WGS derived sequence type (ST) was compared to the phenotypic serotype for 6,887 isolates of S. enterica subspecies I, and of these, 6,616 (96%) were concordant. Of the 4% (n = 271) of isolates of subspecies I exhibiting a mismatch, 119 were due to a process error in the laboratory, 26 were likely caused by the serotype designation in the MLST database being incorrect and 126 occurred when two different serovars belonged to the same ST. The population structure of S. enterica subspecies II–IV differs markedly from that of subspecies I and, based on current data, defining the serovar from the clonal complex may be less appropriate for the classification of this group. Novel sequence types that were not present in the MLST database were identified in 8.6% of the total number of samples tested (including S. enterica subspecies I–IV and S. bongori) and these 654 isolates belonged to 326 novel STs. For S. enterica subspecies I, WGS MLST derived serotyping is a high throughput, accurate, robust, reliable typing method, well suited to routine public health surveillance. The combined output of ST and serovar supports the maintenance of traditional serovar nomenclature while providing additional insight on the true phylogenetic relationship between isolates. PMID:27069781
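
    The subspecies I figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The snippet below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Figures for S. enterica subspecies I, as stated in the abstract.
total_isolates = 6887
concordant = 6616
mismatch_causes = {
    "laboratory process error": 119,
    "incorrect serotype designation in MLST database": 26,
    "two serovars sharing one ST": 126,
}

mismatched = total_isolates - concordant
concordance_pct = 100 * concordant / total_isolates
mismatch_pct = 100 * mismatched / total_isolates

# The three stated causes should account for every mismatched isolate.
assert mismatched == sum(mismatch_causes.values())

print(f"mismatched isolates: {mismatched}")
print(f"concordance: {concordance_pct:.1f}%")
print(f"mismatch: {mismatch_pct:.1f}%")
```

    The counts are internally consistent: the three causes sum to the 271 mismatches, and the percentages round to the reported 96% and 4%.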

  11. Uniform and accurate single-cell sequencing based on emulsion whole-genome amplification

    PubMed Central

    Fu, Yusi; Li, Chunmei; Lu, Sijia; Zhou, Wenxiong; Tang, Fuchou; Xie, X. Sunney; Huang, Yanyi

    2015-01-01

    Whole-genome amplification (WGA) for next-generation sequencing has seen wide applications in biology and medicine when characterization of the genome of a single cell is required. High uniformity and fidelity of WGA is needed to accurately determine genomic variations, such as copy number variations (CNVs) and single-nucleotide variations (SNVs). Prevailing WGA methods have been limited by fluctuation of the amplification yield along the genome, as well as false-positive and -negative errors for SNV identification. Here, we report emulsion WGA (eWGA) to overcome these problems. We divide single-cell genomic DNA into a large number (10^5) of picoliter aqueous droplets in oil. Containing only a few DNA fragments, each droplet is led to reach saturation of DNA amplification before demulsification such that the differences in amplification gain among the fragments are minimized. We demonstrate the proof-of-principle of eWGA with multiple displacement amplification (MDA), a popular WGA method. This easy-to-operate approach enables simultaneous detection of CNVs and SNVs in an individual human cell, exhibiting significantly improved amplification evenness and accuracy. PMID:26340991

  12. Uniform and accurate single-cell sequencing based on emulsion whole-genome amplification.

    PubMed

    Fu, Yusi; Li, Chunmei; Lu, Sijia; Zhou, Wenxiong; Tang, Fuchou; Xie, X Sunney; Huang, Yanyi

    2015-09-22

    Whole-genome amplification (WGA) for next-generation sequencing has seen wide applications in biology and medicine when characterization of the genome of a single cell is required. High uniformity and fidelity of WGA is needed to accurately determine genomic variations, such as copy number variations (CNVs) and single-nucleotide variations (SNVs). Prevailing WGA methods have been limited by fluctuation of the amplification yield along the genome, as well as false-positive and -negative errors for SNV identification. Here, we report emulsion WGA (eWGA) to overcome these problems. We divide single-cell genomic DNA into a large number (10^5) of picoliter aqueous droplets in oil. Containing only a few DNA fragments, each droplet is led to reach saturation of DNA amplification before demulsification such that the differences in amplification gain among the fragments are minimized. We demonstrate the proof-of-principle of eWGA with multiple displacement amplification (MDA), a popular WGA method. This easy-to-operate approach enables simultaneous detection of CNVs and SNVs in an individual human cell, exhibiting significantly improved amplification evenness and accuracy.

  13. Use of bacterial whole-genome sequencing to investigate local persistence and spread in bovine tuberculosis.

    PubMed

    Trewby, Hannah; Wright, David; Breadon, Eleanor L; Lycett, Samantha J; Mallon, Tom R; McCormick, Carl; Johnson, Paul; Orton, Richard J; Allen, Adrian R; Galbraith, Julie; Herzyk, Pawel; Skuce, Robin A; Biek, Roman; Kao, Rowland R

    2016-03-01

    Mycobacterium bovis is the causal agent of bovine tuberculosis, one of the most important diseases currently facing the UK cattle industry. Here, we use high-density whole genome sequencing (WGS) in a defined sub-population of M. bovis in 145 cattle across 66 herd breakdowns to gain insights into local spread and persistence. We show that despite low divergence among isolates, WGS can in principle expose contributions of under-sampled host populations to M. bovis transmission. However, we demonstrate that in our data such a signal is due to molecular type switching, which had been previously undocumented for M. bovis. Isolates from farms with a known history of direct cattle movement between them did not show a statistical signal of higher genetic similarity. Despite an overall signal of genetic isolation by distance, genetic distances also showed no apparent relationship with spatial distance among affected farms over distances <5 km. Using simulations, we find that even over the brief evolutionary timescale covered by our data, Bayesian phylogeographic approaches are feasible. Applying such approaches showed that M. bovis dispersal in this system is heterogeneous but slow overall, averaging 2 km/year. These results confirm that widespread application of WGS to M. bovis will bring novel and important insights into the dynamics of M. bovis spread and persistence, but that the current questions most pertinent to control will be best addressed using approaches that more directly integrate WGS with additional epidemiological data. PMID:26972511

  14. Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing

    PubMed Central

    Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012

  15. A whole-genome SNP array (RICE6K) for genomic breeding in rice.

    PubMed

    Yu, Huihui; Xie, Weibo; Li, Jing; Zhou, Fasong; Zhang, Qifa

    2014-01-01

    Advances in genotyping technology provide an opportunity to use genomic tools in crop breeding. Compared with the field selections performed in conventional breeding programmes, genomics-based genotype screening can potentially reduce the number of breeding cycles and more precisely integrate target genes for particular traits into an ideal genetic background. We developed a whole-genome single nucleotide polymorphism (SNP) array, RICE6K, based on Infinium technology, using representative SNPs selected from more than four million SNPs identified from resequencing data of more than 500 rice landraces. RICE6K contains 5102 SNP and insertion-deletion (InDel) markers, about 4500 of which were of high quality in the tested rice lines, producing highly repeatable results. Forty-five functional markers that are located inside 28 characterized genes of important traits can be detected using RICE6K. The SNP markers are evenly distributed on the 12 chromosomes of rice with an average density of 12 SNPs per 1 Mb and can provide information for polymorphisms between indica and japonica subspecies as well as varieties within indica and japonica groups. Application tests of RICE6K showed that the array is suitable for rice germplasm fingerprinting, genotyping bulked segregating pools, seed authenticity checks and genetic background selection. These results suggest that RICE6K provides an efficient and reliable genotyping tool for rice genomic breeding.

  16. Mining metagenomic whole genome sequences revealed subdominant but constant Lactobacillus population in the human gut microbiota.

    PubMed

    Rossi, Maddalena; Martínez-Martínez, Daniel; Amaretti, Alberto; Ulrici, Alessandro; Raimondi, Stefano; Moya, Andrés

    2016-06-01

    The genus Lactobacillus includes over 215 species that colonize plants, foods, sewage and the gastrointestinal tract (GIT) of humans and animals. In the GIT, the Lactobacillus population can consist of true inhabitants or of bacteria occasionally ingested with fermented or spoiled foods, or with probiotics. This study longitudinally surveyed Lactobacillus species and strains in the feces of a healthy subject through whole genome sequencing (WGS) data-mining, in order to identify members of the permanent or transient populations. At three time points (0, 670 and 700 d), 58 different species were identified, 16 of them being retrieved for the first time in human feces. L. rhamnosus, L. ruminis, L. delbrueckii, L. plantarum, L. casei and L. acidophilus were the most represented, with estimated amounts ranging between 6 and 8 log (cells g^-1), while the others were detected at 4 or 5 log (cells g^-1). In total, 86 Lactobacillus strains belonging to 52 species were identified. Of these, 43 seemingly occupied the GIT as true residents, since they were detected over a time span of almost 2 years in all three samples or in 2 samples separated by 670 or 700 d. As a whole, a stable community of lactobacilli was disclosed, with wide and understudied biodiversity.

  17. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing

    PubMed Central

    Staton, Margaret; Best, Teodora; Khodwekar, Sudhir; Owusu, Sandra; Xu, Tao; Xu, Yi; Jennings, Tara; Cronn, Richard; Arumuganathan, A. Kathiravetpilla; Coggeshall, Mark; Gailing, Oliver; Liang, Haiying; Romero-Severson, Jeanne; Schlarbaum, Scott; Carlson, John E.

    2015-01-01

    Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence. PMID:26698853

  18. Whole Genome Sequencing of Field Isolates Reveals Extensive Genetic Diversity in Plasmodium vivax from Colombia.

    PubMed

    Winter, David J; Pacheco, M Andreína; Vallejo, Andres F; Schwartz, Rachel S; Arevalo-Herrera, Myriam; Herrera, Socrates; Cartwright, Reed A; Escalante, Ananias A

    2015-12-01

    Plasmodium vivax is the most prevalent malarial species in South America and exerts a substantial burden on the populations it affects. The control and eventual elimination of P. vivax are global health priorities. Genomic research contributes to this objective by improving our understanding of the biology of P. vivax and through the development of new genetic markers that can be used to monitor efforts to reduce malaria transmission. Here we analyze whole-genome data from eight field samples from a region in Córdoba, Colombia where malaria is endemic. We find considerable genetic diversity within this population, a result that contrasts with earlier studies suggesting that P. vivax had limited diversity in the Americas. We also identify a selective sweep around a substitution known to confer resistance to sulphadoxine-pyrimethamine (SP). This is the first observation of a selective sweep for SP resistance in this species. These results indicate that P. vivax has been exposed to SP pressure even when the drug is not in use as a first line treatment for patients afflicted by this parasite. We identify multiple non-synonymous substitutions in three other genes known to be involved with drug resistance in Plasmodium species. Finally, we found extensive microsatellite polymorphisms. Using this information we developed 18 polymorphic and easy to score microsatellite loci that can be used in epidemiological investigations in South America.

  19. Whole-Genome Duplications Spurred the Functional Diversification of the Globin Gene Superfamily in Vertebrates

    PubMed Central

    Hoffmann, Federico G.; Opazo, Juan C.; Storz, Jay F.

    2012-01-01

    It has been hypothesized that two successive rounds of whole-genome duplication (WGD) in the stem lineage of vertebrates provided genetic raw materials for the evolutionary innovation of many vertebrate-specific features. However, it has seldom been possible to trace such innovations to specific functional differences between paralogous gene products that derive from a WGD event. Here, we report genomic evidence for a direct link between WGD and key physiological innovations in the vertebrate oxygen transport system. Specifically, we demonstrate that key globin proteins that evolved specialized functions in different aspects of oxidative metabolism (hemoglobin, myoglobin, and cytoglobin) represent paralogous products of two WGD events in the vertebrate common ancestor. Analysis of conserved macrosynteny between the genomes of vertebrates and amphioxus (subphylum Cephalochordata) revealed that homologous chromosomal segments defined by myoglobin + globin-E, cytoglobin, and the α-globin gene cluster each descend from the same linkage group in the reconstructed proto-karyotype of the chordate common ancestor. The physiological division of labor between the oxygen transport function of hemoglobin and