Prescreening of microbial populations for the assessment of sequencing potential.
Hanning, Irene B; Ricke, Steven C
2011-01-01
Next-generation sequencing (NGS) is a powerful tool that can be utilized to profile and compare microbial populations. By amplifying a target gene present in all bacteria and subsequently sequencing amplicons, the bacterial genera present in the populations can be identified and compared. In some scenarios, little to no difference may exist among the microbial populations being compared, in which case a prescreening method would be practical to determine which microbial populations would be suitable for further analysis by NGS. Denaturing density-gradient electrophoresis (DGGE) is cheaper than NGS, and the data comparing microbial populations are ready to be viewed immediately after electrophoresis. DGGE follows essentially the same initial methodology as NGS by targeting and amplifying the 16S rRNA gene. However, as opposed to sequencing amplicons, DGGE amplicons are analyzed by electrophoresis. By prescreening microbial populations with DGGE, more efficient use of NGS methods can be accomplished. In this chapter, we outline the protocol for DGGE targeting the same gene (16S rRNA) that would be targeted for NGS to compare and determine differences in microbial populations from a wide range of ecosystems.
Genotype imputation in a coalescent model with infinitely-many-sites mutation
Huang, Lucy; Buzbas, Erkan O.; Rosenberg, Noah A.
2012-01-01
Empirical studies have identified population-genetic factors as important determinants of the properties of genotype-imputation accuracy in imputation-based disease association studies. Here, we develop a simple coalescent model of three sequences that we use to explore the theoretical basis for the influence of these factors on genotype-imputation accuracy, under the assumption of infinitely-many-sites mutation. Employing a demographic model in which two populations diverged at a given time in the past, we derive the approximate expectation and variance of imputation accuracy in a study sequence sampled from one of the two populations, choosing between two reference sequences, one sampled from the same population as the study sequence and the other sampled from the other population. We show that under this model, imputation accuracy—as measured by the proportion of polymorphic sites that are imputed correctly in the study sequence—increases in expectation with the mutation rate, the proportion of the markers in a chromosomal region that are genotyped, and the time to divergence between the study and reference populations. Each of these effects derives largely from an increase in information available for determining the reference sequence that is genetically most similar to the sequence targeted for imputation. We analyze as a function of divergence time the expected gain in imputation accuracy in the target using a reference sequence from the same population as the target rather than from the other population. Together with a growing body of empirical investigations of genotype imputation in diverse human populations, our modeling framework lays a foundation for extending imputation techniques to novel populations that have not yet been extensively examined. PMID:23079542
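The reference-selection step described in this abstract can be illustrated with a small sketch. It is not the authors' coalescent derivation. It assumes sequences are coded as lists of 0/1 alleles over polymorphic sites, and the function names (`pick_reference`, `impute`, `accuracy`) are hypothetical. Accuracy is computed here over untyped sites only, which is a simplification of the measure defined in the abstract.

```python
def pick_reference(study, references, typed):
    """Return the index of the reference with the fewest mismatches
    to the study sequence at the genotyped (typed) sites."""
    def mismatches(ref):
        return sum(study[i] != ref[i] for i in typed)
    return min(range(len(references)), key=lambda k: mismatches(references[k]))


def impute(study, references, typed):
    """Keep the study alleles at typed sites; copy alleles from the
    most similar reference at every untyped site."""
    best = references[pick_reference(study, references, typed)]
    typed_set = set(typed)
    return [study[i] if i in typed_set else best[i] for i in range(len(study))]


def accuracy(truth, imputed, typed):
    """Proportion of untyped sites whose imputed allele matches the truth."""
    typed_set = set(typed)
    untyped = [i for i in range(len(truth)) if i not in typed_set]
    if not untyped:
        return 1.0
    return sum(truth[i] == imputed[i] for i in untyped) / len(untyped)


# Toy data (assumed for illustration): one study sequence, two references.
truth = [0, 1, 1, 0, 1, 0]
ref_same_pop = [0, 1, 1, 0, 0, 0]
ref_other_pop = [1, 0, 1, 1, 0, 1]
typed_sites = [0, 2, 3]

imputed = impute(truth, [ref_same_pop, ref_other_pop], typed_sites)
print(imputed, accuracy(truth, imputed, typed_sites))
```

In this toy setting, genotyping more sites gives the mismatch count more information for separating the two references, which is the intuition the abstract attributes to marker density.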
Leichty, Aaron R; Brisson, Dustin
2014-10-01
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which is ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
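The motif-selection idea behind SWGA (motifs common in the target genome but rare in the background) can be sketched as a k-mer frequency ratio. This is an illustrative simplification, not the published SWGA pipeline. The function names, the pseudocount and the toy sequences are assumptions, and practical primer selection would also need to consider reverse complements, melting temperature and how evenly the motifs are spaced.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_motifs(target, background, k, top=5):
    """Rank k-mers by (frequency in target) / (frequency in background).

    A pseudocount of 1 is added to the background so that motifs absent
    from the background do not cause a division by zero.
    """
    t_counts = kmer_counts(target, k)
    b_counts = kmer_counts(background, k)
    t_total = max(len(target) - k + 1, 1)
    b_total = max(len(background) - k + 1, 1)

    scored = []
    for motif, n in t_counts.items():
        t_freq = n / t_total
        b_freq = (b_counts.get(motif, 0) + 1) / (b_total + 1)
        scored.append((t_freq / b_freq, motif))

    # Highest ratio first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [motif for _, motif in scored[:top]]


# Toy sequences (assumed): the target is rich in ATG, the background has none.
print(rank_motifs("ATGATGATGATG", "CCCCGGGGCCCCGGGG", k=3, top=3))
```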
Next-generation sequencing for targeted discovery of rare mutations in rice
USDA-ARS's Scientific Manuscript database
Advances in DNA sequencing (i.e., next-generation sequencing, NGS) have greatly increased the power and efficiency of detecting rare mutations in large mutant populations. Targeting Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach for identifying gene mutations resulting fro...
USDA-ARS's Scientific Manuscript database
Many studies leverage targeted whole genome sequencing (WGS) experiments in order to identify rare and causal variants within populations. As a natural consequence of experimental design, many of these surveys tend to sequence redundant haplotype segments due to high frequency in the base population...
Holtz, Yan; Ardisson, Morgane; Ranwez, Vincent; Besnard, Alban; Leroy, Philippe; Poux, Gérard; Roumet, Pierre; Viader, Véronique; Santoni, Sylvain; David, Jacques
2016-01-01
Targeted sequence capture is a promising technology which helps reduce costs for sequencing and genotyping numerous genomic regions in large sets of individuals. Bait sequences are designed to capture specific alleles previously discovered in parents or reference populations. We studied a set of 135 RILs originating from a cross between an emmer cultivar (Dic2) and a recent durum elite cultivar (Silur). Six thousand sequence baits were designed to target Dic2 vs. Silur polymorphisms discovered in a previous RNAseq study. These baits were exposed to genomic DNA of the RIL population. Eighty percent of the targeted SNPs were recovered, 65% of which were of high quality and coverage. The final high density genetic map consisted of more than 3,000 markers, whose genetic and physical mapping were consistent with those obtained with large arrays. PMID:27171472
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets
2013-01-01
Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
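One of the constraints listed in this abstract, degenerate sites that maximize population coverage, can be illustrated with standard IUPAC ambiguity codes. The sketch below is not the PrimerDesign program's algorithm. It assumes a gap-free alignment of equal-length, uppercase A/C/G/T sequences, and the function names are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases covered.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
# Number of bases each code stands for (e.g. R -> 2, N -> 4).
CODE_SIZE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_consensus(aligned):
    """Collapse each alignment column into the IUPAC code that covers
    every base observed in that column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return "".join(IUPAC[frozenset(column)] for column in zip(*aligned))


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    total = 1
    for code in primer:
        total *= CODE_SIZE[code]
    return total


# Toy alignment (assumed) of three variants of the same primer site.
site = ["ACGT", "ACAT", "ATGT"]
primer = degenerate_consensus(site)
print(primer, degeneracy(primer))
```

Higher degeneracy covers more variants but dilutes each concrete primer in the mix, so a real design would trade this number off against the other constraints named in the abstract.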
Er, Tze-Kiong; Wang, Yen-Yun; Chen, Chih-Chieh; Herreros-Villanueva, Marta; Liu, Ta-Chih; Yuan, Shyng-Shiou F
2015-10-01
Many genetic factors play an important role in the development of oral squamous cell carcinoma. The aim of this study was to assess the mutational profile in oral squamous cell carcinoma using formalin-fixed, paraffin-embedded tumors from a Taiwanese population by performing targeted sequencing of 26 cancer-associated genes that are frequently mutated in solid tumors. Next-generation sequencing was performed in 50 formalin-fixed, paraffin-embedded tumor specimens obtained from patients with oral squamous cell carcinoma. Genetic alterations in the 26 cancer-associated genes were detected using a deep sequencing (>1000X) approach. TP53, PIK3CA, MET, APC, CDH1, and FBXW7 were the most frequently mutated genes. Most remarkably, TP53 mutations and PIK3CA mutations, which accounted for 68% and 18% of tumors, respectively, were more prevalent in a Taiwanese population. Mutations in other genes, including MET (4%), APC (4%), CDH1 (2%), and FBXW7 (2%), were also identified in our population. In summary, our study shows the feasibility of performing targeted sequencing using formalin-fixed, paraffin-embedded samples. Additionally, this study also reports the mutational landscape of oral squamous cell carcinoma in the Taiwanese population. We believe that this study will shed new light on fundamental aspects in understanding the molecular pathogenesis of oral squamous cell carcinoma and may aid in the development of new targeted therapies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Exploiting CRISPR-Cas nucleases to produce sequence-specific antimicrobials.
Bikard, David; Euler, Chad W; Jiang, Wenyan; Nussenzweig, Philip M; Goldberg, Gregory W; Duportet, Xavier; Fischetti, Vincent A; Marraffini, Luciano A
2014-11-01
Antibiotics target conserved bacterial cellular pathways or growth functions and therefore cannot selectively kill specific members of a complex microbial population. Here, we develop programmable, sequence-specific antimicrobials using the RNA-guided nuclease Cas9 (refs.1,2) delivered by a bacteriophage. We show that Cas9, reprogrammed to target virulence genes, kills virulent, but not avirulent, Staphylococcus aureus. Reprogramming the nuclease to target antibiotic resistance genes destroys staphylococcal plasmids that harbor antibiotic resistance genes and immunizes avirulent staphylococci to prevent the spread of plasmid-borne resistance genes. We also show that CRISPR-Cas9 antimicrobials function in vivo to kill S. aureus in a mouse skin colonization model. This technology creates opportunities to manipulate complex bacterial populations in a sequence-specific manner.
Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.
Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios
2012-06-01
Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. Our objective was to assess the sensitivity of different phylogenetic methods and programs for reconstructing epidemiological links among HIV-1-infected patients, that is, the probability of revealing a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with a documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrast, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.
Ancestry estimation and control of population stratification for sequence-based association studies.
Wang, Chaolong; Zhan, Xiaowei; Bragg-Gresham, Jennifer; Kang, Hyun Min; Stambolian, Dwight; Chew, Emily Y; Branham, Kari E; Heckenlively, John; Fulton, Robert; Wilson, Richard K; Mardis, Elaine R; Lin, Xihong; Swaroop, Anand; Zöllner, Sebastian; Abecasis, Gonçalo R
2014-04-01
Estimating individual ancestry is important in genetic association studies where population structure leads to false positive signals, although assigning ancestry remains challenging with targeted sequence data. We propose a new method for the accurate estimation of individual genetic ancestry, based on direct analysis of off-target sequence reads, and implement our method in the publicly available LASER software. We validate the method using simulated and empirical data and show that the method can accurately infer worldwide continental ancestry when used with sequencing data sets with whole-genome shotgun coverage as low as 0.001×. For estimates of fine-scale ancestry within Europe, the method performs well with coverage of 0.1×. On an even finer scale, the method improves discrimination between exome-sequenced study participants originating from different provinces within Finland. Finally, we show that our method can be used to improve case-control matching in genetic association studies and to reduce the risk of spurious findings due to population structure.
Development of sequence-specific antimicrobials based on programmable CRISPR-Cas nucleases
Bikard, David; Euler, Chad; Jiang, Wenyan; Nussenzweig, Philip M.; Goldberg, Gregory W.; Duportet, Xavier; Fischetti, Vincent A.; Marraffini, Luciano A.
2014-01-01
Antibiotics target conserved bacterial cellular pathways or growth functions and therefore cannot selectively kill specific members of a complex microbial population. Here, we develop programmable, sequence-specific antimicrobials using the RNA-guided nuclease Cas9 (refs. 1, 2) delivered by a bacteriophage. We show that Cas9 re-programmed to target virulence genes kills virulent, but not avirulent, Staphylococcus aureus. Re-programming the nuclease to target antibiotic resistance genes destroys staphylococcal plasmids that harbor antibiotic resistance genes (refs. 3, 4) and immunizes avirulent staphylococci to prevent the spread of plasmid-borne resistance genes. We also demonstrate the approach in vivo, showing its efficacy against S. aureus in a mouse skin colonization model. This new technology creates opportunities to manipulate complex bacterial populations in a sequence-specific manner. PMID:25282355
Prospective identification of parasitic sequences in phage display screens
Matochko, Wadim L.; Cory Li, S.; Tang, Sindy K.Y.; Derda, Ratmir
2014-01-01
Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10^9 random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the ‘parasites’), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of Ph.D.-7 library, we identified a focused population of 770 ‘parasites’. In all, 197 sequences from this population have been identified in literature reports that used Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10^6 diverse clones prevents the biased selection of parasitic clones. PMID:24217917
Site-specific selfish genes as tools for the control and genetic engineering of natural populations.
Burt, Austin
2003-05-07
Site-specific selfish genes exploit host functions to copy themselves into a defined target DNA sequence, and include homing endonuclease genes, group II introns and some LINE-like transposable elements. If such genes can be engineered to target new host sequences, then they can be used to manipulate natural populations, even if the number of individuals released is a small fraction of the entire population. For example, a genetic load sufficient to eradicate a population can be imposed in fewer than 20 generations, if the target is an essential host gene, the knockout is recessive and the selfish gene has an appropriate promoter. There will be selection for resistance, but several strategies are available for reducing the likelihood of it evolving. These genes may also be used to genetically engineer natural populations, by means of population-wide gene knockouts, gene replacements and genetic transformations. By targeting sex-linked loci just prior to meiosis one may skew the population sex ratio, and by changing the promoter one may limit the spread of the gene to neighbouring populations. The proposed constructs are evolutionarily stable in the face of the mutations most likely to arise during their spread, and strategies are also available for reversing the manipulations.
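The claim that a site-specific selfish gene can spread from a small release can be illustrated with a deliberately simplified recursion. This is not the model analysed in the paper: it assumes random mating, no fitness cost and no resistance, so it says nothing about the genetic-load or 20-generation results quoted above. With homing efficiency `e` (an assumed parameter name), heterozygotes transmit the gene with probability (1 + e)/2, which gives q' = q^2 + 2q(1 - q)(1 + e)/2 = q + e·q·(1 - q).

```python
def next_freq(q, e):
    """One generation of the no-cost homing recursion q' = q + e*q*(1 - q).

    q: current frequency of the selfish gene (0..1)
    e: homing efficiency in heterozygotes (0 = Mendelian, 1 = full conversion)
    """
    return q + e * q * (1.0 - q)


def generations_to_reach(q0, e, threshold=0.99, max_gen=1000):
    """Number of generations until the frequency reaches the threshold,
    or None if it has not done so within max_gen generations."""
    q = q0
    for generation in range(max_gen + 1):
        if q >= threshold:
            return generation
        q = next_freq(q, e)
    return None


# Assumed starting point: the released gene makes up 1% of alleles.
for efficiency in (0.0, 0.5, 1.0):
    print(efficiency, generations_to_reach(0.01, efficiency))
```

With `e = 0` the frequency never changes (ordinary Mendelian inheritance), which is the baseline against which the drive effect should be read.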
The genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos)
Jacqueline M. Doyle; Todd E. Katzner; Peter H. Bloom; Yanzhu Ji; Bhagya K. Wijayawardena; J. Andrew DeWoody; Ludovic Orlando
2014-01-01
Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male...
Pausch, Hubert; Wurmser, Christine; Reinhardt, Friedrich; Emmerling, Reiner; Fries, Ruedi
2015-06-01
Most association studies for pinpointing trait-associated variants are performed within breed. The availability of sequence data from key ancestors of several cattle breeds now enables immediate assessment of the frequency of trait-associated variants in populations different from the mapping population and their imputation into large validation populations. The objective of this study was to validate the effects of 4 putatively causative variants on milk production traits, male fertility, and stature in German Fleckvieh and Holstein-Friesian animals using targeted sequence imputation. We used whole-genome sequence data of 456 animals to impute 4 missense mutations in DGAT1, GHR, PRLR, and PROP1 into 10,363 Fleckvieh and 8,812 Holstein animals. The accuracy of the imputed genotypes exceeded 95% for all variants. Association testing with imputed variants revealed consistent antagonistic effects of the DGAT1 p.A232K and GHR p.F279Y variants on milk yield and protein and fat contents, respectively, in both breeds. The allele frequency of both polymorphisms has changed considerably in the past 20 yr, indicating that they were targets of recent selection for milk production traits. The PRLR p.S18N variant was associated with yield traits in Fleckvieh but not in Holstein, suggesting that it may be in linkage disequilibrium with a mutation affecting yield traits rather than being causal. The reported effects of the PROP1 p.H173R variant on milk production, male fertility, and stature could not be confirmed. Our results demonstrate that population-wide imputation of candidate causal variants from sequence data is feasible, enabling their rapid validation in large independent populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Bowen, Lizabeth; Aldridge, B.M.; Miles, A. Keith; Stott, J.L.
2006-01-01
The major histocompatibility complex (MHC) is central to maintaining the immunologic vigor of individuals and populations. Classical MHC class II genes were targeted for partial sequencing in sea otters (Enhydra lutris) from populations in California, Washington, and Alaska. Sequences derived from sea otter peripheral blood leukocyte mRNAs were similar to those classified as DQA, DQB, DRA, and DRB in other species. Comparisons of the derived amino acid compositions supported the classification of these as functional molecules from at least one DQA, DQB, and DRA locus and at least two DRB loci. While limited in scope, phylogenetic analysis of the DRB peptide‐binding region suggested the possible existence of distinct clades demarcated by geographic region. These preliminary findings support the need for additional MHC gene sequencing and expansion to a comprehensive study targeting additional otters.
High-Throughput resequencing of maize landraces at genomic regions associated with flowering time
USDA-ARS's Scientific Manuscript database
Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...
Highly multiplexed targeted DNA sequencing from single nuclei.
Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E
2016-02-01
Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people
Nelson, Matthew R.; Wegmann, Daniel; Ehm, Margaret G.; Kessner, Darren; St. Jean, Pamela; Verzilli, Claudio; Shen, Judong; Tang, Zhengzheng; Bacanu, Silviu-Alin; Fraser, Dana; Warren, Liling; Aponte, Jennifer; Zawistowski, Matthew; Liu, Xiao; Zhang, Hao; Zhang, Yong; Li, Jun; Li, Yun; Li, Li; Woollard, Peter; Topp, Simon; Hall, Matthew D.; Nangle, Keith; Wang, Jun; Abecasis, Gonçalo; Cardon, Lon R.; Zöllner, Sebastian; Whittaker, John C.; Chissoe, Stephanie L.; Novembre, John; Mooser, Vincent
2015-01-01
Rare genetic variants contribute to complex disease risk; however, the abundance of rare variants in human populations remains unknown. We explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. We find rare variants are abundant (one every 17 bases) and geographically localized, such that even with large sample sizes, rare variant catalogs will be largely incomplete. We used the observed patterns of variation to estimate population growth parameters, the proportion of variants in a given frequency class that are putatively deleterious, and mutation rates for each gene. Overall we conclude that, due to rapid population growth and weak purifying selection, human populations harbor an abundance of rare variants, many of which are deleterious and have relevance to understanding disease risk. PMID:22604722
Discovery of rare mutations in populations: TILLING by sequencing
USDA-ARS's Scientific Manuscript database
Discovery of rare mutations in populations requires methods for processing and analyzing in parallel many individuals. Previous TILLING methods employed enzymatic or physical discrimination of heteroduplexed from homoduplexed target DNA. We used mutant populations of rice and wheat to develop a meth...
Wong, Lai-Ping; Lai, Jason Kuan-Han; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Cheng, Anthony Youzhi; Pillai, Nisha Esakimuthu; Liu, Xuanyao; Xu, Wenting; Chen, Peng; Foo, Jia-Nee; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Soong, Richie; Wenk, Markus Rene; Lim, Wei-Yen; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying
2014-05-01
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Varela, Miguel A; Curtis, Helen J; Douglas, Andrew GL; Hammond, Suzan M; O'Loughlin, Aisling J; Sobrido, Maria J; Scholefield, Janine; Wood, Matthew JA
2016-01-01
Allele-specific gene therapy aims to silence expression of mutant alleles through targeting of disease-linked single-nucleotide polymorphisms (SNPs). However, SNP linkage to disease varies between populations, making such molecular therapies applicable only to a subset of patients. Moreover, not all SNPs have the molecular features necessary for potent gene silencing. Here we provide knowledge to allow the maximisation of patient coverage by building a comprehensive understanding of SNPs ranked according to their predicted suitability toward allele-specific silencing in 14 repeat expansion diseases: amyotrophic lateral sclerosis and frontotemporal dementia, dentatorubral-pallidoluysian atrophy, myotonic dystrophy 1, myotonic dystrophy 2, Huntington's disease and several spinocerebellar ataxias. Our systematic analysis of DNA sequence variation shows that most annotated SNPs are not suitable for potent allele-specific silencing across populations because of suboptimal sequence features and low variability (>97% in HD). We suggest maximising patient coverage by selecting SNPs with high heterozygosity across populations, and preferentially targeting SNPs that lead to purine:purine mismatches in wild-type alleles to obtain potent allele-specific silencing. We therefore provide fundamental knowledge on strategies for optimising patient coverage of therapeutics for microsatellite expansion disorders by linking analysis of population genetic variation to the selection of molecular targets. PMID:25990798
Sequence-specific antimicrobials using efficiently delivered RNA-guided nucleases.
Citorik, Robert J; Mimee, Mark; Lu, Timothy K
2014-11-01
Current antibiotics tend to be broad spectrum, leading to indiscriminate killing of commensal bacteria and accelerated evolution of drug resistance. Here, we use CRISPR-Cas technology to create antimicrobials whose spectrum of activity is chosen by design. RNA-guided nucleases (RGNs) targeting specific DNA sequences are delivered efficiently to microbial populations using bacteriophage or bacteria carrying plasmids transmissible by conjugation. The DNA targets of RGNs can be undesirable genes or polymorphisms, including antibiotic resistance and virulence determinants in carbapenem-resistant Enterobacteriaceae and enterohemorrhagic Escherichia coli. Delivery of RGNs significantly improves survival in a Galleria mellonella infection model. We also show that RGNs enable modulation of complex bacterial populations by selective knockdown of targeted strains based on genetic signatures. RGNs constitute a class of highly discriminatory, customizable antimicrobials that enact selective pressure at the DNA level to reduce the prevalence of undesired genes, minimize off-target effects and enable programmable remodeling of microbiota.
SNP discovery by high-throughput sequencing in soybean
2010-01-01
Background With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD), have been widely applied to identify genome-wide sequence variations. However, it still remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 K bp.
Conclusions We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population, which can be applied to other crops. PMID:20701770
Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A
2014-10-01
Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
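The downsampling idea in this abstract can be illustrated with a minimal sketch. It assumes per-gene read counts for one cell are already available and draws a fixed number of reads without replacement. The function name `downsample_counts` and the example counts are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def downsample_counts(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from per-gene read counts.

    counts: dict mapping gene name -> number of reads observed for one cell.
    depth:  number of reads to keep (must not exceed the total read count).
    """
    # Expand the count table into one entry per read, labelled by gene.
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("requested depth exceeds the number of available reads")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return Counter(rng.sample(reads, depth))


# Hypothetical counts for a single cell, downsampled to half its depth.
cell = {"EGR1": 30, "FOS": 70}
shallow = downsample_counts(cell, depth=50)
```

Expanding every read into a list is only practical for small examples; at realistic depths a hypergeometric draw per gene would be a more economical choice.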
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations
Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter
2017-01-01
The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
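The general idea of finding k-mers that are common to targets and absent from non-targets can be sketched as follows. This is not Neptune's algorithm: it uses exact matching only, and fixed fraction thresholds stand in for the probabilistic models mentioned in the abstract. The function names and thresholds are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all k-mers of length k in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k,
                    min_target_frac=1.0, max_non_target_frac=0.0):
    """Return k-mers frequent among targets and rare among non-targets.

    The two thresholds are illustrative stand-ins for a statistical model.
    """
    target_sets = [kmers(s, k) for s in targets]
    non_target_sets = [kmers(s, k) for s in non_targets]
    candidates = set().union(*target_sets)
    selected = set()
    for km in candidates:
        t_frac = sum(km in s for s in target_sets) / len(target_sets)
        if non_target_sets:
            n_frac = sum(km in s for s in non_target_sets) / len(non_target_sets)
        else:
            n_frac = 0.0
        if t_frac >= min_target_frac and n_frac <= max_non_target_frac:
            selected.add(km)
    return selected


# Toy sequences: three 4-mers occur in both targets and not in the non-target.
found = signature_kmers(["ACGTACGT", "TTACGTAA"], ["ACGAAAAA"], k=4)
```

A real signature would be a longer locus assembled from adjacent k-mers; the sketch stops at the k-mer level to keep the set logic visible.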
Error baseline rates of five sample preparation methods used to characterize RNA virus populations.
Kugelman, Jeffrey R; Wiley, Michael R; Nagle, Elyse R; Reyes, Daniel; Pfeffer, Brad P; Kuhn, Jens H; Sanchez-Lockhart, Mariano; Palacios, Gustavo F
2017-01-01
Individual RNA viruses typically occur as populations of genomes that differ slightly from each other due to mutations introduced by the error-prone viral polymerase. Understanding the variability of RNA virus genome populations is critical for understanding virus evolution because individual mutant genomes may gain evolutionary selective advantages and give rise to dominant subpopulations, possibly even leading to the emergence of viruses resistant to medical countermeasures. Reverse transcription of virus genome populations followed by next-generation sequencing is the only available method to characterize variation for RNA viruses. However, both steps may lead to the introduction of artificial mutations, thereby skewing the data. To better understand how such errors are introduced during sample preparation, we determined and compared error baseline rates of five different sample preparation methods by analyzing in vitro transcribed Ebola virus RNA from an artificial plasmid-based system. These methods included: shotgun sequencing from plasmid DNA or in vitro transcribed RNA as a basic "no amplification" method, amplicon sequencing from the plasmid DNA or in vitro transcribed RNA as a "targeted" amplification method, sequence-independent single-primer amplification (SISPA) as a "random" amplification method, rolling circle reverse transcription sequencing (CirSeq) as an advanced "no amplification" method, and Illumina TruSeq RNA Access as a "targeted" enrichment method. The measured error frequencies indicate that RNA Access offers the best tradeoff between sensitivity and sample preparation error (1.4-5) of all compared methods.
Demographic history and rare allele sharing among human populations.
Gravel, Simon; Henn, Brenna M; Gutenkunst, Ryan N; Indap, Amit R; Marth, Gabor T; Clark, Andrew G; Yu, Fuli; Gibbs, Richard A; Bustamante, Carlos D
2011-07-19
High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2-4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence.
Demographic history and rare allele sharing among human populations
Gravel, Simon; Henn, Brenna M.; Gutenkunst, Ryan N.; Indap, Amit R.; Marth, Gabor T.; Clark, Andrew G.; Yu, Fuli; Gibbs, Richard A.; Bustamante, Carlos D.; Altshuler, David L.; Durbin, Richard M.; Abecasis, Gonçalo R.; Bentley, David R.; Chakravarti, Aravinda; Clark, Andrew G.; Collins, Francis S.; De La Vega, Francisco M.; Donnelly, Peter; Egholm, Michael; Flicek, Paul; Gabriel, Stacey B.; Gibbs, Richard A.; Knoppers, Bartha M.; Lander, Eric S.; Lehrach, Hans; Mardis, Elaine R.; McVean, Gil A.; Nickerson, Debbie A.; Peltonen, Leena; Schafer, Alan J.; Sherry, Stephen T.; Wang, Jun; Wilson, Richard K.; Gibbs, Richard A.; Deiros, David; Metzker, Mike; Muzny, Donna; Reid, Jeff; Wheeler, David; Wang, Jun; Li, Jingxiang; Jian, Min; Li, Guoqing; Li, Ruiqiang; Liang, Huiqing; Tian, Geng; Wang, Bo; Wang, Jian; Wang, Wei; Yang, Huanming; Zhang, Xiuqing; Zheng, Huisong; Lander, Eric S.; Altshuler, David L.; Ambrogio, Lauren; Bloom, Toby; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Jaffe, David B.; Shefler, Erica; Sougnez, Carrie L.; Bentley, David R.; Gormley, Niall; Humphray, Sean; Kingsbury, Zoya; Koko-Gonzales, Paula; Stone, Jennifer; McKernan, Kevin J.; Costa, Gina L.; Ichikawa, Jeffry K.; Lee, Clarence C.; Sudbrak, Ralf; Lehrach, Hans; Borodina, Tatiana A.; Dahl, Andreas; Davydov, Alexey N.; Marquardt, Peter; Mertes, Florian; Nietfeld, Wilfiried; Rosenstiel, Philip; Schreiber, Stefan; Soldatov, Aleksey V.; Timmermann, Bernd; Tolzmann, Marius; Egholm, Michael; Affourtit, Jason; Ashworth, Dana; Attiya, Said; Bachorski, Melissa; Buglione, Eli; Burke, Adam; Caprio, Amanda; Celone, Christopher; Clark, Shauna; Conners, David; Desany, Brian; Gu, Lisa; Guccione, Lorri; Kao, Kalvin; Kebbel, Andrew; Knowlton, Jennifer; Labrecque, Matthew; McDade, Louise; Mealmaker, Craig; Minderman, Melissa; Nawrocki, Anne; Niazi, Faheem; Pareja, Kristen; Ramenani, Ravi; Riches, David; Song, Wanmin; Turcotte, Cynthia; Wang, Shally; Mardis, Elaine R.; Wilson, Richard K.; Dooling, 
David; Fulton, Lucinda; Fulton, Robert; Weinstock, George; Durbin, Richard M.; Burton, John; Carter, David M.; Churcher, Carol; Coffey, Alison; Cox, Anthony; Palotie, Aarno; Quail, Michael; Skelly, Tom; Stalker, James; Swerdlow, Harold P.; Turner, Daniel; De Witte, Anniek; Giles, Shane; Gibbs, Richard A.; Wheeler, David; Bainbridge, Matthew; Challis, Danny; Sabo, Aniko; Yu, Fuli; Yu, Jin; Wang, Jun; Fang, Xiaodong; Guo, Xiaosen; Li, Ruiqiang; Li, Yingrui; Luo, Ruibang; Tai, Shuaishuai; Wu, Honglong; Zheng, Hancheng; Zheng, Xiaole; Zhou, Yan; Li, Guoqing; Wang, Jian; Yang, Huanming; Marth, Gabor T.; Garrison, Erik P.; Huang, Weichun; Indap, Amit; Kural, Deniz; Lee, Wan-Ping; Leong, Wen Fung; Quinlan, Aaron R.; Stewart, Chip; Stromberg, Michael P.; Ward, Alistair N.; Wu, Jiantao; Lee, Charles; Mills, Ryan E.; Shi, Xinghua; Daly, Mark J.; DePristo, Mark A.; Altshuler, David L.; Ball, Aaron D.; Banks, Eric; Bloom, Toby; Browning, Brian L.; Cibulskis, Kristian; Fennell, Tim J.; Garimella, Kiran V.; Grossman, Sharon R.; Handsaker, Robert E.; Hanna, Matt; Hartl, Chris; Jaffe, David B.; Kernytsky, Andrew M.; Korn, Joshua M.; Li, Heng; Maguire, Jared R.; McCarroll, Steven A.; McKenna, Aaron; Nemesh, James C.; Philippakis, Anthony A.; Poplin, Ryan E.; Price, Alkes; Rivas, Manuel A.; Sabeti, Pardis C.; Schaffner, Stephen F.; Shefler, Erica; Shlyakhter, Ilya A.; Cooper, David N.; Ball, Edward V.; Mort, Matthew; Phillips, Andrew D.; Stenson, Peter D.; Sebat, Jonathan; Makarov, Vladimir; Ye, Kenny; Yoon, Seungtai C.; Bustamante, Carlos D.; Clark, Andrew G.; Boyko, Adam; Degenhardt, Jeremiah; Gravel, Simon; Gutenkunst, Ryan N.; Kaganovich, Mark; Keinan, Alon; Lacroute, Phil; Ma, Xin; Reynolds, Andy; Clarke, Laura; Flicek, Paul; Cunningham, Fiona; Herrero, Javier; Keenen, Stephen; Kulesha, Eugene; Leinonen, Rasko; McLaren, William M.; Radhakrishnan, Rajesh; Smith, Richard E.; Zalunin, Vadim; Zheng-Bradley, Xiangqun; Korbel, Jan O.; Stütz, Adrian M.; Humphray, Sean; Bauer, Markus; 
Cheetham, R. Keira; Cox, Tony; Eberle, Michael; James, Terena; Kahn, Scott; Murray, Lisa; Chakravarti, Aravinda; Ye, Kai; De La Vega, Francisco M.; Fu, Yutao; Hyland, Fiona C. L.; Manning, Jonathan M.; McLaughlin, Stephen F.; Peckham, Heather E.; Sakarya, Onur; Sun, Yongming A.; Tsung, Eric F.; Batzer, Mark A.; Konkel, Miriam K.; Walker, Jerilyn A.; Sudbrak, Ralf; Albrecht, Marcus W.; Amstislavskiy, Vyacheslav S.; Herwig, Ralf; Parkhomchuk, Dimitri V.; Sherry, Stephen T.; Agarwala, Richa; Khouri, Hoda M.; Morgulis, Aleksandr O.; Paschall, Justin E.; Phan, Lon D.; Rotmistrovsky, Kirill E.; Sanders, Robert D.; Shumway, Martin F.; Xiao, Chunlin; McVean, Gil A.; Auton, Adam; Iqbal, Zamin; Lunter, Gerton; Marchini, Jonathan L.; Moutsianas, Loukas; Myers, Simon; Tumian, Afidalina; Desany, Brian; Knight, James; Winer, Roger; Craig, David W.; Beckstrom-Sternberg, Steve M.; Christoforides, Alexis; Kurdoglu, Ahmet A.; Pearson, John V.; Sinari, Shripad A.; Tembe, Waibhav D.; Haussler, David; Hinrichs, Angie S.; Katzman, Sol J.; Kern, Andrew; Kuhn, Robert M.; Przeworski, Molly; Hernandez, Ryan D.; Howie, Bryan; Kelley, Joanna L.; Melton, S. Cord; Abecasis, Gonçalo R.; Li, Yun; Anderson, Paul; Blackwell, Tom; Chen, Wei; Cookson, William O.; Ding, Jun; Kang, Hyun Min; Lathrop, Mark; Liang, Liming; Moffatt, Miriam F.; Scheet, Paul; Sidore, Carlo; Snyder, Matthew; Zhan, Xiaowei; Zöllner, Sebastian; Awadalla, Philip; Casals, Ferran; Idaghdour, Youssef; Keebler, John; Stone, Eric A.; Zilversmit, Martine; Jorde, Lynn; Xing, Jinchuan; Eichler, Evan E.; Aksay, Gozde; Alkan, Can; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Kidd, Jeffrey M.; Sahinalp, S. 
Cenk; Sudmant, Peter H.; Mardis, Elaine R.; Chen, Ken; Chinwalla, Asif; Ding, Li; Koboldt, Daniel C.; McLellan, Mike D.; Dooling, David; Weinstock, George; Wallis, John W.; Wendl, Michael C.; Zhang, Qunyuan; Durbin, Richard M.; Albers, Cornelis A.; Ayub, Qasim; Balasubramaniam, Senduran; Barrett, Jeffrey C.; Carter, David M.; Chen, Yuan; Conrad, Donald F.; Danecek, Petr; Dermitzakis, Emmanouil T.; Hu, Min; Huang, Ni; Hurles, Matt E.; Jin, Hanjun; Jostins, Luke; Keane, Thomas M.; Le, Si Quang; Lindsay, Sarah; Long, Quan; MacArthur, Daniel G.; Montgomery, Stephen B.; Parts, Leopold; Stalker, James; Tyler-Smith, Chris; Walter, Klaudia; Zhang, Yujun; Gerstein, Mark B.; Snyder, Michael; Abyzov, Alexej; Balasubramanian, Suganthi; Bjornson, Robert; Du, Jiang; Grubert, Fabian; Habegger, Lukas; Haraksingh, Rajini; Jee, Justin; Khurana, Ekta; Lam, Hugo Y. K.; Leng, Jing; Mu, Xinmeng Jasmine; Urban, Alexander E.; Zhang, Zhengdong; Li, Yingrui; Luo, Ruibang; Marth, Gabor T.; Garrison, Erik P.; Kural, Deniz; Quinlan, Aaron R.; Stewart, Chip; Stromberg, Michael P.; Ward, Alistair N.; Wu, Jiantao; Lee, Charles; Mills, Ryan E.; Shi, Xinghua; McCarroll, Steven A.; Banks, Eric; DePristo, Mark A.; Handsaker, Robert E.; Hartl, Chris; Korn, Joshua M.; Li, Heng; Nemesh, James C.; Sebat, Jonathan; Makarov, Vladimir; Ye, Kenny; Yoon, Seungtai C.; Degenhardt, Jeremiah; Kaganovich, Mark; Clarke, Laura; Smith, Richard E.; Zheng-Bradley, Xiangqun; Korbel, Jan O.; Humphray, Sean; Cheetham, R. 
Keira; Eberle, Michael; Kahn, Scott; Murray, Lisa; Ye, Kai; De La Vega, Francisco M.; Fu, Yutao; Peckham, Heather E.; Sun, Yongming A.; Batzer, Mark A.; Konkel, Miriam K.; Walker, Jerilyn A.; Xiao, Chunlin; Iqbal, Zamin; Desany, Brian; Blackwell, Tom; Snyder, Matthew; Xing, Jinchuan; Eichler, Evan E.; Aksay, Gozde; Alkan, Can; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Kidd, Jeffrey M.; Chen, Ken; Chinwalla, Asif; Ding, Li; McLellan, Mike D.; Wallis, John W.; Hurles, Matt E.; Conrad, Donald F.; Walter, Klaudia; Zhang, Yujun; Gerstein, Mark B.; Snyder, Michael; Abyzov, Alexej; Du, Jiang; Grubert, Fabian; Haraksingh, Rajini; Jee, Justin; Khurana, Ekta; Lam, Hugo Y. K.; Leng, Jing; Mu, Xinmeng Jasmine; Urban, Alexander E.; Zhang, Zhengdong; Gibbs, Richard A.; Bainbridge, Matthew; Challis, Danny; Coafra, Cristian; Dinh, Huyen; Kovar, Christie; Lee, Sandy; Muzny, Donna; Nazareth, Lynne; Reid, Jeff; Sabo, Aniko; Yu, Fuli; Yu, Jin; Marth, Gabor T.; Garrison, Erik P.; Indap, Amit; Leong, Wen Fung; Quinlan, Aaron R.; Stewart, Chip; Ward, Alistair N.; Wu, Jiantao; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Garimella, Kiran V.; Hartl, Chris; Shefler, Erica; Sougnez, Carrie L.; Wilkinson, Jane; Clark, Andrew G.; Gravel, Simon; Grubert, Fabian; Clarke, Laura; Flicek, Paul; Smith, Richard E.; Zheng-Bradley, Xiangqun; Sherry, Stephen T.; Khouri, Hoda M.; Paschall, Justin E.; Shumway, Martin F.; Xiao, Chunlin; McVean, Gil A.; Katzman, Sol J.; Abecasis, Gonçalo R.; Blackwell, Tom; Mardis, Elaine R.; Dooling, David; Fulton, Lucinda; Fulton, Robert; Koboldt, Daniel C.; Durbin, Richard M.; Balasubramaniam, Senduran; Coffey, Allison; Keane, Thomas M.; MacArthur, Daniel G.; Palotie, Aarno; Scott, Carol; Stalker, James; Tyler-Smith, Chris; Gerstein, Mark B.; Balasubramanian, Suganthi; Chakravarti, Aravinda; Knoppers, Bartha M.; Abecasis, Gonçalo R.; Bustamante, Carlos D.; Gharani, Neda; Gibbs, Richard A.; Jorde, Lynn; Kaye, Jane S.; Kent, Alastair; Li, Taosha; McGuire, 
Amy L.; McVean, Gil A.; Ossorio, Pilar N.; Rotimi, Charles N.; Su, Yeyang; Toji, Lorraine H.; TylerSmith, Chris; Brooks, Lisa D.; Felsenfeld, Adam L.; McEwen, Jean E.; Abdallah, Assya; Juenger, Christopher R.; Clemm, Nicholas C.; Collins, Francis S.; Duncanson, Audrey; Green, Eric D.; Guyer, Mark S.; Peterson, Jane L.; Schafer, Alan J.; Abecasis, Gonçalo R.; Altshuler, David L.; Auton, Adam; Brooks, Lisa D.; Durbin, Richard M.; Gibbs, Richard A.; Hurles, Matt E.; McVean, Gil A.
2011-01-01
High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2–4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. PMID:21730125
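For readers unfamiliar with the site frequency spectrum used above, here is a minimal single-population sketch. It assumes derived-allele counts per site are already known and tallies how many segregating sites carry each count. The study itself works with joint spectra across populations and a demographic model, neither of which is reproduced here. The function name and the input values are hypothetical.

```python
def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Tally segregating sites by derived-allele count.

    Returns a list `sfs` where sfs[i - 1] is the number of sites at which
    the derived allele is seen on exactly i of the sampled chromosomes,
    for i = 1 .. n_chromosomes - 1. Fixed or absent alleles are skipped.
    """
    sfs = [0] * (n_chromosomes - 1)
    for count in derived_counts:
        if 0 < count < n_chromosomes:
            sfs[count - 1] += 1
    return sfs


# Hypothetical derived-allele counts at seven sites in four chromosomes.
spectrum = site_frequency_spectrum([1, 1, 2, 0, 4, 3, 1], n_chromosomes=4)
singleton_fraction = spectrum[0] / sum(spectrum)
```

In this toy input the singleton class is the largest, which mirrors in miniature the abstract's observation that most variable sites are rare.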
Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.
Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.
Lamendella, Regina; Li, Kent C.; Oerther, Daniel
2013-01-01
Several swine-specific microbial source tracking methods are based on PCR assays targeting Bacteroidales 16S rRNA gene sequences. The limited application of these assays can be explained by the poor understanding of their molecular diversity in fecal sources and environmental waters. In order to address this, we studied the diversity of 9,340 partial (>600 bp in length) Bacteroidales 16S rRNA gene sequences from 13 fecal sources and nine feces-contaminated watersheds. The compositions of major Bacteroidales populations were analyzed to determine which host and environmental sequences were contributing to each group. This information allowed us to identify populations which were both exclusive to swine fecal sources and detected in swine-contaminated waters. Phylogenetic and diversity analyses revealed that some markers previously believed to be highly specific to swine populations are shared by multiple hosts, potentially explaining the cross-amplification signals obtained with nontargeted hosts. These data suggest that while many Bacteroidales populations are cosmopolitan, others exhibit a preferential host distribution and may be able to survive different environmental conditions. This study further demonstrates the importance of elucidating the diversity patterns of targeted bacterial groups to develop more inclusive fecal source tracking applications. PMID:23160126
USDA-ARS's Scientific Manuscript database
The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the...
Glass, Leslie L; Calero-Nieto, Fernando J; Jawaid, Wajid; Larraufie, Pierre; Kay, Richard G; Göttgens, Berthold; Reimann, Frank; Gribble, Fiona M
2017-10-01
To identify sub-populations of intestinal preproglucagon-expressing (PPG) cells producing Glucagon-like Peptide-1, and their associated expression profiles of sensory receptors, thereby enabling the discovery of therapeutic strategies that target these cell populations for the treatment of diabetes and obesity. We performed single cell RNA sequencing of PPG-cells purified by flow cytometry from the upper small intestine of 3 GLU-Venus mice. Cells from 2 mice were sequenced at low depth, and from the third mouse at high depth. High quality sequencing data from 234 PPG-cells were used to identify clusters by tSNE analysis. qPCR was performed to compare the longitudinal and crypt/villus locations of cluster-specific genes. Immunofluorescence and mass spectrometry were used to confirm protein expression. PPG-cells formed 3 major clusters: a group with typical characteristics of classical L-cells, including high expression of Gcg and Pyy (comprising 51% of all PPG-cells); a cell type overlapping with Gip-expressing K-cells (14%); and a unique cluster expressing Tph1 and Pzp that was predominantly located in proximal small intestine villi and co-produced 5-HT (35%). Expression of G-protein coupled receptors differed between clusters, suggesting the cell types are differentially regulated and would be differentially targetable. Our findings support the emerging concept that many enteroendocrine cell populations are highly overlapping, with individual cells producing a range of peptides previously assigned to distinct cell types. Different receptor expression profiles across the clusters highlight potential drug targets to increase gut hormone secretion for the treatment of diabetes and obesity. Copyright © 2017 The Authors. Published by Elsevier GmbH. All rights reserved.
Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A
2011-01-01
Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
Bybee, Seth M.; Bracken-Grissom, Heather; Haynes, Benjamin D.; Hermansen, Russell A.; Byers, Robert L.; Clement, Mark J.; Udall, Joshua A.; Wilcox, Edward R.; Crandall, Keith A.
2011-01-01
Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach. PMID:22002916
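The step of sorting reads by their 10-bp multiplexing identifier and writing FASTA per sample can be sketched as below. This is not the BarcodeCrucher implementation: it assumes the identifier sits at the start of each read, requires an exact match, and performs no quality control or per-gene sorting. All names, barcodes and reads are hypothetical.

```python
def demultiplex(reads, barcode_to_sample, barcode_len=10):
    """Group reads by sample using an exact-match leading barcode.

    Reads whose first `barcode_len` bases match no known barcode are dropped.
    The barcode itself is trimmed from the stored sequence.
    """
    by_sample = {}
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # unknown identifier; a real tool would log or rescue it
        by_sample.setdefault(sample, []).append(insert)
    return by_sample


def to_fasta(by_sample):
    """Render grouped reads as FASTA text, ordered by sample name."""
    lines = []
    for sample in sorted(by_sample):
        for i, seq in enumerate(by_sample[sample], start=1):
            lines.append(f">{sample}_read{i}")
            lines.append(seq)
    return "\n".join(lines)


# Hypothetical identifiers and reads.
barcodes = {"ACGTACGTAC": "taxonA", "TTGGCCAATT": "taxonB"}
reads = ["ACGTACGTACGGGAAA", "TTGGCCAATTCCCTTT", "NNNNNNNNNNAAAA"]
fasta_text = to_fasta(demultiplex(reads, barcodes))
```

Exact matching keeps the sketch short, but it discards any read with a sequencing error in the identifier; tolerating one mismatch is a common refinement.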
Managing the genomic revolution in cancer diagnostics.
Nguyen, Doreen; Gocke, Christopher D
2017-08-01
Molecular tumor profiling is now a routine part of patient care, revealing targetable genomic alterations and molecularly distinct tumor subtypes with therapeutic and prognostic implications. The widespread adoption of next-generation sequencing technologies has greatly facilitated clinical implementation of genomic data and opened the door for high-throughput multigene-targeted sequencing. Herein, we discuss the variability of cancer genetic profiling currently offered by clinical laboratories, the challenges of applying rapidly evolving medical knowledge to individual patients, and the need for more standardized population-based molecular profiling.
Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment
Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre
2012-01-01
Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378
Feliziani, Sofía; Moyano, Alejandro J.; Di Rienzo, Julio A.; Krogh Johansen, Helle; Molin, Søren; Smania, Andrea M.
2014-01-01
The advent of high-throughput sequencing techniques has made it possible to follow the genomic evolution of pathogenic bacteria by comparing longitudinally collected bacteria sampled from human hosts. Such studies in the context of chronic airway infections by Pseudomonas aeruginosa in cystic fibrosis (CF) patients have indicated high bacterial population diversity. Such diversity may be driven by hypermutability resulting from DNA mismatch repair system (MRS) deficiency, a common trait evolved by P. aeruginosa strains in CF infections. No studies to date have utilized whole-genome sequencing to investigate within-host population diversity or long-term evolution of mutators in CF airways. We sequenced the genomes of 13 and 14 isolates of P. aeruginosa mutator populations from an Argentinian and a Danish CF patient, respectively. Our collection of isolates spanned 6 and 20 years of patient infection history, respectively. We sequenced 11 isolates from a single sample from each patient to allow in-depth analysis of population diversity. Each patient was infected by clonal populations of bacteria that were dominated by mutators. The in vivo mutation rate of the populations was ∼100 SNPs/year, ∼40-fold higher than rates in normo-mutable populations. Comparison of the genomes of 11 isolates from the same sample showed extensive within-patient genomic diversification; the populations were composed of different sub-lineages that had coexisted for many years since the initial colonization of the patient. Analysis of the mutations identified genes that underwent convergent evolution across lineages and sub-lineages, suggesting that the genes were targeted by mutation to optimize pathogenic fitness. Parallel evolution was observed in reduction of overall catabolic capacity of the populations. These findings are useful for understanding the evolution of pathogen populations and identifying new targets for control of chronic infections. PMID:25330091
Ramu, P; Kassahun, B; Senthilvel, S; Ashok Kumar, C; Jayashree, B; Folkertsma, R T; Reddy, L Ananda; Kuruvinashetti, M S; Haussmann, B I G; Hash, C T
2009-11-01
The sequencing and detailed comparative functional analysis of genomes of a number of select botanical models open new doors into comparative genomics among the angiosperms, with potential benefits for improvement of many orphan crops that feed large populations. In this study, a set of simple sequence repeat (SSR) markers was developed by mining the expressed sequence tag (EST) database of sorghum. Among the SSR-containing sequences, only those sharing considerable homology with rice genomic sequences across the lengths of the 12 rice chromosomes were selected. Thus, 600 SSR-containing sorghum EST sequences (50 homologous sequences on each of the 12 rice chromosomes) were selected, with the intention of providing coverage for corresponding homologous regions of the sorghum genome. Primer pairs were designed and polymorphism detection ability was assessed using parental pairs of two existing sorghum mapping populations. About 28% of these new markers detected polymorphism in this 4-entry panel. A subset of 55 polymorphic EST-derived SSR markers were mapped onto the existing skeleton map of a recombinant inbred population derived from cross N13 x E 36-1, which is segregating for Striga resistance and the stay-green component of terminal drought tolerance. These new EST-derived SSR markers mapped across all 10 sorghum linkage groups, mostly to regions expected based on prior knowledge of rice-sorghum synteny. The ESTs from which these markers were derived were then mapped in silico onto the aligned sorghum genome sequence, and 88% of the best hits corresponded to linkage-based positions. This study demonstrates the utility of comparative genomic information in targeted development of markers to fill gaps in linkage maps of related crop species for which sufficient genomic tools are not available.
Coutinho, Alexandra; Valverde, Guido; Fehren-Schmitz, Lars; Cooper, Alan; Barreto Romero, Maria Inés; Espinoza, Isabel Flores; Llamas, Bastien; Haak, Wolfgang
2014-01-01
Phylogeographic studies have described a reduced genetic diversity in Native American populations, indicative of one or more bottleneck events during the peopling and prehistory of the Americas. Classical sequencing approaches targeting the mitochondrial diversity have reported the presence of five major haplogroups, namely A, B, C, D and X, whereas the advent of complete mitochondrial genome sequencing has recently refined the number of founder lineages within the given diversity to 15 sub-haplogroups. We developed and optimized a SNaPshot assay to study the mitochondrial diversity in pre-Columbian Native American populations by simultaneous typing of 26 single nucleotide polymorphisms (SNPs) characterising Native American sub-haplogroups. Our assay proved to be highly sensitive with respect to starting concentrations of target DNA and could be applied successfully to a range of ancient human skeletal material from South America from various time periods. The AmericaPlex26 is a powerful assay with enhanced phylogenetic resolution that allows time- and cost-efficient mitochondrial DNA sub-typing from valuable ancient specimens. It can be applied in addition or alternative to standard sequencing of the D-loop region in forensics, ancestry testing, and population studies, or where full-resolution mitochondrial genome sequencing is not feasible. PMID:24671218
CRISPR-based screening of genomic island excision events in bacteria.
Selle, Kurt; Klaenhammer, Todd R; Barrangou, Rodolphe
2015-06-30
Genomic analysis of Streptococcus thermophilus revealed that mobile genetic elements (MGEs) likely contributed to gene acquisition and loss during evolutionary adaptation to milk. Clustered regularly interspaced short palindromic repeats-CRISPR-associated genes (CRISPR-Cas), the adaptive immune system in bacteria, limits genetic diversity by targeting MGEs including bacteriophages, transposons, and plasmids. CRISPR-Cas systems are widespread in streptococci, suggesting that the interplay between CRISPR-Cas systems and MGEs is one of the driving forces governing genome homeostasis in this genus. To investigate the genetic outcomes resulting from CRISPR-Cas targeting of integrated MGEs, in silico prediction revealed four genomic islands without essential genes in lengths from 8 to 102 kbp, totaling 7% of the genome. In this study, the endogenous CRISPR3 type II system was programmed to target the four islands independently through plasmid-based expression of engineered CRISPR arrays. Targeting lacZ within the largest 102-kbp genomic island was lethal to wild-type cells and resulted in a reduction of up to 2.5-log in the surviving population. Genotyping of Lac(-) survivors revealed variable deletion events between the flanking insertion-sequence elements, all resulting in elimination of the Lac-encoding island. Chimeric insertion sequence footprints were observed at the deletion junctions after targeting all of the four genomic islands, suggesting a common mechanism of deletion via recombination between flanking insertion sequences. These results established that self-targeting CRISPR-Cas systems may direct significant evolution of bacterial genomes on a population level, influencing genome homeostasis and remodeling.
Targeted enrichment strategies for next-generation plant biology
Richard Cronn; Brian J. Knaus; Aaron Liston; Peter J. Maughan; Matthew Parks; John V. Syring; Joshua Udall
2012-01-01
The dramatic advances offered by modern DNA sequencers continue to redefine the limits of what can be accomplished in comparative plant biology. Even with recent achievements, however, plant genomes present obstacles that can make it difficult to execute large-scale population and phylogenetic studies on next-generation sequencing platforms. Factors like large genome...
Sivadas, A; Salleh, M Z; Teh, L K; Scaria, V
2017-10-01
Expanding the scope of pharmacogenomic research by including multiple global populations is integral to building robust evidence for its clinical translation. Deep whole-genome sequencing of diverse ethnic populations provides a unique opportunity to study rare and common pharmacogenomic markers that often vary in frequency across populations. In this study, we aim to build a diverse map of pharmacogenetic variants in South East Asian (SEA) Malay population using deep whole-genome sequences of 100 healthy SEA Malay individuals. We investigated the allelic diversity of potentially deleterious pharmacogenomic variants in SEA Malay population. Our analysis revealed 227 common and 466 rare potentially functional single nucleotide variants (SNVs) in 437 pharmacogenomic genes involved in drug metabolism, transport and target genes, including 74 novel variants. This study has created one of the most comprehensive maps of pharmacogenetic markers in any population from whole genomes and will hugely benefit pharmacogenomic investigations and drug dosage recommendations in SEA Malays.
Han, Heping; Yu, Qin; Owen, Mechelle J; Cawthray, Gregory R; Powles, Stephen B
2016-02-01
Lolium rigidum populations in Australia and globally have demonstrated rapid and widespread evolution of resistance to acetyl coenzyme A carboxylase (ACCase)-inhibiting and acetolactate synthase (ALS)-inhibiting herbicides. Thirty-three resistant L. rigidum populations, randomly collected from crop fields in a most recent resistance survey, were analysed for non-target-site diclofop metabolism and all known target-site ACCase gene resistance-endowing mutations. The HPLC profile of [(14) C]-diclofop-methyl in vivo metabolism revealed that 79% of these resistant L. rigidum populations showed enhanced capacity for diclofop acid metabolism (metabolic resistance). ACCase gene sequencing identified that 91% of the populations contain plants with ACCase resistance mutation(s). Importantly, 70% of the populations exhibit both non-target-site metabolic resistance and target-site ACCase mutations. This work demonstrates that metabolic herbicide resistance is commonly occurring in L. rigidum, and coevolution of both metabolic resistance and target-site resistance is an evolutionary reality. Metabolic herbicide resistance can potentially endow resistance to many herbicides and poses a threat to herbicide sustainability and thus crop production, calling for major research and management efforts. © 2015 Society of Chemical Industry.
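The three percentages reported above can be cross-checked with inclusion-exclusion. The sketch below assumes that each percentage is a rounded share of the 33 surveyed populations; the function name and the rounding step are illustrative assumptions and do not come from the study.

```python
def populations_with_any_resistance(total, pct_metabolic, pct_target_site, pct_both):
    """Inclusion-exclusion count of populations showing at least one mechanism."""
    # Convert the reported percentages back to whole-population counts
    # (assumption: the abstract rounded counts out of `total` to whole percents).
    metabolic = round(total * pct_metabolic / 100)
    target_site = round(total * pct_target_site / 100)
    both = round(total * pct_both / 100)
    # |A or B| = |A| + |B| - |A and B|
    return metabolic + target_site - both


count = populations_with_any_resistance(33, 79, 91, 70)
print(count)
```

Under that rounding assumption the count comes out at 33, which would mean every surveyed population carries at least one of the two resistance mechanisms.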
Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M.; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A.; Gilks, C. Blake; Huntsman, David G.; McAlpine, Jessica N.; Aparicio, Samuel
2014-01-01
The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. PMID:25060187
Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K
2016-01-01
Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.
Nóbrega de Sousa, Taís; Carvalho, Luzia Helena; Alves de Brito, Cristiana Ferreira
2011-01-01
The dependence of Plasmodium vivax on invasion mediated by Duffy binding protein (DBP) makes this protein a prime candidate for development of a vaccine. However, the development of a DBP-based vaccine might be hampered by the high variability of the protein ligand (DBP(II)), known to bias the immune response toward a specific DBP variant. Here, the hypothesis being investigated is that the analysis of the worldwide DBP(II) sequences will allow us to determine the minimum number of haplotypes (MNH) to be included in a DBP-based vaccine of broad coverage. For that, all DBP(II) sequences available were compiled and MNH was based on the most frequent nonsynonymous single nucleotide polymorphisms, the majority mapped on B and T cell epitopes. A preliminary analysis of DBP(II) genetic diversity from eight malaria-endemic countries estimated that a number between two to six DBP haplotypes (17 in total) would target at least 50% of parasite population circulating in each endemic region. Aiming to avoid region-specific haplotypes, we next analyzed the MNH that broadly cover worldwide parasite population. The results demonstrated that seven haplotypes would be required to cover around 60% of DBP(II) sequences available. Trying to validate these selected haplotypes per country, we found that five out of the eight countries will be covered by the MNH (67% of parasite populations, range 48-84%). In addition, to identify related subgroups of DBP(II) sequences we used a Bayesian clustering algorithm. The algorithm grouped all DBP(II) sequences in six populations that were independent of geographic origin, with ancestral populations present in different proportions in each country. In conclusion, in this first attempt to undertake a global analysis about DBP(II) variability, the results suggest that the development of DBP-based vaccine should consider multi-haplotype strategies; otherwise a putative P. vivax vaccine may not target some parasite populations.
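The idea of a minimum number of haplotypes reaching a coverage target can be illustrated with a simple frequency-ranked selection. This is only a sketch under stated assumptions: the abstract does not describe the authors' selection procedure in this detail, the function `minimum_haplotypes` is hypothetical, and the haplotype labels are toy values rather than real DBP(II) data.

```python
from collections import Counter


def minimum_haplotypes(sequences, target_coverage):
    """Add haplotypes in order of frequency until target_coverage is reached."""
    if not sequences:
        return [], 0.0
    counts = Counter(sequences)
    total = len(sequences)
    chosen, covered = [], 0
    for haplotype, n in counts.most_common():
        if covered / total >= target_coverage:
            break
        chosen.append(haplotype)
        covered += n
    return chosen, covered / total


# Hypothetical haplotype labels, one per sampled sequence (not real data).
toy = ["H1"] * 5 + ["H2"] * 3 + ["H3"] * 1 + ["H4"] * 1
chosen, coverage = minimum_haplotypes(toy, 0.6)
print(chosen, coverage)
```

With the toy input, two haplotypes are enough to pass the 60% target. Running the same function per country and on the pooled data would mirror the regional versus worldwide comparison the abstract describes.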
Bletz, Stefan; Janezic, Sandra; Harmsen, Dag; Rupnik, Maja; Mellmann, Alexander
2018-06-01
Clostridium difficile, recently renamed Clostridioides difficile, is the most common cause of antibiotic-associated nosocomial gastrointestinal infections worldwide. To differentiate endogenous infections and transmission events, highly discriminatory subtyping is necessary. Today, methods based on whole-genome sequencing data are increasingly used to subtype bacterial pathogens; however, frequently a standardized methodology and typing nomenclature are missing. Here we report a core genome multilocus sequence typing (cgMLST) approach developed for C. difficile. Initially, we determined the breadth of the C. difficile population based on all available MLST sequence types with Bayesian inference (BAPS). The resulting BAPS partitions were used in combination with C. difficile clade information to select representative isolates that were subsequently used to define cgMLST target genes. Finally, we evaluated the novel cgMLST scheme with genomes from 3,025 isolates. BAPS grouping (n = 6 groups) together with the clade information led to a total of 11 representative isolates that were included for cgMLST definition and resulted in 2,270 cgMLST genes that were present in all isolates. Overall, 2,184 to 2,268 cgMLST targets were detected in the genome sequences of 70 outbreak-associated and reference strains, and on average 99.3% cgMLST targets (1,116 to 2,270 targets) were present in 2,954 genomes downloaded from the NCBI database, underlining the representativeness of the cgMLST scheme. Moreover, reanalyses of different cluster scenarios with cgMLST were concordant with published single nucleotide variant analyses. In conclusion, the novel cgMLST is representative for the whole C. difficile population, is highly discriminatory in outbreak situations, and provides a unique nomenclature facilitating interlaboratory exchange. Copyright © 2018 American Society for Microbiology.
Exome-wide DNA capture and next generation sequencing in domestic and wild species.
Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon
2011-07-05
Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.
USDA-ARS's Scientific Manuscript database
Sorghum population for Targeting Induced Local Lesion IN Genome (TILLING) was generated from BTx623 in 2005 and publicly available in 2009. After releasing to the public, this population was intensively screened by morphological observation in the field and a number of mutants with useful traits wer...
NASA Technical Reports Server (NTRS)
El Fantroussi, Said; Urakawa, Hidetoshi; Bernhard, Anne E.; Kelly, John J.; Noble, Peter A.; Smidt, H.; Yershov, G. M.; Stahl, David A.
2003-01-01
Oligonucleotide microarrays were used to profile directly extracted rRNA from environmental microbial populations without PCR amplification. In our initial inspection of two distinct estuarine study sites, the hybridization patterns were reproducible and varied between estuarine sediments of differing salinities. The determination of a thermal dissociation curve (i.e., melting profile) for each probe-target duplex provided information on hybridization specificity, which is essential for confirming adequate discrimination between target and nontarget sequences.
Differential effects of RNAi treatments on field populations of the western corn rootworm.
Chu, Chia-Ching; Sun, Weilin; Spencer, Joseph L; Pittendrigh, Barry R; Seufferheld, Manfredo J
2014-03-01
RNA interference (RNAi) mediated crop protection against insect pests is a technology that is greatly anticipated by the academic and industrial pest control communities. Prior to commercialization, factors influencing the potential for evolution of insect resistance to RNAi should be evaluated. While mutations in genes encoding the RNAi machinery or the sequences targeted for interference may serve as a prominent mechanism of resistance evolution, differential effects of RNAi on target pests may also facilitate such evolution. However, to date, little is known about how variation of field insect populations could influence the effectiveness of RNAi treatments. To approach this question, we evaluated the effects of RNAi treatments on adults of three western corn rootworm (WCR; Diabrotica virgifera virgifera LeConte) populations exhibiting different levels of gut cysteine protease activity, tolerance of soybean herbivory, and immune gene expression; two populations were collected from crop rotation-resistant (RR) problem areas and one from a location where RR was not observed (wild type; WT). Our results demonstrated that RNAi targeting DvRS5 (a highly expressed cysteine protease gene) reduced gut cysteine protease activity in all three WCR populations. However, the proportion of the cysteine protease activity that was inhibited varied across populations. When WCR adults were treated with double-stranded RNA of an immune gene att1, different changes in survival among WT and RR populations on soybean diets occurred. Notably, for both genes, the sequences targeted for RNAi were the same across all populations examined. These findings indicate that the effectiveness of RNAi treatments could vary among field populations depending on their physiological and genetic backgrounds and that the consistency of an RNAi trait's effectiveness on phenotypically different populations should be considered or tested prior to wide deployment. 
Also, genes that are potentially subjected to differential selection in the field should be avoided for RNAi-based pest control. Published by Elsevier Inc.
Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David
2018-04-11
Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. 
It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.
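The notion of a tunable SNP island, a short region holding a minimum number of SNPs, can be sketched as a window scan over sorted SNP coordinates. This is not the published TIA implementation, whose filters are not given in the abstract; the function `find_snp_islands`, its parameters `max_length` and `min_snps`, and the coordinates are all hypothetical.

```python
def find_snp_islands(positions, max_length, min_snps):
    """Return non-overlapping (start, end, snp_count) regions spanning at most
    max_length bp that contain at least min_snps SNP positions."""
    positions = sorted(positions)
    islands = []
    i = 0
    while i < len(positions):
        j = i
        # Extend the window while the next SNP still fits within max_length bp.
        while j + 1 < len(positions) and positions[j + 1] - positions[i] < max_length:
            j += 1
        count = j - i + 1
        if count >= min_snps:
            islands.append((positions[i], positions[j], count))
            i = j + 1  # resume after this island so islands do not overlap
        else:
            i += 1
    return islands


# Hypothetical SNP coordinates on one chromosome (not real data).
toy_positions = [100, 120, 150, 900, 2000, 2040, 2075, 2090]
islands = find_snp_islands(toy_positions, max_length=100, min_snps=3)
print(islands)
```

Raising `min_snps` or lowering `max_length` makes the scan stricter, which loosely corresponds to the user-tunable island characteristics the abstract mentions; a real tool would also need to filter on population frequency and flanking-region variability.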
DOE Office of Scientific and Technical Information (OSTI.GOV)
Banfield, Jillian; Breitbart, Mya; VerBerkmoes, Nathan
CRISPRs (clustered regularly interspaced short palindromic repeats) are adaptive immune systems in Bacteria and Archaea. Transcripts of the spacers that separate the repeats confer immunity through sequence identity with a targeted region (proto-spacer) in phage/viral, plasmid, or other foreign DNA. Short sequences immediately flanking the proto-spacer (proto-spacer adjacent motifs, PAMs) are important in both procuring spacers from and providing immunity to targeted sequences. New spacers are incorporated unidirectionally at the leader end of the CRISPR loci, thus recording a timeline of recent viral exposure. In the early phase of our research, we documented extremely rapid diversification of the CRISPR loci in natural populations [Tyson and Banfield, 2008] matched by high levels of sequence variation in natural viral populations [Andersson and Banfield, 2008]. Since then, in a genetically tractable model laboratory system, we have 1) tracked phage mutation and CRISPR diversification, and in a natural model system, we have 2) examined population history via over time, 3) investigated the timescale over which spacers become ineffective and the process by which ineffective spacers are removed, and 4) analyzed viral diversity. In addition to research activities, our group has organized five international CRISPR meetings, the fifth to be held at University of California, Berkeley in June 2012. Most importantly, the project provided the majority of funding support for Christine Sun (Ph.D. 2012).
Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...
2007-11-23
Motivation: Identifying protein enzymatic or pharmacological activities are important areas of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.
Recent research on the high-probability instructional sequence: A brief review.
Lipschultz, Joshua; Wilder, David A
2017-04-01
The high-probability (high-p) instructional sequence consists of the delivery of a series of high-probability instructions immediately before delivery of a low-probability or target instruction. It is commonly used to increase compliance in a variety of populations. Recent research has described variations of the high-p instructional sequence and examined the conditions under which the sequence is most effective. This manuscript reviews the most recent research on the sequence and identifies directions for future research. Recommendations for practitioners regarding the use of the high-p instructional sequence are also provided. © 2017 Society for the Experimental Analysis of Behavior.
Ha, Gavin; Roth, Andrew; Khattra, Jaswinder; Ho, Julie; Yap, Damian; Prentice, Leah M; Melnyk, Nataliya; McPherson, Andrew; Bashashati, Ali; Laks, Emma; Biele, Justina; Ding, Jiarui; Le, Alan; Rosner, Jamie; Shumansky, Karey; Marra, Marco A; Gilks, C Blake; Huntsman, David G; McAlpine, Jessica N; Aparicio, Samuel; Shah, Sohrab P
2014-11-01
The evolution of cancer genomes within a single tumor creates mixed cell populations with divergent somatic mutational landscapes. Inference of tumor subpopulations has been disproportionately focused on the assessment of somatic point mutations, whereas computational methods targeting evolutionary dynamics of copy number alterations (CNA) and loss of heterozygosity (LOH) in whole-genome sequencing data remain underdeveloped. We present a novel probabilistic model, TITAN, to infer CNA and LOH events while accounting for mixtures of cell populations, thereby estimating the proportion of cells harboring each event. We evaluate TITAN on idealized mixtures, simulating clonal populations from whole-genome sequences taken from genomically heterogeneous ovarian tumor sites collected from the same patient. In addition, we show in 23 whole genomes of breast tumors that the inference of CNA and LOH using TITAN critically informs population structure and the nature of the evolving cancer genome. Finally, we experimentally validated subclonal predictions using fluorescence in situ hybridization (FISH) and single-cell sequencing from an ovarian cancer patient sample, thereby recapitulating the key modeling assumptions of TITAN. © 2014 Ha et al.; Published by Cold Spring Harbor Laboratory Press.
Diversity and function in microbial mats from the Lucky Strike hydrothermal vent field.
Crépeau, Valentin; Cambon Bonavita, Marie-Anne; Lesongeur, Françoise; Randrianalivelo, Henintsoa; Sarradin, Pierre-Marie; Sarrazin, Jozée; Godfroy, Anne
2011-06-01
Diversity and function in microbial mats from the Lucky Strike hydrothermal vent field (Mid-Atlantic Ridge) were investigated using molecular approaches. DNA and RNA were extracted from mat samples overlaying hydrothermal deposits and Bathymodiolus azoricus mussel assemblages. We constructed and analyzed libraries of 16S rRNA gene sequences and sequences of functional genes involved in autotrophic carbon fixation [forms I and II RuBisCO (cbbL/M), ATP-citrate lyase B (aclB)]; methane oxidation [particulate methane monooxygenase (pmoA)] and sulfur oxidation [adenosine-5'-phosphosulfate reductase (aprA) and soxB]. To gain new insights into the relationships between mats and mussels, we also used new domain-specific 16S rRNA gene primers targeting Bathymodiolus sp. symbionts. All identified archaeal sequences were affiliated with a single group: the marine group 1 Thaumarchaeota. In contrast, analyses of bacterial sequences revealed much higher diversity, although two phyla Proteobacteria and Bacteroidetes were largely dominant. The 16S rRNA gene sequence library revealed that species affiliated to Beggiatoa Gammaproteobacteria were the dominant active population. Analyses of DNA and RNA functional gene libraries revealed a diverse and active chemolithoautotrophic population. Most of these sequences were affiliated with Gammaproteobacteria, including hydrothermal fauna symbionts, Thiotrichales and Methylococcales. PCR and reverse transcription-PCR using 16S rRNA gene primers targeted to Bathymodiolus sp. symbionts revealed sequences affiliated with both methanotrophic and thiotrophic endosymbionts. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Metatranscriptomics of N2-fixing cyanobacteria in the Amazon River plume
Hilton, Jason A; Satinsky, Brandon M; Doherty, Mary; Zielinski, Brian; Zehr, Jonathan P
2015-01-01
Biological N2 fixation is an important nitrogen source for surface ocean microbial communities. However, nearly all information on the diversity and gene expression of organisms responsible for oceanic N2 fixation in the environment has come from targeted approaches that assay only a small number of genes and organisms. Using genomes of diazotrophic cyanobacteria to extract reads from extensive meta-genomic and -transcriptomic libraries, we examined diazotroph diversity and gene expression from the Amazon River plume, an area characterized by salinity and nutrient gradients. Diazotroph genome and transcript sequences were most abundant in the transitional waters compared with lower salinity or oceanic water masses. We were able to distinguish two genetically divergent phylotypes within the Hemiaulus-associated Richelia sequences, which were the most abundant diazotroph sequences in the data set. Photosystem (PS)-II transcripts in Richelia populations were much less abundant than those in Trichodesmium, and transcripts from several Richelia PS-II genes were absent, indicating a prominent role for cyclic electron transport in Richelia. In addition, there were several abundant regulatory transcripts, including one that targets a gene involved in PS-I cyclic electron transport in Richelia. High sequence coverage of the Richelia transcripts, as well as those from Trichodesmium populations, allowed us to identify expressed regions of the genomes that had been overlooked by genome annotations. High-coverage genomic and transcription analysis enabled the characterization of distinct phylotypes within diazotrophic populations, revealed a distinction in a core process between dominant populations and provided evidence for a prominent role for noncoding RNAs in microbial communities. PMID:25514535
Martins, Ademir Jesus; Lins, Rachel Mazzei Moura de Andrade; Linss, Jutta Gerlinde Birgitt; Peixoto, Alexandre Afranio; Valle, Denise
2009-07-01
The nature of pyrethroid resistance in Aedes aegypti Brazilian populations was investigated. Quantification of enzymes related to metabolic resistance in two distinct populations, located in the Northeast and Southeast regions, revealed increases in Glutathione-S-transferase (GST) and Esterase levels. Additionally, polymorphism was found in the IIS6 region of Ae. aegypti voltage-gated sodium channel (AaNa(V)), the pyrethroid target site. Sequences were classified in two haplotype groups, A and B, according to the size of the intron in that region. Rockefeller, a susceptible control lineage, contains only B sequences. In field populations, some A sequences present a substitution in the 1011 site (Ile/Met). When resistant and susceptible individuals were compared, the frequencies of both A (with the Met mutation) and B sequences were slightly increased in resistant specimens. The involvement of the AaNa(V) polymorphism in pyrethroid resistance and the metabolic mechanisms that lead to potential cross-resistance between organophosphate and pyrethroids are discussed.
Feng, Hui; Gupta, Bhavna; Wang, Meilian; Zheng, Wenqi; Zheng, Li; Zhu, Xiaotong; Yang, Yimei; Fang, Qiang; Luo, Enjie; Fan, Qi; Tsuboi, Takafumi; Cao, Yaming; Cui, Liwang
2015-12-01
The male gamete fertilization factor P48/45 in malaria parasites is a prime transmission-blocking vaccine (TBV) candidate. Efforts to develop antimalarial vaccines are often thwarted by genetic diversity of the target antigens. Here we evaluated the genetic diversity of the Pvs48/45 gene in global Plasmodium vivax populations. We determined 200 Pvs48/45 sequences collected from temperate and subtropical parasite populations in China. Population genetic and evolutionary analyses were performed to determine the levels of genetic diversity, potential signature of selection, and population differentiation. Analysis of the Pvs48/45 sequences from 200 P. vivax parasites collected in a temperate and a tropical region revealed a low level of genetic diversity (π = 0.0012) with 14 single nucleotide polymorphisms, of which 11 were nonsynonymous. Analysis of 344 Pvs48/45 sequences from nine worldwide P. vivax populations detected a total of 38 haplotypes, of which 13 haplotypes were present only once. Multiple tests for selection confirmed a signature of positive selection on Pvs48/45 with selection skewed to the second cysteine domain. Haplotype network analysis and Wright's fixation index showed large geographical differentiation with the presence of continent- or region-specific mutations in this gene. Pvs48/45 displays low levels of genetic diversity with the presence of region-specific mutations. Some of the mutations may be potential epitope targets based on their positions in the predicted structure, highlighting the need for future evaluation of these mutations in designing Pvs48/45-based TBV.
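The nucleotide diversity statistic (π) reported in the abstract above is conventionally the average number of pairwise differences per site among aligned sequences. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `nucleotide_diversity` and the toy alignment are illustrative assumptions, and gaps or ambiguous bases are not handled.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences.

    Hypothetical helper for illustration; assumes equal-length,
    gap-free sequences.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    total_diff = 0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count mismatching sites for this pair of sequences.
        total_diff += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diff / (n_pairs * length)


# Toy alignment (made-up data, not Pvs48/45 sequences).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGAAC"]
pi = nucleotide_diversity(toy)
print(round(pi, 4))
```

On real data a population-genetics package would normally be used; the sketch is only meant to make the definition concrete.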
A CRISPR-Cas9 sex-ratio distortion system for genetic control
Galizi, Roberto; Hammond, Andrew; Kyrou, Kyros; Taxiarchi, Chrysanthi; Bernardini, Federica; O’Loughlin, Samantha M.; Papathanos, Philippos-Aris; Nolan, Tony; Windbichler, Nikolai; Crisanti, Andrea
2016-01-01
Genetic control aims to reduce the ability of insect pest populations to cause harm via the release of modified insects. One strategy is to bias the reproductive sex ratio towards males so that a population decreases in size or is eliminated altogether due to a lack of females. We have shown previously that sex ratio distortion can be generated synthetically in the main human malaria vector Anopheles gambiae, by selectively destroying the X-chromosome during spermatogenesis, through the activity of a naturally-occurring endonuclease that targets a repetitive rDNA sequence highly-conserved in a wide range of organisms. Here we describe a CRISPR-Cas9 sex distortion system that targets ribosomal sequences restricted to the member species of the Anopheles gambiae complex. Expression of Cas9 during spermatogenesis resulted in RNA-guided shredding of the X-chromosome during male meiosis and produced extreme male bias among progeny in the absence of any significant reduction in fertility. The flexibility of CRISPR-Cas9 combined with the availability of genomic data for a range of insects renders this strategy broadly applicable for the species-specific control of any pest or vector species with an XY sex-determination system by targeting sequences exclusive to the female sex chromosome. PMID:27484623
Phage-mediated Delivery of Targeted sRNA Constructs to Knock Down Gene Expression in E. coli.
Bernheim, Aude G; Libis, Vincent K; Lindner, Ariel B; Wintermute, Edwin H
2016-03-20
RNA-mediated knockdowns are widely used to control gene expression. This versatile family of techniques makes use of short RNA (sRNA) that can be synthesized with any sequence and designed to complement any gene targeted for silencing. Because sRNA constructs can be introduced to many cell types directly or using a variety of vectors, gene expression can be repressed in living cells without laborious genetic modification. The most common RNA knockdown technology, RNA interference (RNAi), makes use of the endogenous RNA-induced silencing complex (RISC) to mediate sequence recognition and cleavage of the target mRNA. Applications of this technique are therefore limited to RISC-expressing organisms, primarily eukaryotes. Recently, a new generation of RNA biotechnologists have developed alternative mechanisms for controlling gene expression through RNA, and so made possible RNA-mediated gene knockdowns in bacteria. Here we describe a method for silencing gene expression in E. coli that functionally resembles RNAi. In this system a synthetic phagemid is designed to express sRNA, which may be designed to target any sequence. The expression construct is delivered to a population of E. coli cells with non-lytic M13 phage, after which it is able to stably replicate as a plasmid. Antisense recognition and silencing of the target mRNA is mediated by the Hfq protein, endogenous to E. coli. This protocol includes methods for designing the antisense sRNA, constructing the phagemid vector, packaging the phagemid into M13 bacteriophage, preparing a live cell population for infection, and performing the infection itself. The fluorescent protein mKate2 and the antibiotic resistance gene chloramphenicol acetyltransferase (CAT) are targeted to generate representative data and to quantify knockdown effectiveness.
On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution
NASA Astrophysics Data System (ADS)
Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein
2018-07-01
One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.
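To make the idea of "mutations with frequencies above noise" concrete, the sketch below tests an observed variant read count against a single negative binomial background. This is a simplification under stated assumptions: the abstract describes aggregating negative binomial distributions, whereas this example uses one distribution, and the parameters `r`, `p`, and the cutoff `alpha` are hypothetical placeholders rather than values from the paper.

```python
import math


def nb_pmf(k, r, p):
    """P(X = k) for a negative binomial with size r > 0 and probability 0 < p < 1."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def nb_upper_tail(k, r, p):
    """P(X >= k): chance that background noise alone yields k or more error reads."""
    return max(0.0, 1.0 - sum(nb_pmf(i, r, p) for i in range(k)))


def exceeds_noise(variant_reads, r, p, alpha=1e-3):
    """Flag a site whose variant read count is unlikely under the noise model.

    r, p, and alpha are illustrative assumptions; in practice they would be
    estimated from control data.
    """
    return nb_upper_tail(variant_reads, r, p) < alpha


# Toy background with mean r * (1 - p) / p = 2 error reads per site.
print(exceeds_noise(3, r=2.0, p=0.5))
print(exceeds_noise(40, r=2.0, p=0.5))
```

A production analysis would also correct for the number of sites tested; that step is omitted here.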
On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution
NASA Astrophysics Data System (ADS)
Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein
2017-12-01
One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.
Conservation genetics of the fisher (Martes pennanti) based on mitochondrial DNA sequencing
R.E. Drew; J.G. Hallett; K.B. Aubry; K.W. Cullings; S.M Koepf; W.J. Zielinski
2003-01-01
Translocation of animals to re-establish extirpated populations or to maintain declining ones has often been carried out without genetic information on source or target populations, or adequate consideration of the potential...
Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya
2015-08-01
Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been absolutely quantitated. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Identifying currents in the gene pool for bacterial populations using an integrative approach.
Tang, Jing; Hanage, William P; Fraser, Christophe; Corander, Jukka
2009-08-01
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.
Illuminator, a desktop program for mutation detection using short-read clonal sequencing.
Carr, Ian M; Morgan, Joanne E; Diggle, Christine P; Sheridan, Eamonn; Markham, Alexander F; Logan, Clare V; Inglehearn, Chris F; Taylor, Graham R; Bonthron, David T
2011-10-01
Current methods for sequencing clonal populations of DNA molecules yield several gigabases of data per day, typically comprising reads of < 100 nt. Such datasets permit widespread genome resequencing and transcriptome analysis or other quantitative tasks. However, this huge capacity can also be harnessed for the resequencing of smaller (gene-sized) target regions, through the simultaneous parallel analysis of multiple subjects, using sample "tagging" or "indexing". These methods promise to have a huge impact on diagnostic mutation analysis and candidate gene testing. Here we describe a software package developed for such studies, offering the ability to resolve pooled samples carrying barcode tags and to align reads to a reference sequence using a mutation-tolerant process. The program, Illuminator, can identify rare sequence variants, including insertions and deletions, and permits interactive data analysis on standard desktop computers. It facilitates the effective analysis of targeted clonal sequencer data without dedicated computational infrastructure or specialized training. Copyright © 2011 Elsevier Inc. All rights reserved.
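Resolving pooled samples by barcode tag, as described above, can be illustrated with a small demultiplexing sketch. It is not Illuminator's actual algorithm, which the abstract does not specify: the function `demultiplex`, the mismatch tolerance, the tie-handling rule, and the toy barcodes are all assumptions made for illustration.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Assign each read to the sample whose barcode best matches its 5' prefix.

    Hypothetical helper: reads that match no barcode, or tie between two
    barcodes, go to the 'unassigned' bin. The tag is trimmed from assigned reads.
    """
    bins = defaultdict(list)
    for read in reads:
        hits = []
        for sample, tag in barcodes.items():
            prefix = read[:len(tag)]
            if len(prefix) == len(tag):
                d = hamming(prefix, tag)
                if d <= max_mismatches:
                    hits.append((d, sample, len(tag)))
        hits.sort()
        # Treat a tie on distance between the two best hits as ambiguous.
        if not hits or (len(hits) > 1 and hits[0][0] == hits[1][0]):
            bins["unassigned"].append(read)
        else:
            _, sample, tag_len = hits[0]
            bins[sample].append(read[tag_len:])
    return dict(bins)


# Toy barcodes and reads (made-up data).
barcodes = {"s1": "ACGT", "s2": "TGCA"}
reads = ["ACGTGGGG", "TGCATTTT", "ACGAGGCC", "CCCCAAAA"]
result = demultiplex(reads, barcodes)
print(result)
```

Allowing one mismatch trades a little specificity for tolerance to sequencing errors in the tag; barcode sets are usually designed so that tags differ at enough positions to keep this safe.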
Wei, Lin; Wu, Xian-Jin
2012-01-01
Houttuynia cordata is an important traditional Chinese herb with unresolved genetics and taxonomy, which lead to potential problems in the conservation and utilization of the resource. Inter-simple sequence repeat (ISSR) markers were used to assess the level and distribution of genetic diversity in 226 individuals from 15 populations of H. cordata in China. ISSR analysis revealed low genetic variations within populations but high genetic differentiations among populations. This genetic structure probably mainly reflects the historical association among populations. Genetic cluster analysis showed that the basal clade is composed of populations from Southwest China, and the other populations have continuous and eastward distributions. The structure of genetic diversity in H. cordata demonstrated that this species might have survived in Southwest China during the glacial age, and subsequently experienced an eastern postglacial expansion. Based on the results of genetic analysis, it was proposed that as many as possible targeted populations for conservation be included. PMID:22942696
Shilaih, Mohaned; Marzel, Alex; Yang, Wan Lin; Scherrer, Alexandra U.; Schüpbach, Jörg; Böni, Jürg; Yerly, Sabine; Hirsch, Hans H.; Aubert, Vincent; Cavassini, Matthias; Klimkait, Thomas; Vernazza, Pietro L.; Bernasconi, Enos; Furrer, Hansjakob; Günthard, Huldrych F.; Kouyos, Roger; Battegay, Manuel; Braun, Dominique; Bucher, Heiner; Burton-Jeangros, Claudine; Calmy, Alexandra; Dollenmaier, Günter; Egger, Matthias; Elzi, Luigia; Fehr, Jan; Fellay, Jaque; Fux, Christoph; Gorgievski, Meri; Haerry, David; Hasse, Barbara; Hoffmann, Matthias; Hösli, Irene; Kahlert, Christian; Kaiser, Laurent; Keiser, Olivia; Kovari, Helen; Ledergerber, Bruno; Martinetti, Gladys; de Tejada, Begoña Martinez; Marzolini, Catia; Metzner, Karin; Müller, Nicolas; Nadal, David; Nicca, Dunja; Pantaleo, Giuseppe; Rauch, Andre; Regenass, Stephan; Rudin, Christoph; Schöni-Affolter, Franziska; Schmid, Patrick; Speck, Roberto; Stöckle, Marcel; Tarr, Philip; Trkola, Alexandra; Weber, Reiner
2016-01-01
Targeting hard-to-reach/marginalized populations is essential for preventing HIV-transmission. A unique opportunity to identify such populations in Switzerland is provided by a database of all genotypic-resistance-tests from Switzerland, including both sequences from the Swiss HIV Cohort Study (SHCS) and non-cohort sequences. A phylogenetic tree was built using 11,127 SHCS and 2,875 Swiss non-SHCS sequences. Demographics were imputed for non-SHCS patients using a phylogenetic proximity approach. Factors associated with non-cohort outbreaks were determined using logistic regression. Non-B subtype (univariable odds-ratio (OR): 1.9; 95% confidence interval (CI): 1.8–2.1), female gender (OR: 1.6; 95% CI: 1.4–1.7), black ethnicity (OR: 1.9; 95% CI: 1.7–2.1) and heterosexual transmission group (OR:1.8; 95% CI: 1.6–2.0), were all associated with underrepresentation in the SHCS. We found 344 purely non-SHCS transmission clusters, however, these outbreaks were small (median 2, maximum 7 patients) with a strong overlap with the SHCS’. 65% of non-SHCS sequences were part of clusters composed of >= 50% SHCS sequences. Our data suggests that marginalized-populations are underrepresented in the SHCS. However, the limited size of outbreaks among non-SHCS patients in-care implies that no major HIV outbreak in Switzerland was missed by the SHCS surveillance. This study demonstrates the potential of sequence data to assess and extend the scope of infectious-disease surveillance. PMID:27297284
Shilaih, Mohaned; Marzel, Alex; Yang, Wan Lin; Scherrer, Alexandra U; Schüpbach, Jörg; Böni, Jürg; Yerly, Sabine; Hirsch, Hans H; Aubert, Vincent; Cavassini, Matthias; Klimkait, Thomas; Vernazza, Pietro L; Bernasconi, Enos; Furrer, Hansjakob; Günthard, Huldrych F; Kouyos, Roger
2016-06-14
Targeting hard-to-reach/marginalized populations is essential for preventing HIV-transmission. A unique opportunity to identify such populations in Switzerland is provided by a database of all genotypic-resistance-tests from Switzerland, including both sequences from the Swiss HIV Cohort Study (SHCS) and non-cohort sequences. A phylogenetic tree was built using 11,127 SHCS and 2,875 Swiss non-SHCS sequences. Demographics were imputed for non-SHCS patients using a phylogenetic proximity approach. Factors associated with non-cohort outbreaks were determined using logistic regression. Non-B subtype (univariable odds-ratio (OR): 1.9; 95% confidence interval (CI): 1.8-2.1), female gender (OR: 1.6; 95% CI: 1.4-1.7), black ethnicity (OR: 1.9; 95% CI: 1.7-2.1) and heterosexual transmission group (OR:1.8; 95% CI: 1.6-2.0), were all associated with underrepresentation in the SHCS. We found 344 purely non-SHCS transmission clusters, however, these outbreaks were small (median 2, maximum 7 patients) with a strong overlap with the SHCS'. 65% of non-SHCS sequences were part of clusters composed of >= 50% SHCS sequences. Our data suggests that marginalized-populations are underrepresented in the SHCS. However, the limited size of outbreaks among non-SHCS patients in-care implies that no major HIV outbreak in Switzerland was missed by the SHCS surveillance. This study demonstrates the potential of sequence data to assess and extend the scope of infectious-disease surveillance.
A Public Health Model for the Molecular Surveillance of HIV Transmission in San Diego, California
May, Susanne; Tweeten, Samantha; Drumright, Lydia; Pacold, Mary E.; Kosakovsky Pond, Sergei L.; Pesano, Rick L.; Lie, Yolanda S.; Richman, Douglas D.; Frost, Simon D.W.; Woelk, Christopher H.; Little, Susan J.
2009-01-01
Background Current public health efforts often use molecular technologies to identify and contain communicable disease networks, but not for HIV. Here, we investigate how molecular epidemiology can be used to identify highly-related HIV networks within a population and how voluntary contact tracing of sexual partners can be used to selectively target these networks. Methods We evaluated the use of HIV-1 pol sequences obtained from participants of a community-recruited cohort (n=268) and a primary infection research cohort (n=369) to define highly related transmission clusters and the use of contact tracing to link other individuals (n=36) within these clusters. The presence of transmitted drug resistance was interpreted from the pol sequences (Calibrated Population Resistance v3.0). Results Phylogenetic clustering was conservatively defined when the genetic distance between any two pol sequences was <1%, which identified 34 distinct transmission clusters within the combined community-recruited and primary infection research cohorts containing 160 individuals. Although sequences from the epidemiologically-linked partners represented approximately 5% of the total sequences, they clustered with 60% of the sequences that clustered from the combined cohorts (O.R. 21.7; p<0.01). Major resistance to at least one class of antiretroviral medication was found in 19% of clustering sequences. Conclusions Phylogenetic methods can be used to identify individuals who are within highly related transmission groups, and contact tracing of epidemiologically-linked partners of recently infected individuals can be used to link into previously-defined transmission groups. These methods could be used to implement selectively targeted prevention interventions. PMID:19098493
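The clustering rule in the Results above (link two pol sequences when their genetic distance is below 1%) can be sketched as threshold-based single-linkage grouping. Several things here are assumptions rather than the study's method: the abstract does not state the distance model, so an uncorrected p-distance is used; treating linked pairs transitively (connected components) is an interpretation; and the helper names and toy sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping positions with a gap or N."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else float("nan")


def threshold_clusters(seqs, threshold=0.01):
    """Link sequences with distance < threshold; return components of size >= 2."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


def mutate(seq, positions):
    """Toy helper: change the base at each listed position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for i in positions:
        chars[i] = swap[chars[i]]
    return "".join(chars)


# Made-up 200-nt sequences, not HIV-1 pol data.
base = "ACGT" * 50
toy = {
    "p1": base,
    "p2": mutate(base, [0]),                            # 0.5% from p1
    "p3": mutate(base, range(100, 110)),                # 5% from p1
    "p4": mutate(base, list(range(100, 110)) + [150]),  # 0.5% from p3
    "p5": mutate(base, range(0, 200, 2)),               # distant singleton
}
clusters = threshold_clusters(toy)
print(clusters)
```

The pairwise loop is quadratic in the number of sequences, which is acceptable for cohorts of a few hundred sequences such as those described here.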
Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing
2012-01-01
Background MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris. Results Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs, 114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies. Conclusions This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes. PMID:22394504
Nabavi, Reza; Conneely, Brendan; McCarthy, Elaine; Good, Barbara; Shayan, Parviz; DE Waal, Theo
2014-09-01
Accurate identification of sheep nematodes is a critical point in epidemiological studies and monitoring of drug resistance in flocks. However, due to a close morphological similarity between the eggs and larval stages of many of these nematodes, such identification is not a trivial task. There are a number of studies showing that molecular targets in ribosomal DNA (Internal transcribed spacer 1, 2 and Intergenic spacer) are suitable for accurate identification of sheep bursate nematodes. The objective of the present study was to compare the ITS1, ITS2 and IGS regions of Iranian common bursate nematodes in order to choose the best target for specific identification methods. The first and second internal transcribed spacers (ITS1 and ITS2) and intergenic spacer (IGS) of the ribosomal DNA (rDNA) of 5 common Iranian bursate nematodes of sheep were sequenced. The sequences of some non-Iranian isolates were used for comparison in order to evaluate the variation in sequence homology between geographically different nematode populations. Comparison of the ITS1 and ITS2 sequences of Iranian nematodes showed the greatest similarity between Teladorsagia circumcincta and Marshallagia marshalli (94% and 88%, respectively), whereas Trichostrongylus colubriformis and M. marshalli showed the highest homology (99%) in the IGS sequences. Comparison of the spacer sequences of Iranian with non-Iranian isolates showed significantly higher variation in Haemonchus contortus compared to the other species. Both the ITS1 and ITS2 sequences are convenient targets for species-specific identification of Iranian bursate nematodes. On the other hand, the IGS region may be a less suitable molecular target.
de Lillo, A.; Booth, V.; Kyriacou, L.; Weightman, A. J.; Wade, W. G.
2004-01-01
Periodontitis is the commonest bacterial disease of humans and is the major cause of adult tooth loss. About half of the oral microflora is unculturable; and 16S rRNA PCR, cloning, and sequencing techniques have demonstrated the high level of species richness of the oral microflora. In the present study, a PCR primer set specific for the genera Porphyromonas and Tannerella was designed and used to analyze the bacterial populations in subgingival plaque samples from inflamed shallow and deep sites in subjects with periodontitis and shallow sites in age- and sex-matched controls. A total of 308 clones were sequenced and found to belong to one of six Porphyromonas or Tannerella species or phylotypes, one of which, Porphyromonas P3, was novel. Tannerella forsythensis was found in significantly higher proportions in patients than in controls. Porphyromonas catoniae and Tannerella phylotype BU063 appeared to be associated with shallow sites. Targeted culture-independent molecular ecology studies have a valuable role to play in the identification of bacterial targets for further investigations of the pathogenesis of bacterial infections. PMID:15583276
[Genetic variants in miRNAs and its association with breast cancer].
Méndez-Gómez, Susana; Ruiz Esparza-Garrido, Ruth; Velázquez-Flores, Miguel; Dolores-Vergara, Maria; Salamanca-Gómez, Fabio; Arenas-Aranda, Diego Julio
2014-01-01
In Mexico, breast cancer represents the leading cause of cancer death in females. At the molecular level, non-coding RNAs and especially microRNAs have played an important role in the origin and development of this neoplasm. In the Anglo-Saxon population, diverse genetic variants in microRNA genes and in their targets are associated with the development of this disease. In the Mexican population it is not known if these or other variants exist. Identification of these or new variants in our population is fundamental in order to have a better understanding of cancer development and to help establish a better diagnostic strategy. DNA was isolated from mammary tumors, adjacent tissue and peripheral blood of Mexican females with or without cancer. From DNA, five microRNA genes and three of their targets were amplified and sequenced. Genetic variants associated with breast cancer in an Anglo-Saxon population have been previously identified in these sequences. In the samples studied we identified seven single nucleotide polymorphisms (SNPs). Two had not been previously described and were identified only in women with cancer. The new variants may be genetic predisposition factors for the development of breast cancer in our population. Further experiments are needed to determine the involvement of these variants in the development, establishment and progression of breast cancer.
Laser pulses for coherent xuv Raman excitation
NASA Astrophysics Data System (ADS)
Greenman, Loren; Koch, Christiane P.; Whaley, K. Birgitta
2015-07-01
We combine multichannel electronic structure theory with quantum optimal control to derive femtosecond-time-scale Raman pulse sequences that coherently populate a valence excited state. For a neon atom, Raman target populations of up to 13% are obtained. Superpositions of the ground and valence Raman states with a controllable relative phase are found to be reachable with up to 4.5% population and arbitrary phase control facilitated by the pump pulse carrier-envelope phase. Analysis of the optimized pulse structure reveals a sequential mechanism in which the valence excitation is reached via a fast (femtosecond) population transfer through an intermediate resonance state in the continuum rather than avoiding intermediate-state population with simultaneous or counterintuitive (stimulated Raman adiabatic passage) pulse sequences. Our results open a route to coupling valence excitations and core-hole excitations in molecules and aggregates that locally address specific atoms and represent an initial step towards realization of multidimensional spectroscopy in the xuv and x-ray regimes.
Devesse, Laurence; Ballard, David; Davenport, Lucinda; Riethorst, Immy; Mason-Buck, Gabriella; Syndercombe Court, Denise
2018-05-01
By using sequencing technology to genotype loci of forensic interest it is possible to simultaneously target autosomal, X and Y STRs as well as identity, ancestry and phenotypic informative SNPs, resulting in a breadth of data obtained from a single run that is considerable when compared to that generated with standard technologies. It is important, however, that this information aligns with the genotype data currently obtained using commercially available kits for CE-based investigations, such that results are compatible with existing databases and hence can be of use to the forensic community. In this work, 400 samples were typed using commercially available STR kits and CE, as well as using the Illumina ForenSeq™ DNA Signature Prep Kit and MiSeq® FGx, to assess concordance of autosomal STRs and population variability. Results show a concordance rate between the two technologies exceeding 99.98%, while numerous novel sequence-based alleles are described. In order to make use of the sequence variation observed, sequence-specific allele frequencies were generated for White British and British Chinese populations. Copyright © 2017 Elsevier B.V. All rights reserved.
Alcántara-de la Cruz, Ricardo; Fernández-Moreno, Pablo T.; Ozuna, Carmen V.; Rojano-Delgado, Antonia M.; Cruz-Hipolito, Hugo E.; Domínguez-Valenzuela, José A.; Barro, Francisco; De Prado, Rafael
2016-01-01
In 2014, hairy beggarticks (Bidens pilosa L.) was identified as being glyphosate-resistant in citrus orchards from Mexico. The target and non-target site mechanisms involved in the response to glyphosate of two resistant populations (R1 and R2) and one susceptible population (S) were studied. Experiments of dose-response, shikimic acid accumulation, uptake-translocation, enzyme activity and 5-enolpyruvyl shikimate-3-phosphate synthase (EPSPS) gene sequencing were carried out in each population. The R1 and R2 populations were 20.4 and 2.8-fold less glyphosate sensitive, respectively, than the S population. The resistant populations showed lower shikimic acid accumulation than the S population. In the S population, 24.9% of 14C-glyphosate was translocated to the roots at 96 h after treatment; in the R1 and R2 populations only 12.9 and 15.5%, respectively, was translocated. Qualitative results confirmed the reduced 14C-glyphosate translocation in the resistant populations. The EPSPS enzyme activity of the S population was 128.4 and 8.5-fold higher than the R1 and R2 populations of glyphosate-treated plants, respectively. A single (Pro-106-Ser) and a double (Thr-102-Ile followed by Pro-106-Ser) mutation were identified in the EPSPS2 gene, conferring high resistance in the R1 population. Target-site mutations associated with a reduced translocation were responsible for the higher glyphosate resistance in the R1 population. The low-intermediate resistance of the R2 population was mediated by reduced translocation. This is the first glyphosate resistance case confirmed in hairy beggarticks in the world. PMID:27752259
PCR detection of uncultured rumen bacteria.
Rosero, Jaime A; Strosová, Lenka; Mrázek, Jakub; Fliegerová, Kateřina; Kopečný, Jan
2012-07-01
16S rRNA sequences of ruminal uncultured bacterial clones from public databases were phylogenetically examined. The sequences were found to form two unique clusters not affiliated with any known bacterial species: a cluster of unidentified sequences of free-floating rumen fluid uncultured bacteria (FUB) and a cluster of unidentified sequences of bacteria associated with rumen epithelium (AUB). A set of PCR primers targeting 16S rRNA of ruminal free uncultured bacteria and rumen epithelium adhering uncultured bacteria was designed based on these sequences. FUB primers were used for relative quantification of uncultured bacteria in ovine rumen samples. Efforts to increase the population size of the FUB group were successful in sulfate-reducing broth and in culture media supplied with cellulose.
Rubio, Justin P.; Topp, Simon; Warren, Liling; St Jean, Pamela L.; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R.; Ehm, Margaret G.; Chissoe, Stephanie L.; Whittaker, John C.; Nelson, Matthew R.; Mooser, Vincent E.
2012-01-01
Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single nucleotide variants (SNVs), 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Amongst Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. PMID:22415848
Rubio, Justin P; Topp, Simon; Warren, Liling; St Jean, Pamela L; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R; Ehm, Margaret G; Chissoe, Stephanie L; Whittaker, John C; Nelson, Matthew R; Mooser, Vincent E
2012-07-01
Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single-nucleotide variants, 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Among Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. © 2012 Wiley Periodicals, Inc.
A massively parallel strategy for STR marker development, capture, and genotyping.
Kistler, Logan; Johnson, Stephen M; Irwin, Mitchell T; Louis, Edward E; Ratan, Aakrosh; Perry, George H
2017-09-06
Short tandem repeat (STR) variants are highly polymorphic markers that facilitate powerful population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here, we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without a reference genome, and an approach for highly parallel target STR recovery. We employed our approach to capture a panel of 5000 STRs from a test group of diademed sifakas (Propithecus diadema, n = 3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci: 97.3-99.6% of STRs characterized with ≥10x non-redundant sequence coverage. We then tested our STR capture strategy on P. diadema fecal DNA, and report robust initial results and suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from flanking regions. Our method provides a cost-effective and scalable solution for rapid recovery of large STR and SNP datasets in any species without needing a reference genome, and can be used even with suboptimal DNA more easily acquired in conservation and ecological studies. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.
Quantitative Tracking of Combinatorially Engineered Populations with Multiplexed Binary Assemblies.
Zeitoun, Ramsey I; Pines, Gur; Grau, William C; Gill, Ryan T
2017-04-21
Advances in synthetic biology and genomics have enabled full-scale genome engineering efforts on laboratory time scales. However, the absence of sufficient approaches for mapping engineered genomes at system-wide scales onto performance has limited the adoption of more sophisticated algorithms for engineering complex biological systems. Here we report on the development and application of a robust approach to quantitatively map combinatorially engineered populations at scales up to several dozen target sites. This approach works by assembling genome engineered sites with cell-specific barcodes into a format compatible with high-throughput sequencing technologies. This approach, called barcoded-TRACE (bTRACE) was applied to assess E. coli populations engineered by recursive multiplex recombineering across both 6-target sites and 31-target sites. The 31-target library was then tracked throughout growth selections in the presence and absence of isopentenol (a potential next-generation biofuel). We also use the resolution of bTRACE to compare the influence of technical and biological noise on genome engineering efforts.
Liu, Jing; Wang, Zheng; He, Guanglin; Zhao, Xueying; Wang, Mengge; Luo, Tao; Li, Chengtao; Hou, Yiping
2018-07-01
Massively parallel sequencing (MPS) technologies can sequence many targeted regions of multiple samples simultaneously and are gaining great interest in the forensic community. The Precision ID Identity Panel contains 90 autosomal SNPs and 34 upper Y-Clade SNPs, and was designed with small amplicons and optimized for forensic degraded or challenging samples. Here, 184 unrelated individuals from three East Asian minority ethnicities (Tibetan, Uygur and Hui) were analyzed using the Precision ID Identity Panel and the Ion PGM System. The sequencing performance and corresponding forensic statistical parameters of this MPS-SNP panel were investigated. The inter-population relationships and substructures among three investigated populations and 30 worldwide populations were further investigated using PCA, MDS, cladogram and STRUCTURE. No significant deviations were observed in the Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) tests across all 90 autosomal SNPs. The combined matching probabilities (CMP) for Tibetan, Uygur and Hui were 2.5880 × 10^-33, 1.7480 × 10^-35 and 4.6326 × 10^-34, respectively, and the combined powers of exclusion (CPE) were 0.999999386152271, 0.999999607712827 and 0.999999696360182, respectively. For 34 Y-SNPs, only 16 haplogroups were obtained, but the haplogroup distributions differ among the three populations. Tibetans, from the Sino-Tibetan population, and Hui, an admixed population of multiple ethnicities, have genetic affinity with East Asian populations, while Uygurs, a Eurasian admixed population, have similar genetic components to the South Asian populations and are distributed between East Asian and European populations. The aforementioned results suggest that the Precision ID Identity Panel is informative and polymorphic in three investigated populations and could be used as an effective tool for human forensics. Copyright © 2018 Elsevier B.V. All rights reserved.
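As a rough illustration of how combined statistics such as the CMP and CPE quoted in the abstract above are commonly assembled from per-locus values, the following Python sketch multiplies per-SNP quantities across loci. It assumes biallelic SNPs in Hardy-Weinberg equilibrium and independence between loci; the function names and the example allele frequencies are hypothetical and are not taken from the study, and this is not necessarily the exact procedure the authors used.

```python
from math import prod


def snp_match_probability(p: float) -> float:
    """Random match probability for one biallelic SNP.

    Assumes Hardy-Weinberg proportions: genotype frequencies are
    p^2, 2pq and q^2, and the match probability is the sum of their squares.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def combined_match_probability(allele_freqs) -> float:
    """Product of per-locus match probabilities (loci assumed independent)."""
    return prod(snp_match_probability(p) for p in allele_freqs)


def combined_power_of_exclusion(per_locus_pe) -> float:
    """CPE = 1 - prod(1 - PE_i), given per-locus powers of exclusion."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_pe)


if __name__ == "__main__":
    # Hypothetical allele frequencies for three SNPs (illustration only).
    example_freqs = [0.5, 0.4, 0.3]
    print("CMP:", combined_match_probability(example_freqs))
    # Hypothetical per-locus PE values (illustration only).
    print("CPE:", combined_power_of_exclusion([0.18, 0.17, 0.15]))
```

Because each per-locus match probability is below 1, the product shrinks quickly as loci are added, which is why a panel of 90 SNPs can reach values many orders of magnitude below those of any single marker.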
Mitochondrial targeting sequence variants of the CHCHD2 gene are a risk for Lewy body disorders
Ogaki, Kotaro; Koga, Shunsuke; Heckman, Michael G.; Fiesel, Fabienne C.; Ando, Maya; Labbé, Catherine; Lorenzo-Betancor, Oswaldo; Moussaud-Lamodière, Elisabeth L.; Soto-Ortolaza, Alexandra I.; Walton, Ronald L.; Strongosky, Audrey J.; Uitti, Ryan J.; McCarthy, Allan; Lynch, Timothy; Siuda, Joanna; Opala, Grzegorz; Rudzinska, Monika; Krygowska-Wajs, Anna; Barcikowska, Maria; Czyzewski, Krzysztof; Puschmann, Andreas; Nishioka, Kenya; Funayama, Manabu; Hattori, Nobutaka; Parisi, Joseph E.; Petersen, Ronald C.; Graff-Radford, Neill R.; Boeve, Bradley F.; Springer, Wolfdieter; Wszolek, Zbigniew K.; Dickson, Dennis W.
2015-01-01
Objective: To assess the role of CHCHD2 variants in patients with Parkinson disease (PD) and Lewy body disease (LBD) in Caucasian populations. Methods: All exons of the CHCHD2 gene were sequenced in a US Caucasian patient-control series (878 PD, 610 LBD, and 717 controls). Subsequently, exons 1 and 2 were sequenced in an Irish series (355 PD and 365 controls) and a Polish series (394 PD and 350 controls). Immunohistochemistry and immunofluorescence studies were performed on pathologic LBD cases with rare CHCHD2 variants. Results: We identified 9 rare exonic variants of unknown significance. These variants were more frequent in the combined group of PD and LBD patients compared to controls (0.6% vs 0.1%, p = 0.013). In addition, the presence of any rare variant was more common in patients with LBD (2.5% vs 1.0%, p = 0.050) compared to controls. Eight of these 9 variants were located within the gene's mitochondrial targeting sequence. Conclusions: Although the role of variants of the CHCHD2 gene in PD and LBD remains to be further elucidated, the rare variants in the mitochondrial targeting sequence may be a risk factor for Lewy body disorders, which may link CHCHD2 to other genetic forms of parkinsonism with mitochondrial dysfunction. PMID:26561290
Chen, Shun-Li; Wu, Shiaw-Lin; Huang, Li-Juan; Huang, Jia-Bao; Chen, Shu-Hui
2013-06-01
Liquid chromatography-tandem mass spectrometry-based proteomics for peptide mapping and sequencing was used to characterize the marketed monoclonal antibody trastuzumab and compare it with two biosimilar products, mAb A containing D359E and L361M variations at the Fc site and mAb B without variants. Complete sequence coverage (100%) including disulfide linkages, glycosylations and other commonly occurring modifications (i.e., deamidation, oxidation, dehydration and K-clipping) were identified using maps generated from multi-enzyme digestions. In addition to the targeted comparison for the relative populations of targeted modification forms, a non-targeted approach was used to globally compare ion intensities in tryptic maps. The non-targeted comparison provided an extra-dimensional view to examine any possible differences related to variants or modifications. A peptide containing the two variants in mAb A, D359E and L361M, was revealed using the non-targeted comparison of the tryptic maps. In contrast, no significant differences were observed when trastuzumab was self-compared or compared with mAb B. These results were consistent with the data derived from peptide sequencing via collision induced dissociation/electron transfer dissociation. Thus, combined targeted and non-targeted approaches using powerful mass spectrometry-based proteomic tools hold great promise for the structural characterization of biosimilar products. Copyright © 2013 Elsevier B.V. All rights reserved.
Dynamics of adaptive immunity against phage in bacterial populations
NASA Astrophysics Data System (ADS)
Bradde, Serena; Vucelja, Marija; Tesileanu, Tiberiu; Balasubramanian, Vijay
The CRISPR (clustered regularly interspaced short palindromic repeats) mechanism allows bacteria to adaptively defend against phages by acquiring short genomic sequences (spacers) that target specific sequences in the viral genome. We propose a population dynamical model where immunity can be both acquired and lost. The model predicts regimes where bacterial and phage populations can co-exist, others where the populations oscillate, and still others where one population is driven to extinction. Our model considers two key parameters: (1) ease of acquisition and (2) spacer effectiveness in conferring immunity. Analytical calculations and numerical simulations show that if spacers differ mainly in ease of acquisition, or if the probability of acquiring them is sufficiently high, bacteria develop a diverse population of spacers. On the other hand, if spacers differ mainly in their effectiveness, their final distribution will be highly peaked, akin to a "winner-take-all" scenario, leading to a specialized spacer distribution. Bacteria can interpolate between these limiting behaviors by actively tuning their overall acquisition rate.
Alkowari, Moza K; Vozzi, Diego; Bhagat, Shruti; Krishnamoorthy, Navaneethakrishnan; Morgan, Anna; Hayder, Yousra; Logendra, Barathy; Najjar, Nehal; Gandin, Ilaria; Gasparini, Paolo; Badii, Ramin; Girotto, Giorgia; Abdulhadi, Khalid
2017-08-01
Hereditary hearing loss (HHL) is characterized by a very high genetic heterogeneity. In the Qatari population the role of GJB2, the major contributor to HHL worldwide, seems to be quite limited compared to Caucasian populations. In this study we analysed 18 Qatari families affected by non-syndromic hearing loss using a targeted sequencing approach that allowed us to analyse 81 genes simultaneously. Thanks to this approach, 50% of these families (9 out of 18) tested positive for the presence of likely causative alleles in 6 different genes: CDH23, MYO6, GJB6, OTOF, TMC1 and OTOA. In particular, 4 novel alleles were detected, while the remaining ones had already been described as associated with HHL in other ethnic groups. Molecular modelling was used to further investigate the role of novel alleles identified in the CDH23 and TMC1 genes, demonstrating their crucial role in Ca2+ binding and therefore a possible functional role in the proteins. The present study showed that an accurate molecular diagnosis based on next generation sequencing technologies might largely improve molecular diagnostic outcomes, leading to benefits for both genetic counseling and definition of recurrence risk. Copyright © 2017 Elsevier B.V. All rights reserved.
Hierarchy and extremes in selections from pools of randomized proteins
Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier
2016-01-01
Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different “frameworks” typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution). PMID:26969726
Hierarchy and extremes in selections from pools of randomized proteins.
Boyer, Sébastien; Biswas, Dipanwita; Kumar Soshee, Ananda; Scaramozzino, Natale; Nizak, Clément; Rivoire, Olivier
2016-03-29
Variation and selection are the core principles of Darwinian evolution, but quantitatively relating the diversity of a population to its capacity to respond to selection is challenging. Here, we examine this problem at a molecular level in the context of populations of partially randomized proteins selected for binding to well-defined targets. We built several minimal protein libraries, screened them in vitro by phage display, and analyzed their response to selection by high-throughput sequencing. A statistical analysis of the results reveals two main findings. First, libraries with the same sequence diversity but built around different "frameworks" typically have vastly different responses; second, the distribution of responses of the best binders in a library follows a simple scaling law. We show how an elementary probabilistic model based on extreme value theory rationalizes the latter finding. Our results have implications for designing synthetic protein libraries, estimating the density of functional biomolecules in sequence space, characterizing diversity in natural populations, and experimentally investigating evolvability (i.e., the potential for future evolution).
Maisano Delser, Pierpaolo; Corrigan, Shannon; Hale, Matthew; Li, Chenhong; Veuille, Michel; Planes, Serge; Naylor, Gavin; Mona, Stefano
2016-01-01
Population genetics studies on non-model organisms typically involve sampling few markers from multiple individuals. Next-generation sequencing approaches open up the possibility of sampling many more markers from fewer individuals to address the same questions. Here, we applied a target gene capture method to deep sequence ~1000 independent autosomal regions of a non-model organism, the blacktip reef shark (Carcharhinus melanopterus). We devised a sampling scheme based on the predictions of theoretical studies of metapopulations to show that sampling few individuals, but many loci, can be extremely informative to reconstruct the evolutionary history of species. We collected data from a single deme (SID) from Northern Australia and from a scattered sampling representing various locations throughout the Indian Ocean (SCD). We explored the genealogical signature of population dynamics detected from both sampling schemes using an ABC algorithm. We then contrasted these results with those obtained by fitting the data to a non-equilibrium finite island model. Both approaches supported an Nm value ~40, consistent with philopatry in this species. Finally, we demonstrate through simulation that metapopulations exhibit greater resilience to recent changes in effective size compared to unstructured populations. We propose an empirical approach to detect recent bottlenecks based on our sampling scheme. PMID:27651217
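To give a sense of scale for an Nm value of about 40, the sketch below applies Wright's textbook infinite-island approximation, FST ≈ 1 / (4Nm + 1), for diploid nuclear loci at equilibrium. This is an assumption made purely for illustration: the abstract above reports estimates from ABC and from a non-equilibrium finite island model, which this simple formula does not reproduce, and the function name is hypothetical.

```python
def fst_island_equilibrium(nm: float) -> float:
    """Wright's infinite-island approximation: FST ~ 1 / (4 * Nm + 1).

    nm is the effective number of migrants per generation (N times m).
    Valid only as a rough equilibrium guide for diploid nuclear loci.
    """
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (4.0 * nm + 1.0)


if __name__ == "__main__":
    for nm in (1, 10, 40):
        print(f"Nm = {nm:>3}: expected FST ~ {fst_island_equilibrium(nm):.4f}")
```

Under this approximation, larger Nm values correspond to weaker expected differentiation among demes.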
Maisano Delser, Pierpaolo; Corrigan, Shannon; Hale, Matthew; Li, Chenhong; Veuille, Michel; Planes, Serge; Naylor, Gavin; Mona, Stefano
2016-09-21
Population genetics studies on non-model organisms typically involve sampling few markers from multiple individuals. Next-generation sequencing approaches open up the possibility of sampling many more markers from fewer individuals to address the same questions. Here, we applied a target gene capture method to deep sequence ~1000 independent autosomal regions of a non-model organism, the blacktip reef shark (Carcharhinus melanopterus). We devised a sampling scheme based on the predictions of theoretical studies of metapopulations to show that sampling few individuals, but many loci, can be extremely informative to reconstruct the evolutionary history of species. We collected data from a single deme (SID) from Northern Australia and from a scattered sampling representing various locations throughout the Indian Ocean (SCD). We explored the genealogical signature of population dynamics detected from both sampling schemes using an ABC algorithm. We then contrasted these results with those obtained by fitting the data to a non-equilibrium finite island model. Both approaches supported an Nm value ~40, consistent with philopatry in this species. Finally, we demonstrate through simulation that metapopulations exhibit greater resilience to recent changes in effective size compared to unstructured populations. We propose an empirical approach to detect recent bottlenecks based on our sampling scheme.
Many Routes to an Antibody Heavy-Chain CDR3: Necessary, Yet Insufficient, for Specific Binding
D'Angelo, Sara; Ferrara, Fortunato; Naranjo, Leslie; ...
2018-03-08
Because of its great potential for diversity, the immunoglobulin heavy-chain complementarity-determining region 3 (HCDR3) is taken as an antibody molecule’s most important component in conferring binding activity and specificity. For this reason, HCDR3s have been used as unique identifiers to investigate adaptive immune responses in vivo and to characterize in vitro selection outputs where display systems were employed. Here, we show that many different HCDR3s can be identified within a target-specific antibody population after in vitro selection. For each identified HCDR3, a number of different antibodies bearing differences elsewhere can be found. In such selected populations, all antibodies with the same HCDR3 recognize the target, albeit at different affinities. In contrast, within unselected populations, the majority of antibodies with the same HCDR3 sequence do not bind the target. In one HCDR3 examined in depth, all target-specific antibodies were derived from the same VDJ rearrangement, while non-binding antibodies with the same HCDR3 were derived from many different V and D gene rearrangements. Careful examination of previously published in vivo datasets reveals that HCDR3s shared between, and within, different individuals can also originate from rearrangements of different V and D genes, with up to 26 different rearrangements yielding the same identical HCDR3 sequence. On the basis of these observations, we conclude that the same HCDR3 can be generated by many different rearrangements, but that specific target binding is an outcome of unique rearrangements and VL pairing: the HCDR3 is necessary, albeit insufficient, for specific antibody binding.
Many Routes to an Antibody Heavy-Chain CDR3: Necessary, Yet Insufficient, for Specific Binding
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Angelo, Sara; Ferrara, Fortunato; Naranjo, Leslie
Because of its great potential for diversity, the immunoglobulin heavy-chain complementarity-determining region 3 (HCDR3) is taken as an antibody molecule’s most important component in conferring binding activity and specificity. For this reason, HCDR3s have been used as unique identifiers to investigate adaptive immune responses in vivo and to characterize in vitro selection outputs where display systems were employed. Here, we show that many different HCDR3s can be identified within a target-specific antibody population after in vitro selection. For each identified HCDR3, a number of different antibodies bearing differences elsewhere can be found. In such selected populations, all antibodies with the same HCDR3 recognize the target, albeit at different affinities. In contrast, within unselected populations, the majority of antibodies with the same HCDR3 sequence do not bind the target. In one HCDR3 examined in depth, all target-specific antibodies were derived from the same VDJ rearrangement, while non-binding antibodies with the same HCDR3 were derived from many different V and D gene rearrangements. Careful examination of previously published in vivo datasets reveals that HCDR3s shared between, and within, different individuals can also originate from rearrangements of different V and D genes, with up to 26 different rearrangements yielding the same identical HCDR3 sequence. On the basis of these observations, we conclude that the same HCDR3 can be generated by many different rearrangements, but that specific target binding is an outcome of unique rearrangements and VL pairing: the HCDR3 is necessary, albeit insufficient, for specific antibody binding.
Determining mutation density using Restriction Enzyme Sequence Comparative Analysis (RESCAN)
USDA-ARS's Scientific Manuscript database
The average mutation density of a mutant population is a major consideration when developing resources for the efficient, cost-effective implementation of reverse genetics methods such as Targeting of Induced Local Lesions in Genomes (TILLING). Reliable estimates of mutation density can be achieved ...
Meisner, Joshua K.; Price, Richard J.
2010-01-01
Arterial occlusive disease (AOD) is the leading cause of morbidity and mortality throughout the developed world, which creates a significant need for effective therapies to halt disease progression. Despite the success of animal and small-scale human therapeutic arteriogenesis studies, this promising concept for treating AOD has yielded largely disappointing results in large-scale clinical trials. One reason for this lack of successful translation is that endogenous arteriogenesis is highly dependent on a poorly understood sequence of events and interactions between bone marrow derived cells (BMCs) and vascular cells, which makes designing effective therapies difficult. We contend that the process follows a complex, ordered sequence of events with multiple, specific BMC populations recruited at specific times and locations. Here we present the evidence suggesting roles for multiple BMC populations, from neutrophils and mast cells to progenitor cells, and propose how and where these cell populations fit within the sequence of events during arteriogenesis. Disruptions in these various BMC populations can impair the arteriogenesis process in patterns that characterize specific patient populations. We propose that an improved understanding of how arteriogenesis functions as a system can reveal individual BMC populations and functions that can be targeted for overcoming particular impairments in collateral vessel development. PMID:21044213
Evidence for label-retaining tumour-initiating cells in human glioblastoma
Deleyrolle, Loic P.; Harding, Angus; Cato, Kathleen; Siebzehnrubl, Florian A.; Rahman, Maryam; Azari, Hassan; Olson, Sarah; Gabrielli, Brian; Osborne, Geoffrey; Vescovi, Angelo
2011-01-01
Individual tumour cells display diverse functional behaviours in terms of proliferation rate, cell–cell interactions, metastatic potential and sensitivity to therapy. Moreover, sequencing studies have demonstrated surprising levels of genetic diversity between individual patient tumours of the same type. Tumour heterogeneity presents a significant therapeutic challenge as diverse cell types within a tumour can respond differently to therapies, and inter-patient heterogeneity may prevent the development of general treatments for cancer. One strategy that may help overcome tumour heterogeneity is the identification of tumour sub-populations that drive specific disease pathologies for the development of therapies targeting these clinically relevant sub-populations. Here, we have identified a dye-retaining brain tumour population that displays all the hallmarks of a tumour-initiating sub-population. Using a limiting dilution transplantation assay in immunocompromised mice, label-retaining brain tumour cells display elevated tumour-initiation properties relative to the bulk population. Importantly, tumours generated from these label-retaining cells exhibit all the pathological features of the primary disease. Together, these findings confirm dye-retaining brain tumour cells exhibit tumour-initiation ability and are therefore viable targets for the development of therapeutics targeting this sub-population. PMID:21515906
Microsatellite DNA capture from enriched libraries.
Gonzalez, Elena G; Zardoya, Rafael
2013-01-01
Microsatellites are DNA sequences of tandem repeats of one to six nucleotides, which are highly polymorphic, and thus the molecular markers of choice in many kinship, population genetic, and conservation studies. There have been significant technical improvements since the early methods for microsatellite isolation were developed, and today the most common procedures take advantage of the hybrid capture methods of enriched-targeted microsatellite DNA. Furthermore, recent advances in sequencing technologies (i.e., next-generation sequencing, NGS) have fostered the mining of microsatellite markers in non-model organisms, affording a cost-effective way of obtaining a large amount of sequence data potentially useful for loci characterization. The rapid improvements of NGS platforms together with the increase in available microsatellite information open new avenues to the understanding of the evolutionary forces that shape genetic structuring in wild populations. Here, we provide detailed methodological procedures for microsatellite isolation based on the screening of GT microsatellite-enriched libraries, either by cloning and Sanger sequencing of positive clones or by direct NGS. Guides for designing new species-specific primers and basic genotyping are also given.
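As a rough illustration of the definition above, the following Python sketch scans a sequence for perfect tandem repeats of one- to six-nucleotide units, such as GT repeats. It is not the protocol described in the chapter: the function name `find_microsatellites` and the default threshold of five repeats are assumptions made for this example only.

```python
import re


def find_microsatellites(seq, min_repeats=5, max_unit=6):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    min_repeats and max_unit are illustrative defaults, not values
    taken from the source protocol.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    example = "ACC" + "GT" * 6 + "TTA"
    for start, unit, count in find_microsatellites(example):
        print(start, unit, count)
```

Real libraries also contain imperfect and compound repeats, which this sketch deliberately ignores; dedicated microsatellite-mining tools handle those cases.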
Yeo, Matthew; Mauricio, Isabel L; Messenger, Louisa A; Lewis, Michael D; Llewellyn, Martin S; Acosta, Nidia; Bhattacharyya, Tapan; Diosque, Patricio; Carrasco, Hernan J; Miles, Michael A
2011-06-01
Multilocus sequence typing (MLST) is a powerful and highly discriminatory method for analysing pathogen population structure and epidemiology. Trypanosoma cruzi, the protozoan agent of American trypanosomiasis (Chagas disease), has remarkable genetic and ecological diversity. A standardised MLST protocol that is suitable for assignment of T. cruzi isolates to genetic lineage and for higher resolution diversity studies has not been developed. We have sequenced and diplotyped nine single copy housekeeping genes and assessed their value as part of a systematic MLST scheme for T. cruzi. A minimum panel of four MLST targets (Met-III, RB19, TcGPXII, and DHFR-TS) was shown to provide unambiguous assignment of isolates to the six known T. cruzi lineages (Discrete Typing Units, DTUs TcI-TcVI). In addition, we recommend six MLST targets (Met-II, Met-III, RB19, TcMPX, DHFR-TS, and TR) for more in depth diversity studies on the basis that diploid sequence typing (DST) with this expanded panel distinguished 38 out of 39 reference isolates. Phylogenetic analysis implies a subdivision between North and South American TcIV isolates. Single Nucleotide Polymorphism (SNP) data revealed high levels of heterozygosity among DTUs TcI, TcIII, TcIV and, for three targets, putative corresponding homozygous and heterozygous loci within DTUs TcI and TcIII. Furthermore, individual gene trees gave incongruent topologies at inter- and intra-DTU levels, inconsistent with a model of strict clonality. We demonstrate the value of systematic MLST diplotyping for describing inter-DTU relationships and for higher resolution diversity studies of T. cruzi, including presence of recombination events. The high levels of heterozygosity will facilitate future population genetics analysis based on MLST haplotypes.
Visualization and Sequencing of Membrane Remodeling Leading to Influenza Virus Fusion
Gui, Long; Ebner, Jamie L.; Mileant, Alexander; Williams, James A.
2016-01-01
ABSTRACT Protein-mediated membrane fusion is an essential step in many fundamental biological events, including enveloped virus infection. The nature of protein and membrane intermediates and the sequence of membrane remodeling during these essential processes remain poorly understood. Here we used cryo-electron tomography (cryo-ET) to image the interplay between influenza virus and vesicles with a range of lipid compositions. By following the population kinetics of membrane fusion intermediates imaged by cryo-ET, we found that membrane remodeling commenced with the hemagglutinin fusion protein spikes grappling onto the target membrane, followed by localized target membrane dimpling as local clusters of hemagglutinin started to undergo conformational refolding. The local dimples then transitioned to extended, tightly apposed contact zones where the two proximal membrane leaflets were in most cases indistinguishable from each other, suggesting significant dehydration and possible intermingling of the lipid head groups. Increasing the content of fusion-enhancing cholesterol or bis-monoacylglycerophosphate in the target membrane led to an increase in extended contact zone formation. Interestingly, hemifused intermediates were found to be extremely rare in the influenza virus fusion system studied here, most likely reflecting the instability of this state and its rapid conversion to postfusion complexes, which increased in population over time. By tracking the populations of fusion complexes over time, the architecture and sequence of membrane reorganization leading to efficient enveloped virus fusion were thus resolved. IMPORTANCE Enveloped viruses employ specialized surface proteins to mediate fusion of cellular and viral membranes that results in the formation of pores through which the viral genetic material is delivered to the cell. For influenza virus, the trimeric hemagglutinin (HA) glycoprotein spike mediates host cell attachment and membrane fusion. 
While structures of a subset of conformations and parts of the fusion machinery have been characterized, the nature and sequence of membrane deformations during fusion have largely eluded characterization. Building upon studies that focused on early stages of HA-mediated membrane remodeling, here cryo-electron tomography (cryo-ET) was used to image the three-dimensional organization of intact influenza virions at different stages of fusion with liposomes, leading all the way to completion of the fusion reaction. By monitoring the evolution of fusion intermediate populations over the course of acid-induced fusion, we identified the progression of membrane reorganization that leads to efficient fusion by an enveloped virus. PMID:27226364
Programmable removal of bacterial strains by use of genome-targeting CRISPR-Cas systems.
Gomaa, Ahmed A; Klumpe, Heidi E; Luo, Michelle L; Selle, Kurt; Barrangou, Rodolphe; Beisel, Chase L
2014-01-28
CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems in bacteria and archaea employ CRISPR RNAs to specifically recognize the complementary DNA of foreign invaders, leading to sequence-specific cleavage or degradation of the target DNA. Recent work has shown that the accidental or intentional targeting of the bacterial genome is cytotoxic and can lead to cell death. Here, we have demonstrated that genome targeting with CRISPR-Cas systems can be employed for the sequence-specific and titratable removal of individual bacterial strains and species. Using the type I-E CRISPR-Cas system in Escherichia coli as a model, we found that this effect could be elicited using native or imported systems and was similarly potent regardless of the genomic location, strand, or transcriptional activity of the target sequence. Furthermore, the specificity of targeting with CRISPR RNAs could readily distinguish between even highly similar strains in pure or mixed cultures. Finally, varying the collection of delivered CRISPR RNAs could quantitatively control the relative number of individual strains within a mixed culture. Critically, the observed selectivity and programmability of bacterial removal would be virtually impossible with traditional antibiotics, bacteriophages, selectable markers, or tailored growth conditions. Once delivery challenges are addressed, we envision that this approach could offer a novel means to quantitatively control the composition of environmental and industrial microbial consortia and may open new avenues for the development of "smart" antibiotics that circumvent multidrug resistance and differentiate between pathogenic and beneficial microorganisms. Controlling the composition of microbial populations is a critical aspect in medicine, biotechnology, and environmental cycles. 
While different antimicrobial strategies, such as antibiotics, antimicrobial peptides, and lytic bacteriophages, offer partial solutions, what remains elusive is a generalized and programmable strategy that can distinguish between even closely related microorganisms and that allows for fine control over the composition of a microbial population. This study demonstrates that RNA-directed immune systems in bacteria and archaea called CRISPR-Cas systems can provide such a strategy. These systems can be employed to selectively and quantitatively remove individual bacterial strains based purely on sequence information, creating opportunities in the treatment of multidrug-resistant infections, the control of industrial fermentations, and the study of microbial consortia.
Yousri, Noha A; Fakhro, Khalid A; Robay, Amal; Rodriguez-Flores, Juan L; Mohney, Robert P; Zeriri, Hassina; Odeh, Tala; Kader, Sara Abdul; Aldous, Eman K; Thareja, Gaurav; Kumar, Manish; Al-Shakaki, Alya; Chidiac, Omar M; Mohamoud, Yasmin A; Mezey, Jason G; Malek, Joel A; Crystal, Ronald G; Suhre, Karsten
2018-01-23
Metabolomics-genome-wide association studies (mGWAS) have uncovered many metabolic quantitative trait loci (mQTLs) influencing human metabolic individuality, though predominantly in European cohorts. By combining whole-exome sequencing with a high-resolution metabolomics profiling for a highly consanguineous Middle Eastern population, we discover 21 common variant and 12 functional rare variant mQTLs, of which 45% are novel altogether. We fine-map 10 common variant mQTLs to new metabolite ratio associations, and 11 common variant mQTLs to putative protein-altering variants. This is the first work to report common and rare variant mQTLs linked to diseases and/or pharmacological targets in a consanguineous Arab cohort, with wide implications for precision medicine in the Middle East.
In-depth resistome analysis by targeted metagenomics.
Lanza, Val F; Baquero, Fernando; Martínez, José Luís; Ramos-Ruíz, Ricardo; González-Zorn, Bruno; Andremont, Antoine; Sánchez-Valenzuela, Antonio; Ehrlich, Stanislav Dusko; Kennedy, Sean; Ruppé, Etienne; van Schaik, Willem; Willems, Rob J; de la Cruz, Fernando; Coque, Teresa M
2018-01-15
Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of "resistomes" (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems. However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below their detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics using last generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes. We developed ResCap, a targeted sequence capture platform based on SeqCapEZ (NimbleGene) technology, which includes probes for 8667 canonical resistance genes (7963 antibiotic resistance genes and 704 genes conferring resistance to metals or biocides), and 2517 relaxase genes (plasmid markers) and 78,600 genes homologous to the previous identified targets (47,806 for antibiotics and 30,794 for biocides or metals). Its performance was compared with metagenomic shotgun sequencing (MSS) for 17 fecal samples (9 humans, 8 swine). ResCap significantly improves MSS to detect "gene abundance" (from 2.0 to 83.2%) and "gene diversity" (26 versus 14.9 genes unequivocally detected per sample per million of reads; the number of reads unequivocally mapped increasing up to 300-fold by using ResCap), which were calculated using novel bioinformatic tools. ResCap also facilitated the analysis of novel genes potentially involved in the resistance to antibiotics, metals, biocides, or any combination thereof. 
ResCap, the first targeted sequence capture platform developed specifically to analyze resistomes, greatly enhances the sensitivity and specificity of available metagenomic methods and offers the possibility to analyze genes related to the selection and transfer of antimicrobial resistance (biocides, heavy metals, plasmids). The model opens the possibility to study other complex microbial systems in which minority populations play a relevant role.
Harrow, Sally A.; Ravindran, Velmurugu; Butler, Ruth C.; Marshall, John W.; Tannock, Gerald W.
2007-01-01
A real-time quantitative PCR assay targeting a 16S-23S intergenic spacer region sequence was devised to measure the sizes of populations of Lactobacillus salivarius present in ileal digesta collected from broiler chickens. This species has been associated with deconjugation of bile salts in the small bowel and reduced broiler productivity. The assay was tested as a means of monitoring the sizes of L. salivarius populations from broilers fed diets with different compositions, maintained at different stocking densities, or given the antimicrobial drugs bacitracin and monensin in the feed. Stocking densities did not influence the numbers of L. salivarius cells in the ileum. A diet containing meat and bone meal reduced the size of the L. salivarius population relative to that of chickens given the control diet, as did administration of bacitracin and monensin in the feed. These changes in the target bacterial population were associated with improved broiler weight gain. PMID:17890342
Murine mesenchymal and embryonic stem cells express a similar Hox gene profile.
Phinney, Donald G; Gray, Andrew J; Hill, Katy; Pandey, Amitabh
2005-12-30
Using degenerate oligonucleotide primers targeting the homeobox domain, we amplified by PCR and sequenced 723 clones from five murine cell populations and lines derived from embryonic mesoderm and adult bone marrow. Transcripts from all four vertebrate Hox clusters were expressed by the different populations. Hierarchical clustering of the data revealed that mesenchymal stem cells (MSCs) and the embryonic stem (ES) cell line D3 shared a similar Hox expression profile. These populations exclusively expressed Hoxb2, Hoxb5, Hoxb7, and Hoxc4, transcripts regulating self-renewal and differentiation of other stem cells. Additionally, Hoxa7 transcript quantified by real-time PCR strongly correlated (r² = 0.89) with the number of Hoxa7 clones identified by sequencing, validating that data from the PCR screen reflects differences in Hox mRNA abundance between populations. This is the first study to catalogue Hox transcripts in murine MSCs and by comparative analyses identify specific Hox genes that may contribute to their stem cell character.
Rivera-Torres, Natalia; Banas, Kelly; Bialk, Pawel; Bloh, Kevin M; Kmiec, Eric B
2017-01-01
CRISPR/Cas9 and single-stranded DNA oligonucleotides (ssODNs) have been used to direct the repair of a single base mutation in human genes. Here, we examine a method designed to increase the precision of RNA guided genome editing in human cells by utilizing a CRISPR/Cas9 ribonucleoprotein (RNP) complex to initiate DNA cleavage. The RNP is assembled in vitro and induces a double stranded break at a specific site surrounding the mutant base designated for correction by the ssODN. We use an integrated mutant eGFP gene, bearing a single base change rendering the expressed protein nonfunctional, as a single copy target in HCT 116 cells. We observe significant gene correction activity of the mutant base, promoted by the RNP and single-stranded DNA oligonucleotide with validation through genotypic and phenotypic readout. We demonstrate that all individual components must be present to obtain successful gene editing. Importantly, we examine the genotype of individually sorted corrected and uncorrected clonally expanded cell populations for the mutagenic footprint left by the action of these gene editing tools. While the DNA sequence of the corrected population is exact with no adjacent sequence modification, the uncorrected population exhibits heterogeneous mutagenicity with a wide variety of deletions and insertions surrounding the target site. We designate this type of DNA aberration as on-site mutagenicity. Analyses of two clonal populations bearing specific DNA insertions surrounding the target site, indicate that point mutation repair has occurred at the level of the gene. The phenotype, however, is not rescued because a section of the single-stranded oligonucleotide has been inserted altering the reading frame and generating truncated proteins. These data illustrate the importance of analysing mutagenicity in uncorrected cells. 
Our results also form the basis of a simple model for point mutation repair directed by a short single-stranded DNA oligonucleotide and a CRISPR/Cas9 ribonucleoprotein complex.
Rivera-Torres, Natalia; Bialk, Pawel; Bloh, Kevin M.; Kmiec, Eric B.
2017-01-01
CRISPR/Cas9 and single-stranded DNA oligonucleotides (ssODNs) have been used to direct the repair of a single base mutation in human genes. Here, we examine a method designed to increase the precision of RNA guided genome editing in human cells by utilizing a CRISPR/Cas9 ribonucleoprotein (RNP) complex to initiate DNA cleavage. The RNP is assembled in vitro and induces a double stranded break at a specific site surrounding the mutant base designated for correction by the ssODN. We use an integrated mutant eGFP gene, bearing a single base change rendering the expressed protein nonfunctional, as a single copy target in HCT 116 cells. We observe significant gene correction activity of the mutant base, promoted by the RNP and single-stranded DNA oligonucleotide with validation through genotypic and phenotypic readout. We demonstrate that all individual components must be present to obtain successful gene editing. Importantly, we examine the genotype of individually sorted corrected and uncorrected clonally expanded cell populations for the mutagenic footprint left by the action of these gene editing tools. While the DNA sequence of the corrected population is exact with no adjacent sequence modification, the uncorrected population exhibits heterogeneous mutagenicity with a wide variety of deletions and insertions surrounding the target site. We designate this type of DNA aberration as on-site mutagenicity. Analyses of two clonal populations bearing specific DNA insertions surrounding the target site, indicate that point mutation repair has occurred at the level of the gene. The phenotype, however, is not rescued because a section of the single-stranded oligonucleotide has been inserted altering the reading frame and generating truncated proteins. These data illustrate the importance of analysing mutagenicity in uncorrected cells. 
Our results also form the basis of a simple model for point mutation repair directed by a short single-stranded DNA oligonucleotide and a CRISPR/Cas9 ribonucleoprotein complex. PMID:28052104
Hamula, Camille L A; Peng, Hanyong; Wang, Zhixin; Tyrrell, Gregory J; Li, Xing-Fang; Le, X Chris
2016-03-15
Streptococcus pyogenes is a clinically important pathogen consisting of various serotypes determined by different M proteins expressed on the cell surface. The M type is therefore a useful marker to monitor the spread of invasive S. pyogenes in a population. Serotyping and nucleic acid amplification/sequencing methods for the identification of M types are laborious, inconsistent, and usually confined to reference laboratories. The primary objective of this work is to develop a technique that enables generation of aptamers binding to specific M-types of S. pyogenes. We describe here an in vitro technique that directly used live bacterial cells and the Systematic Evolution of Ligands by Exponential Enrichment (SELEX) strategy. Live S. pyogenes cells were incubated with DNA libraries consisting of 40-nucleotide randomized sequences. Those sequences that bound to the cells were separated, amplified using polymerase chain reaction (PCR), purified using gel electrophoresis, and served as the input DNA pool for the next round of SELEX selection. A specially designed forward primer containing extended polyA20/5Sp9 facilitated gel electrophoresis purification of ssDNA after PCR amplification. A counter-selection step using non-target cells was introduced to improve selectivity. DNA libraries of different starting sequence diversity (10^16 and 10^14) were compared. Aptamer pools from each round of selection were tested for their binding to the target and non-target cells using flow cytometry. Selected aptamer pools were then cloned and sequenced. Individual aptamer sequences were screened on the basis of their binding to the 10 M-types that were used as targets. Aptamer pools obtained from SELEX rounds 5-8 showed high affinity to the target S. pyogenes cells. Tests against non-target Streptococcus bovis, Streptococcus pneumoniae, and Enterococcus species demonstrated selectivity of these aptamers for binding to S. pyogenes. 
Several aptamer sequences were found to bind preferentially to the M11 M-type of S. pyogenes. Estimated binding dissociation constants (Kd) were in the low nanomolar range for the M11 specific sequences; for example, sequence E-CA20 had a Kd of 7±1 nM. These affinities are comparable to those of a monoclonal antibody. The improved bacterial cell-SELEX technique is successful in generating aptamers selective for S. pyogenes and some of its M-types. These aptamers are potentially useful for detecting S. pyogenes, achieving binding profiles of the various M-types, and developing new M-typing technologies for non-specialized laboratories or point-of-care testing.
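To give a feel for what a Kd of about 7 nM means, the sketch below evaluates a simple one-site equilibrium binding model, in which the fraction of occupied sites is [A] / (Kd + [A]). The one-site model, the assumption that free aptamer is in excess, and the function name `fraction_bound` are illustrative assumptions; the abstract does not state how the reported Kd values were fitted.

```python
def fraction_bound(aptamer_nM, kd_nM):
    """Fraction of binding sites occupied under a one-site equilibrium model.

    Assumes free aptamer concentration is approximately the total
    concentration (aptamer in excess); this is an illustrative model only.
    """
    if aptamer_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return aptamer_nM / (kd_nM + aptamer_nM)


if __name__ == "__main__":
    kd = 7.0  # nM, the value reported for sequence E-CA20
    for conc in (0.7, 7.0, 70.0):
        print(conc, round(fraction_bound(conc, kd), 3))
```

Under this model, half of the sites are occupied when the aptamer concentration equals Kd, which is why a low-nanomolar Kd indicates tight binding.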
Carrier screening in the era of expanding genetic technology.
Arjunan, Aishwarya; Litwack, Karen; Collins, Nick; Charrow, Joel
2016-12-01
The Center for Jewish Genetics provides genetic education and carrier screening to individuals of Jewish descent. Carrier screening has traditionally been performed by targeted mutation analysis for founder mutations with an enzyme assay for Tay-Sachs carrier detection. The development of next-generation sequencing (NGS) allows for higher detection rates regardless of ethnicity. Here, we explore differences in carrier detection rates between genotyping and NGS in a primarily Jewish population. Peripheral blood samples or saliva samples were obtained from 506 individuals. All samples were analyzed by sequencing, targeted genotyping, triplet-repeat detection, and copy-number analysis; the analyses were carried out at Counsyl. Of 506 individuals screened, 288 were identified as carriers of at least 1 condition and 8 couples were carriers for the same disorder. A total of 434 pathogenic variants were identified. Three hundred twelve variants would have been detected via genotyping alone. Although no additional mutations were detected by NGS in diseases routinely screened for in the Ashkenazi Jewish population, 26.5% of carrier results and 2 carrier couples would have been missed without NGS in the larger panel. In a primarily Jewish population, NGS reveals a larger number of pathogenic variants and provides individuals with valuable information for family planning. Genet Med 18(12), 1214-1217.
Legacy lost: genetic variability and population size of extirpated US grey wolves (Canis lupus).
Leonard, Jennifer A; Vilà, Carles; Wayne, Robert K
2005-01-01
By the mid 20th century, the grey wolf (Canis lupus) was exterminated from most of the conterminous United States (cUS) and Mexico. However, because wolves disperse over long distances, extant populations in Canada and Alaska might have retained a substantial proportion of the genetic diversity once found in the cUS. We analysed mitochondrial DNA sequences of 34 pre-extermination wolves and found that they had more than twice the diversity of their modern conspecifics, implying a historic population size of several hundred thousand wolves in the western cUS and Mexico. Further, two-thirds of the haplotypes found in the historic sample are unique. Sequences from Mexican grey wolves (C. l. baileyi) and some historic grey wolves defined a unique southern clade supporting a much wider geographical mandate for the reintroduction of Mexican wolves than currently planned. Our results highlight the genetic consequences of population extinction within Ice Age refugia and imply that restoration goals for grey wolves in the western cUS include far less area and target vastly lower population sizes than existed historically.
Pfaff, Florian; Müller, Thomas; Freuling, Conrad M; Fehlner-Gardiner, Christine; Nadin-Davis, Susan; Robardet, Emmanuelle; Cliquet, Florence; Vuta, Vlad; Hostnik, Peter; Mettenleiter, Thomas C; Beer, Martin; Höper, Dirk
2018-02-10
Live-attenuated rabies virus strains such as those derived from the field isolate Street Alabama Dufferin (SAD) have been used extensively and very effectively as oral rabies vaccines for the control of fox rabies in both Europe and Canada. Although these vaccines are safe, some cases of vaccine-derived rabies have been detected during rabies surveillance accompanying these campaigns. A recent analysis showed that some commercial SAD vaccines consist of diverse viral populations, rather than clonal genotypes. For cases of vaccine-derived rabies, only consensus sequence data have been available to date and information concerning their population diversity was thus lacking. In our study, we used high-throughput sequencing to analyze 11 cases of vaccine-derived rabies, and compared their viral population diversity to the related oral rabies vaccines using pairwise Manhattan distances. This extensive deep sequencing analysis of vaccine-derived rabies cases observed during oral vaccination programs provided deeper insights into the effect of accidental in vivo replication of genetically diverse vaccine strains in the central nervous system of target and non-target species under field conditions. The viral population in vaccine-derived cases appeared to be clonal in contrast to their parental vaccines. The change from a state of high population diversity present in the vaccine batches to a clonal genotype in the affected animal may indicate the presence of a strong bottleneck during infection. In conclusion, it is very likely that these few cases are the consequence of host factors and not the result of the selection of a more virulent genotype. Furthermore, this type of vaccine-derived rabies leads to the selection of clonal genotypes and the selected variants were genetically very similar to potent SAD vaccines that have undergone a history of in vitro selection.
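The pairwise Manhattan distance mentioned above can be sketched as follows, assuming each viral population is summarised as a vector of variant frequencies at the same set of genome positions. That representation, the sample names, and the frequencies are hypothetical and chosen only for illustration; the abstract does not specify how the populations were encoded.

```python
from itertools import combinations


def manhattan(p, q):
    """Sum of absolute differences between two equal-length frequency vectors."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same positions")
    return sum(abs(a - b) for a, b in zip(p, q))


def pairwise_manhattan(profiles):
    """Map each unordered pair of sample names to its Manhattan distance."""
    return {
        (a, b): manhattan(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Hypothetical variant frequencies at three positions.
    profiles = {
        "vaccine_batch": [0.5, 0.25, 0.0],  # mixed population
        "field_case": [1.0, 0.0, 0.0],      # clonal population
    }
    for pair, dist in pairwise_manhattan(profiles).items():
        print(pair, dist)
```

A clonal sample has frequencies near 0 or 1 at every position, so its distance to a mixed vaccine batch grows with the number of positions at which the batch carries intermediate-frequency variants.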
Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai
2016-10-21
An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subject to population genetic analysis (FST) and neutrality tests (Tajima's D and Fu and Li's D* and F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
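For readers unfamiliar with the two summary statistics reported above, the sketch below computes haplotype diversity and nucleotide diversity from a set of aligned sequences using the commonly used estimators attributed to Nei: H = n/(n-1) * (1 - sum of squared haplotype frequencies), and pi as the mean number of pairwise differences per site. Which estimators and software the authors used is not stated in the abstract, so this is an assumption, and the toy sequences are invented.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / len(pairs) / length


if __name__ == "__main__":
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]  # invented alignment
    print(round(haplotype_diversity(toy), 3))
    print(round(nucleotide_diversity(toy), 3))
```

This sketch ignores alignment gaps and ambiguous bases, which population genetics packages treat explicitly.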
Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.
2012-01-01
High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. 
In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309
Kamneva, Olga K; Syring, John; Liston, Aaron; Rosenberg, Noah A
2017-08-04
Hybridization is observed in many eukaryotic lineages and can lead to the formation of polyploid species. The study of hybridization and polyploidization faces challenges both in data generation and in accounting for population-level phenomena such as coalescence processes in phylogenetic analysis. Genus Fragaria is one example of a set of plant taxa in which a range of ploidy levels is observed across species, but phylogenetic origins are unknown. Here, using 20 diploid and polyploid Fragaria species, we combine approaches from NGS data analysis and phylogenetics to infer evolutionary origins of polyploid strawberries, taking into account coalescence processes. We generate haplotype sequences for 257 low-copy nuclear markers assembled from Illumina target capture sequence data. We then identify putative hybridization events by analyzing gene tree topologies, and further test predicted hybridizations in a coalescence framework. This approach confirms the allopolyploid ancestry of F. chiloensis and F. virginiana, and provides new allopolyploid ancestry hypotheses for F. iturupensis, F. moschata, and F. orientalis. Evidence of gene flow between diploids F. bucharica and F. vesca is also detected, suggesting that it might be appropriate to consider these groups as conspecifics. This study is one of the first in which target capture sequencing followed by computational deconvolution of individual haplotypes is used for tracing origins of polyploid taxa. The study also provides new perspectives on the evolutionary history of Fragaria.
Qiu, Biyuan; Ma, Tao; Peng, Chunyan; Zheng, Xiaoqin; Yang, Jiyun
2018-04-01
The diagnosis of oculocutaneous albinism (OCA) is established using clinical signs and symptoms. OCA is, however, a highly genetically heterogeneous disease with mutations identified in at least nineteen unique genes, many of which produce overlapping phenotypic traits. Thus, differentiating genetic OCA subtypes for diagnosis and genetic counseling based on clinical presentation alone is challenging and would benefit from a comprehensive molecular diagnostic. We aimed to develop and validate a more comprehensive, targeted, next-generation-sequencing-based diagnostic for the identification of OCA-causing variants. The genomic DNA samples from 28 OCA probands were analyzed by targeted next-generation sequencing (NGS), and the candidate variants were confirmed through Sanger sequencing. We observed mutations in the TYR, OCA2, and SLC45A2 genes in 25/28 (89%) patients with OCA. We identified 38 pathogenic variants among these three genes, including 5 novel variants: c.1970G>T (p.Gly657Val), c.1669A>C (p.Thr557Pro), c.2339-2A>C, and c.1349C>G (p.Thr450Arg) in OCA2, and c.459_470delTTTTGCTGCCGA (p.Ala155_Phe158del) in SLC45A2. Our findings expand the mutational spectrum of OCA in the Chinese population, and the assay we developed should be broadly useful as a molecular diagnostic and as an aid for genetic counseling for OCA patients.
Scala, Giovanni; Affinito, Ornella; Palumbo, Domenico; Florio, Ermanno; Monticelli, Antonella; Miele, Gennaro; Chiariotti, Lorenzo; Cocozza, Sergio
2016-11-25
CpG sites in an individual molecule may exist in a binary state (methylated or unmethylated) and each individual DNA molecule, containing a certain number of CpGs, is a combination of these states defining an epihaplotype. Classic quantification-based approaches to study DNA methylation are intrinsically unable to fully represent the complexity of the underlying methylation substrate. Epihaplotype-based approaches, on the other hand, allow methylation profiles of cell populations to be studied at the single molecule level. For such investigations, next-generation sequencing techniques can be used, both for quantitative and for epihaplotype analysis. Currently available tools for methylation analysis lack output formats that explicitly report CpG methylation profiles at the single molecule level and lack suitable statistical tools for their interpretation. Here we present ampliMethProfiler, a Python-based pipeline for the extraction and statistical epihaplotype analysis of amplicons from targeted deep bisulfite sequencing of multiple DNA regions. The ampliMethProfiler tool provides an easy and user-friendly way to extract and analyze the epihaplotype composition of reads from targeted bisulfite sequencing experiments. ampliMethProfiler is written in Python and requires a local installation of BLAST and (optionally) QIIME tools. It can be run on Linux and OS X platforms. The software is open source and freely available at http://amplimethprofiler.sourceforge.net .
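The distinction drawn above between per-site quantification and epihaplotype analysis can be illustrated with a small Python sketch. It does not reproduce ampliMethProfiler's interface; the function names, the encoding of each read as a string of "1" (methylated) and "0" (unmethylated) CpGs, and the toy populations are all assumptions made for illustration.

```python
from collections import Counter

def per_site_methylation(profiles):
    """Fraction of reads methylated at each CpG position (classic quantification)."""
    n_reads = len(profiles)
    n_sites = len(profiles[0])
    return [sum(p[i] == "1" for p in profiles) / n_reads for i in range(n_sites)]

def epihaplotype_counts(profiles):
    """Number of reads carrying each combination of CpG states (epihaplotype)."""
    return Counter(profiles)

# Two hypothetical read populations over an amplicon with two CpG sites.
pop_a = ["11", "00", "11", "00"]   # fully methylated or fully unmethylated molecules
pop_b = ["10", "01", "10", "01"]   # every molecule half methylated

# Per-site averages are the same for both populations ...
print(per_site_methylation(pop_a), per_site_methylation(pop_b))
# ... but the single-molecule composition is different.
print(epihaplotype_counts(pop_a))
print(epihaplotype_counts(pop_b))
```

In this toy case both populations show 50% methylation at each site, so an average-based readout could not separate them, whereas the epihaplotype counts can.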
Botti, Sara; Giuffra, Elisabetta
2010-08-23
DNA barcodes are a global standard for species identification and have countless applications in the medical, forensic and alimentary fields, but few barcoding methods work efficiently in samples in which DNA is degraded, e.g. foods and archival specimens. This limits the choice of target regions harbouring a sufficient number of diagnostic polymorphisms. The method described here uses existing PCR and sequencing methodologies to detect mitochondrial DNA polymorphisms in complex matrices such as foods. The reported application allowed the discrimination among 17 fish species of the Scombridae family with high commercial interest such as mackerels, bonitos and tunas which are often present in processed seafood. The approach can be easily upgraded with the release of new genetic diversity information to increase the range of detected species. Cocktails of primers are designed for PCR using publicly available sequences of the target sequence. They are composed of a fixed 5' region and of variable 3' cocktail portions that allow amplification of any member of a group of species of interest. The population of short amplicons is directly sequenced and indexed using primers containing a longer 5' region and the non-polymorphic part of the cocktail portion. A 226 bp region of CytB was selected as target after collection and screening of 148 online sequences; 85 SNPs were found, of which 75 were present in at least two sequences. Primers were also designed for two shorter sub-fragments that could be amplified from highly degraded samples. The test was used on 103 samples of seafood (canned tuna and scomber, tuna salad, tuna sauce) and could successfully detect the presence of different or additional species that were not identified on the labelling of canned tuna, tuna salad and sauce samples. The described method is largely independent of the degree of degradation of the DNA source and can thus be applied to processed seafood.
Moreover, the method is highly flexible: publicly available sequence information on mitochondrial genomes is rapidly increasing for most species, facilitating the choice of target sequences and the improvement of the resolution of the test. This is particularly important for discrimination of marine and aquaculture species for which genome information is still limited.
Molecular profiling of childhood cancer: Biomarkers and novel therapies.
Saletta, Federica; Wadham, Carol; Ziegler, David S; Marshall, Glenn M; Haber, Michelle; McCowage, Geoffrey; Norris, Murray D; Byrne, Jennifer A
2014-06-01
Technological advances including high-throughput sequencing have identified numerous tumor-specific genetic changes in pediatric and adolescent cancers that can be exploited as targets for novel therapies. This review provides a detailed overview of recent advances in the application of target-specific therapies for childhood cancers, either as single agents or in combination with other therapies. The review summarizes preclinical evidence on which clinical trials are based, early phase clinical trial results, and the incorporation of predictive biomarkers into clinical practice, according to cancer type. There is growing evidence that molecularly targeted therapies can valuably add to the arsenal available for treating childhood cancers, particularly when used in combination with other therapies. Nonetheless the introduction of molecularly targeted agents into practice remains challenging, due to the use of unselected populations in some clinical trials, inadequate methods to evaluate efficacy, and the need for improved preclinical models to both evaluate dosing and safety of combination therapies. The increasing recognition of the heterogeneity of molecular causes of cancer favors the continued development of molecularly targeted agents, and their transfer to pediatric and adolescent populations.
A map of human genome variation from population-scale sequencing.
Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A
2010-10-28
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Wada, K; Wada, Y; Iwasaki, Y; Ikemura, T
2017-10-01
Oligonucleotides are key elements of nucleic acid therapeutics such as small interfering RNAs (siRNAs). Influenza and Ebolaviruses are zoonotic RNA viruses mutating very rapidly, and their sequence changes must be characterized intensively to design therapeutic oligonucleotides with long utility. Focusing on a total of 182 experimentally validated siRNAs for influenza A, B and Ebolaviruses compiled by the siRNA database, we conducted time-series analyses of occurrences of siRNA targets in these viral genomes. Reflecting their high mutation rates, occurrences of target oligonucleotides evidently fluctuate in viral populations and often disappear. Time-series analysis of the one-base changed sequences derived from each original target identified the oligonucleotide that shows a compensatory increase and will potentially become the 'awaiting-type oligonucleotide'; the combined use of this oligonucleotide with the original can provide therapeutics with long utility. This strategy is also useful for assigning diagnostic reverse transcription-PCR primers with long utility.
Wada, K; Wada, Y; Iwasaki, Y; Ikemura, T
2017-01-01
Oligonucleotides are key elements of nucleic acid therapeutics such as small interfering RNAs (siRNAs). Influenza and Ebolaviruses are zoonotic RNA viruses mutating very rapidly, and their sequence changes must be characterized intensively to design therapeutic oligonucleotides with long utility. Focusing on a total of 182 experimentally validated siRNAs for influenza A, B and Ebolaviruses compiled by the siRNA database, we conducted time-series analyses of occurrences of siRNA targets in these viral genomes. Reflecting their high mutation rates, occurrences of target oligonucleotides evidently fluctuate in viral populations and often disappear. Time-series analysis of the one-base changed sequences derived from each original target identified the oligonucleotide that shows a compensatory increase and will potentially become the ‘awaiting-type oligonucleotide’; the combined use of this oligonucleotide with the original can provide therapeutics with long utility. This strategy is also useful for assigning diagnostic reverse transcription-PCR primers with long utility. PMID:28905886
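The time-series idea described above (tracking a target oligonucleotide and its one-base changed derivatives across viral genomes sampled over time) might be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, genomes are assumed to be written in the DNA alphabet ACGT as in most sequence databases, and the toy sequences and years are invented rather than taken from the study.

```python
def one_base_variants(target, alphabet="ACGT"):
    """All sequences that differ from the target at exactly one position."""
    return [
        target[:i] + base + target[i + 1:]
        for i in range(len(target))
        for base in alphabet
        if base != target[i]
    ]

def yearly_occurrence(oligo, genomes_by_year):
    """Fraction of genomes per year that contain the oligonucleotide."""
    return {
        year: sum(oligo in genome for genome in genomes) / len(genomes)
        for year, genomes in sorted(genomes_by_year.items())
    }

# Hypothetical toy data: a 3-mer stands in for a ~20-nt siRNA target.
genomes_by_year = {
    2009: ["AAACGTT", "AACGT"],
    2010: ["AATGTT", "AACGA"],
}
target = "ACG"
print(yearly_occurrence(target, genomes_by_year))   # original target over time
for variant in one_base_variants(target):
    freqs = yearly_occurrence(variant, genomes_by_year)
    if any(freqs.values()):
        print(variant, freqs)                        # candidate replacement oligos
```

A variant whose frequency rises as the original target's frequency falls would be the kind of candidate the abstract calls an 'awaiting-type oligonucleotide'; deciding what counts as a compensatory increase is left open here.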
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.
Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi
2017-07-01
PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
The application of the high throughput sequencing technology in the transposable elements.
Liu, Zhen; Xu, Jian-hong
2015-09-01
High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites
Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko
2015-01-01
Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
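The two target requirements stated above (an adjacent NGG PAM and specificity within the genome) can be illustrated with a short Python sketch. It is not CRISPRdirect's implementation: the function names are hypothetical, the PAM is assumed to lie immediately 3' of a 20-nt target (the usual convention for the NGG motif), and specificity is approximated by counting exact matches only, which is far simpler than an exhaustive off-target search.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_targets(seq, length=20):
    """List (strand, start, target, pam) for each `length`-mer followed by NGG.

    For the '-' strand, `start` is a coordinate on the reverse complement.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits

def count_exact_sites(target, genome):
    """Count exact target+NGG occurrences on both strands of `genome`."""
    genome = genome.upper()
    pattern = re.compile(r"(?=%s[ACGT]GG)" % re.escape(target))
    return sum(
        len(pattern.findall(s)) for s in (genome, reverse_complement(genome))
    )

# Hypothetical input: one 20-nt target followed by a TGG PAM.
example = "AAAAA" + "ACGT" * 5 + "TGG" + "AAAA"
for hit in find_targets(example):
    print(hit)
print(count_exact_sites("ACGT" * 5, example))
```

A candidate whose exact-match count in the genome is 1 passes this simplified check; sites with mismatches would need a dedicated search.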
Gupta, Sonal; Nawaz, Kashif; Parween, Sabiha; Roy, Riti; Sahu, Kamlesh; Kumar Pole, Anil; Khandal, Hitaishi; Srivastava, Rishi; Kumar Parida, Swarup; Chattopadhyay, Debasis
2017-02-01
Cicer reticulatum L. is the wild progenitor of the fourth most important legume crop chickpea (C. arietinum L.). We assembled short-read sequences into 416 Mb draft genome of C. reticulatum and anchored 78% (327 Mb) of this assembly to eight linkage groups. Genome annotation predicted 25,680 protein-coding genes covering more than 90% of predicted gene space. The genome assembly shared a substantial synteny and conservation of gene orders with the genome of the model legume Medicago truncatula. Resistance gene homologs of wild and domesticated chickpeas showed high sequence homology and conserved synteny. Comparison of gene sequences and nucleotide diversity using 66 wild and domesticated chickpea accessions suggested that the desi type chickpea was genetically closer to the wild species than the kabuli type. Comparative analyses predicted gene flow between the wild and the cultivated species during domestication. Molecular diversity and population genetic structure determination using 15,096 genome-wide single nucleotide polymorphisms revealed an admixed domestication pattern among cultivated (desi and kabuli) and wild chickpea accessions belonging to three population groups reflecting significant influence of parentage or geographical origin for their cultivar-specific population classification. The assembly and the polymorphic sequence resources presented here would facilitate the study of chickpea domestication and targeted use of wild Cicer germplasms for agronomic trait improvement in chickpea. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Donaldson, Michael E; Rico, Yessica; Hueffer, Karsten; Rando, Halie M; Kukekova, Anna V; Kyle, Christopher J
2018-01-01
Pathogens are recognized as major drivers of local adaptation in wildlife systems. By determining which gene variants are favored in local interactions among populations with and without disease, spatially explicit adaptive responses to pathogens can be elucidated. Much of our current understanding of host responses to disease comes from a small number of genes associated with an immune response. High-throughput sequencing (HTS) technologies, such as genotype-by-sequencing (GBS), facilitate expanded explorations of genomic variation among populations. Hybridization-based GBS techniques can be leveraged in systems not well characterized for specific variants associated with disease outcome to "capture" specific genes and regulatory regions known to influence expression and disease outcome. We developed a multiplexed, sequence capture assay for red foxes to simultaneously assess ~300-kbp of genomic sequence from 116 adaptive, intrinsic, and innate immunity genes of predicted adaptive significance and their putative upstream regulatory regions along with 23 neutral microsatellite regions to control for demographic effects. The assay was applied to 45 fox DNA samples from Alaska, where three arctic rabies strains are geographically restricted and endemic to coastal tundra regions, yet absent from the boreal interior. The assay provided 61.5% on-target enrichment with relatively even sequence coverage across all targeted loci and samples (mean = 50×), which allowed us to elucidate genetic variation across introns, exons, and potential regulatory regions (4,819 SNPs). Challenges remained in accurately describing microsatellite variation using this technique; however, longer-read HTS technologies should overcome these issues. We used these data to conduct preliminary analyses and detected genetic structure in a subset of red fox immune-related genes between regions with and without endemic arctic rabies. 
This assay provides a template to assess immunogenetic variation in wildlife disease systems.
Method to amplify variable sequences without imposing primer sequences
Bradbury, Andrew M.; Zeytun, Ahmet
2006-11-14
The present invention provides methods of amplifying target sequences without including regions flanking the target sequence in the amplified product or imposing amplification primer sequences on the amplified product. Also provided are methods of preparing a library from such amplified target sequences.
Kumar, Dhananjay; Dutta, Summi; Singh, Dharmendra; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal
2017-01-01
Deep sequencing identified 497 conserved and 559 novel miRNAs in wheat, while degradome analysis revealed 701 target genes. qRT-PCR demonstrated differential expression of miRNAs during stages of leaf rust progression. Bread wheat (Triticum aestivum L.) is an important cereal food crop feeding 30% of the world population. A major threat to wheat production is rust epidemics. This study aimed to identify and functionally characterize micro(mi)RNAs and their target genes in wheat in response to leaf rust infection. High-throughput sequencing was used for transcriptome-wide identification of miRNAs and their expression profiling in response to leaf rust using mock and pathogen-inoculated resistant and susceptible near-isogenic wheat plants. A total of 1056 mature miRNAs were identified, of which 497 miRNAs were conserved and 559 miRNAs were novel. The pathogen-inoculated resistant plants manifested more miRNAs compared with the pathogen-infected susceptible plants. The miRNA counts increased in the susceptible isoline due to leaf rust; conversely, the counts decreased in the resistant isoline in response to pathogenesis, illustrating precise spatial tuning of miRNAs during compatible and incompatible interaction. Stem-loop quantitative real-time PCR was used to profile 10 highly differentially expressed miRNAs obtained from high-throughput sequencing data. The spatio-temporal profiling validated the differential expression of miRNAs between the isolines as well as in response to pathogen infection. Degradome analysis provided 701 predicted target genes associated with defense response, signal transduction, development, metabolism, and transcriptional regulation. The obtained results indicate that wheat isolines employ diverse arrays of miRNAs that modulate their target genes during compatible and incompatible interaction.
Our findings increase knowledge of the roles of microRNAs in wheat-leaf rust interactions and could help in rust resistance breeding programs.
The Genome Sequence of a Widespread Apex Predator, the Golden Eagle (Aquila chrysaetos)
Doyle, Jacqueline M.; Katzner, Todd E.; Bloom, Peter H.; Ji, Yanzhu; Wijayawardena, Bhagya K.; DeWoody, J. Andrew
2014-01-01
Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male golden eagle (Aquila chrysaetos) captured in western North America. We constructed genomic libraries that were sequenced using Illumina technology and assembled the high-quality data to a depth of ∼40x coverage. The genome assembly includes 2,552 scaffolds >10 Kb and 415 scaffolds >1.2 Mb. We annotated 16,571 genes that are involved in myriad biological processes, including such disparate traits as beak formation and color vision. We also identified repetitive regions spanning 92 Mb (∼6% of the assembly), including LINES, SINES, LTR-RTs and DNA transposons. The mitochondrial genome encompasses 17,332 bp and is ∼91% identical to the Mountain Hawk-Eagle (Nisaetus nipalensis). Finally, the data reveal that several anonymous microsatellites commonly used for population studies are embedded within protein-coding genes and thus may not have evolved in a neutral fashion. Because the genome sequence includes ∼800,000 novel polymorphisms, markers can now be chosen based on their proximity to functional genes involved in migration, carnivory, and other biological processes. PMID:24759626
Carmi, Shai; Hui, Ken Y.; Kochav, Ethan; Liu, Xinmin; Xue, James; Grady, Fillan; Guha, Saurav; Upadhyay, Kinnari; Ben-Avraham, Dan; Mukherjee, Semanti; Bowen, B. Monica; Thomas, Tinu; Vijai, Joseph; Cruts, Marc; Froyen, Guy; Lambrechts, Diether; Plaisance, Stéphane; Van Broeckhoven, Christine; Van Damme, Philip; Van Marck, Herwig; Barzilai, Nir; Darvasi, Ariel; Offit, Kenneth; Bressman, Susan; Ozelius, Laurie J.; Peter, Inga; Cho, Judy H.; Ostrer, Harry; Atzmon, Gil; Clark, Lorraine N.; Lencz, Todd; Pe’er, Itsik
2014-01-01
The Ashkenazi Jewish (AJ) population is a genetic isolate close to European and Middle Eastern groups, with genetic diversity patterns conducive to disease mapping. Here we report high-depth sequencing of 128 complete genomes of AJ controls. Compared with European samples, our AJ panel has 47% more novel variants per genome and is eightfold more effective at filtering benign variants out of AJ clinical genomes. Our panel improves imputation accuracy for AJ SNP arrays by 28%, and covers at least one haplotype in ≈67% of any AJ genome with long, identical-by-descent segments. Reconstruction of recent AJ history from such segments confirms a recent bottleneck of merely ≈350 individuals. Modelling of ancient histories for AJ and European populations using their joint allele frequency spectrum determines AJ to be an even admixture of European and likely Middle Eastern origins. We date the split between the two ancestral populations to ≈12–25 Kyr, suggesting a predominantly Near Eastern source for the repopulation of Europe after the Last Glacial Maximum. PMID:25203624
Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar
2016-04-01
Maturity-onset diabetes of the young (MODY) is a genetically and clinically heterogeneous group of diseases and is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next-generation sequencing. A panel of 11 MODY genes was screened in 43 children with MODY diagnosed by clinical criteria. Index cases were studied on the Illumina MiSeq platform, and family screening and confirmation of mutations were done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygous mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. The very high frequency of novel mutations (42%) in our study population supports the view that, in heterogeneous disorders like MODY, sequence analysis provides rapid, cost-effective and accurate genetic diagnosis.
Jupe, Florian; Witek, Kamil; Verweij, Walter; Śliwka, Jadwiga; Pritchard, Leighton; Etherington, Graham J; Maclean, Dan; Cock, Peter J; Leggett, Richard M; Bryan, Glenn J; Cardle, Linda; Hein, Ingo; Jones, Jonathan DG
2013-01-01
Summary: RenSeq is a NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ∼80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum ‘Heinz 1706’ extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines. PMID:23937694
Takacs-Vesbach, Cristina; Inskeep, William P.; Jay, Zackary J.; Herrgard, Markus J.; Rusch, Douglas B.; Tringe, Susannah G.; Kozubal, Mark A.; Hamamura, Natsuko; Macur, Richard E.; Fouke, Bruce W.; Reysenbach, Anna-Louise; McDermott, Timothy R.; Jennings, Ryan deM.; Hengartner, Nicolas W.; Xie, Gary
2013-01-01
The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal “filamentous streamer” communities (∼40 Mbp per site), which targeted three different groups of Aquificales found in Yellowstone National Park (YNP). Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae) populations, whereas the circum-neutral pH (6.5–7.8) sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae). Thermocrinis (Aquificaceae) populations were found primarily in the circum-neutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse-TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl). The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl-CoA synthetase (Ccs), and citryl-CoA lyase (Ccl). All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I) involved in oxygen reduction. The distribution of Aquificales populations and differences among functional genes involved in energy generation and electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2) have resulted in niche specialization among members of the Aquificales. PMID:23755042
Novel genetic tools for studying food-borne Salmonella.
Andrews-Polymenis, Helene L; Santiviago, Carlos A; McClelland, Michael
2009-04-01
Nontyphoidal Salmonellae are highly prevalent food-borne pathogens. High-throughput sequencing of Salmonella genomes is expanding our knowledge of the evolution of serovars and epidemic isolates. Genome sequences have also allowed the creation of complete microarrays. Microarrays have improved the throughput of in vivo expression technology (IVET) used to uncover promoters active during infection. In another method, signature tagged mutagenesis (STM), pools of mutants are subjected to selection. Changes in the population are monitored on a microarray, revealing genes under selection. Complete genome sequences permit the construction of pools of targeted in-frame deletions that have improved STM by minimizing the number of clones and the polarity of each mutant. Together, genome sequences and the continuing development of new tools for functional genomics will drive a revolution in the understanding of Salmonellae in many different niches that are critical for food safety.
Dan, Tong; Liu, Wenjun; Sun, Zhihong; Lv, Qiang; Xu, Haiyan; Song, Yuqin; Zhang, Heping
2014-06-09
Economically, Leuconostoc lactis is one of the most important species in the genus Leuconostoc. It plays an important role in the food industry including the production of dextrans and bacteriocins. Currently, traditional molecular typing approaches for characterisation of this species at the isolate level are either unavailable or are not sufficiently reliable for practical use. Multilocus sequence typing (MLST) is a robust and reliable method for characterising bacterial and fungal species at the molecular level. In this study, a novel MLST protocol was developed for 50 L. lactis isolates from Mongolia and China. Sequences from eight targeted genes (groEL, carB, recA, pheS, murC, pyrG, rpoB and uvrC) were obtained. Sequence analysis indicated 20 different sequence types (STs), with 13 of them being represented by a single isolate. Phylogenetic analysis based on the sequences of eight MLST loci indicated that the isolates belonged to two major groups, A (34 isolates) and B (16 isolates). Linkage disequilibrium analyses indicated that recombination occurred at a low frequency in L. lactis, indicating a clonal population structure. Split-decomposition analysis indicated that intraspecies recombination played a role in generating genotypic diversity amongst isolates. Our results indicated that MLST is a valuable tool for typing L. lactis isolates that can be used for further monitoring of evolutionary changes and population genetics.
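The bookkeeping behind sequence types in the MLST scheme described above can be sketched in Python. This is a minimal illustration, not the authors' protocol: alleles and sequence types are numbered in order of first appearance (a real scheme would use a curated allele database), and the function name, isolate names and short sequences are all hypothetical.

```python
from collections import Counter

# The eight loci named in the abstract.
LOCI = ("groEL", "carB", "recA", "pheS", "murC", "pyrG", "rpoB", "uvrC")

def assign_sequence_types(isolates):
    """Map each isolate to a sequence type (ST) from its eight-locus allele profile.

    `isolates` maps isolate name -> {locus: allele sequence}.
    """
    allele_ids = {locus: {} for locus in LOCI}   # per locus: sequence -> allele number
    st_ids = {}                                   # allele profile -> ST number
    result = {}
    for name, seqs in isolates.items():
        profile = tuple(
            allele_ids[locus].setdefault(seqs[locus], len(allele_ids[locus]) + 1)
            for locus in LOCI
        )
        result[name] = st_ids.setdefault(profile, len(st_ids) + 1)
    return result

# Hypothetical toy isolates; real alleles would be several hundred bp long.
base = {locus: "ACGT" for locus in LOCI}
isolates = {
    "iso1": dict(base),
    "iso2": dict(base),
    "iso3": dict(base, recA="ACGA"),   # one substitution in recA gives a new ST
}
sts = assign_sequence_types(isolates)
print(sts)

# STs represented by a single isolate, as counted in the abstract.
singletons = [st for st, n in Counter(sts.values()).items() if n == 1]
print(singletons)
```

Two isolates share an ST only when all eight alleles match, which is why a single substitution in one locus is enough to define a new type.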
Error catastrophe and phase transition in the empirical fitness landscape of HIV
NASA Astrophysics Data System (ADS)
Hart, Gregory R.; Ferguson, Andrew L.
2015-03-01
We have translated clinical sequence databases of the p6 HIV protein into an empirical fitness landscape quantifying viral replicative capacity as a function of the amino acid sequence. We show that the viral population resides close to a phase transition in sequence space corresponding to an "error catastrophe" beyond which there is lethal accumulation of mutations. Our model predicts that the phase transition may be induced by drug therapies that elevate the mutation rate, or by forcing mutations at particular amino acids. Applying immune pressure to any combination of killer T-cell targets cannot induce the transition, providing a rationale for why the viral protein can exist close to the error catastrophe without sustaining fatal fitness penalties due to adaptive immunity.
Kit for detecting nucleic acid sequences using competitive hybridization probes
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
2001-01-01
A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.
Vicente-Dólera, Nelly; Troadec, Christelle; Moya, Manuel; del Río-Celestino, Mercedes; Pomares-Viciana, Teresa; Bendahmane, Abdelhafid; Picó, Belén; Román, Belén; Gómez, Pedro
2014-01-01
Although the availability of genetic and genomic resources for Cucurbita pepo has increased significantly, functional genomic resources are still limited for this crop. To address this, we have developed a high-throughput reverse genetic tool: the first TILLING (Targeting Induced Local Lesions IN Genomes) resource for this species. Additionally, we have used this resource to demonstrate that the EMS mutant population we previously developed has the highest mutation density among cucurbit mutant populations. The overall mutation density in this first C. pepo TILLING platform was estimated to be 1/133 kb by screening five additional genes. In total, 58 mutations confirmed by sequencing were identified in the five targeted genes, thirteen of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was studied in a peroxidase gene, revealing that seedlings homozygous for one of the isolated mutant alleles were albino. These results indicate that the TILLING approach in this species was successful at providing new mutations and can address the major challenge of linking sequence information to biological function, as well as the identification of novel variation for crop breeding. PMID:25386735
Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min
2015-06-01
The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies. © 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devarakonda, M.S.
1988-01-01
Control over population dynamics and organism selection in a biological waste treatment system provides an effective means of engineering process efficiency. Examples of applications of organism selection include control of filamentous organisms, biological nutrient removal, industrial waste treatment requiring the removal of specific substrates, and hazardous waste treatment. Inherently, full scale biological waste treatment systems are unsteady state systems due to the variations in the waste streams and mass flow rates of the substrates. Some systems, however, have the capacity to impose controlled selective pressures on the biological population by means of their operation. An example of such a system is the Sequencing Batch Reactor (SBR), which was the experimental system utilized in this research work. The concepts of organism selection were studied in detail for the biodegradation of a herbicide waste stream, with glyphosate as the target compound. The SBR provided a reactor configuration capable of exerting the necessary selective pressures to select and enrich for a glyphosate degrading population. Based on results for bench scale SBRs, a hypothesis was developed to explain population dynamics in glyphosate degrading systems.
Combined hairpin-antisense compositions and methods for modulating expression
Shanklin, John; Nguyen, Tam
2014-08-05
A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Combined hairpin-antisense compositions and methods for modulating expression
Shanklin, John; Nguyen, Tam Huu
2015-11-24
A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.
Brownstein, Zippora; Abu-Rayyan, Amal; Karfunkel-Doron, Daphne; Sirigu, Serena; Davidov, Bella; Shohat, Mordechai; Frydman, Moshe; Houdusse, Anne; Kanaan, Moien; Avraham, Karen B
2014-01-01
Hereditary hearing loss is genetically heterogeneous, with a large number of genes and mutations contributing to this sensory, often monogenic, disease. This number, as well as large size, precludes comprehensive genetic diagnosis of all known deafness genes. A combination of targeted genomic capture and massively parallel sequencing (MPS), also referred to as next-generation sequencing, was applied to determine the deafness-causing genes in hearing-impaired individuals from Israeli Jewish and Palestinian Arab families. Among the mutations detected, we identified nine novel mutations in the genes encoding myosin VI, myosin VIIA and myosin XVA, doubling the number of myosin mutations in the Middle East. Myosin VI mutations were identified in this population for the first time. Modeling of the mutations provided predicted mechanisms for the damage they inflict in the molecular motors, leading to impaired function and thus deafness. The myosin mutations span all regions of these molecular motors, leading to a wide range of hearing phenotypes, reinforcing the key role of this family of proteins in auditory function. This study demonstrates that multiple mutations responsible for hearing loss can be identified in a relatively straightforward manner by targeted-gene MPS technology and concludes that this is the optimal genetic diagnostic approach for identification of mutations responsible for hearing loss. PMID:24105371
Genomic Rearrangements in Arabidopsis Considered as Quantitative Traits.
Imprialou, Martha; Kahles, André; Steffen, Joshua G; Osborne, Edward J; Gan, Xiangchao; Lempe, Janne; Bhomra, Amarjit; Belfield, Eric; Visscher, Anne; Greenhalgh, Robert; Harberd, Nicholas P; Goram, Richard; Hein, Jotun; Robert-Seilaniantz, Alexandre; Jones, Jonathan; Stegle, Oliver; Kover, Paula; Tsiantis, Miltos; Nordborg, Magnus; Rätsch, Gunnar; Clark, Richard M; Mott, Richard
2017-04-01
To understand the population genetics of structural variants and their effects on phenotypes, we developed an approach to mapping structural variants that segregate in a population sequenced at low coverage. We avoid calling structural variants directly. Instead, the evidence for a potential structural variant at a locus is indicated by variation in the counts of short reads that map anomalously to that locus. These structural variant traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. Association between a structural variant trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3×) population sequence data from 488 recombinant inbred Arabidopsis thaliana genomes, we identified 6502 segregating structural variants. Remarkably, 25% of these were transpositions. While many structural variants cannot be delineated precisely, we validated 83% of 44 predicted transposition breakpoints by polymerase chain reaction. We show that specific structural variants may be causative for quantitative trait loci for germination and resistance to infection by the fungus Albugo laibachii, isolate Nc14. Further, we show that the phenotypic heritability attributable to read-mapping anomalies differs from, and, in the case of time to germination and bolting, exceeds that due to standard genetic variation. Genes within structural variants are also more likely to be silenced or dysregulated. This approach complements the prevalent strategy of structural variant discovery in fewer individuals sequenced at high coverage. It is generally applicable to large populations sequenced at low coverage, and is particularly suited to mapping transpositions. Copyright © 2017 by the Genetics Society of America.
Wiersma, Andrew T; Gaines, Todd A; Preston, Christopher; Hamilton, John P; Giacomini, Darci; Robin Buell, C; Leach, Jan E; Westra, Philip
2015-02-01
Field-evolved resistance to the herbicide glyphosate is due to amplification of one of two EPSPS alleles, increasing transcription and protein with no splice variants or effects on other pathway genes. The widely used herbicide glyphosate inhibits the shikimate pathway enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS). Globally, the intensive use of glyphosate for weed control has selected for glyphosate resistance in 31 weed species. Populations of suspected glyphosate-resistant Kochia scoparia were collected from fields located in the US central Great Plains. Glyphosate dose response verified glyphosate resistance in nine populations. The mechanism of resistance to glyphosate was investigated using targeted sequencing, quantitative PCR, immunoblotting, and whole transcriptome de novo sequencing to characterize the sequence and expression of EPSPS. Sequence analysis showed no mutation of the EPSPS Pro106 codon in glyphosate-resistant K. scoparia, whereas EPSPS genomic copy number and transcript abundance were elevated three- to ten-fold in resistant individuals relative to susceptible individuals. Glyphosate-resistant individuals with increased relative EPSPS copy numbers had consistently lower shikimate accumulation in leaf disks treated with 100 μM glyphosate and EPSPS protein levels were higher in glyphosate-resistant individuals with increased gene copy number compared to glyphosate-susceptible individuals. RNA sequence analysis revealed seven nucleotide positions with two different expressed alleles in glyphosate-susceptible reads. However, one nucleotide at the seven positions was predominant in glyphosate-resistant sequences, suggesting that only one of two EPSPS alleles was amplified in glyphosate-resistant individuals. No alternatively spliced EPSPS transcripts were detected. Expression of five other genes in the chorismate pathway was unaffected in glyphosate-resistant individuals with increased EPSPS expression. 
These results indicate increased EPSPS expression is a mechanism for glyphosate resistance in these K. scoparia populations.
Adamiak, Paul; Vanderkooi, Otto G; Kellner, James D; Schryvers, Anthony B; Bettinger, Julie A; Alcantara, Joenel
2014-06-03
Multi-locus sequence typing (MLST) is a portable, broadly applicable method for classifying bacterial isolates at an intra-species level. This methodology provides clinical and scientific investigators with a standardized means of monitoring evolution within bacterial populations. MLST uses the DNA sequences from a set of genes such that each unique combination of sequences defines an isolate's sequence type. In order to reliably determine the sequence of a typing gene, matching sequence reads for both strands of the gene must be obtained. This study assesses the ability of both the standard, and an alternative set of, Streptococcus pneumoniae MLST primers to completely sequence, in both directions, the required typing alleles. The results demonstrated that for five (aroE, recP, spi, xpt, ddl) of the seven S. pneumoniae typing alleles, the standard primers were unable to obtain the complete forward and reverse sequences. This is due to the standard primers annealing too closely to the target regions, and current sequencing technology failing to sequence the bases that are too close to the primer. The alternative primer set described here, which includes a combination of primers proposed by the CDC and several designed as part of this study, addresses this limitation by annealing to highly conserved segments further from the target region. This primer set was subsequently employed to sequence type 105 S. pneumoniae isolates collected by the Canadian Immunization Monitoring Program ACTive (IMPACT) over a period of 18 years. The inability of several of the standard S. pneumoniae MLST primers to fully sequence the required region was consistently observed and is the result of a shift in sequencing technology occurring after the original primers were designed. 
The results presented here introduce clear documentation describing this phenomenon into the literature, and provide additional guidance, through the introduction of a widely validated set of alternative primers, to research groups seeking to undertake S. pneumoniae MLST based studies.
Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus
Kinoti, Wycliff M.; Constable, Fiona E.; Nancarrow, Narelle; Plummer, Kim M.; Rodoni, Brendan
2017-01-01
The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. 
This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the need for a standardized approach to accurately determine what constitutes an active, viable virus infection after detection by molecular based methods. PMID:28713347
Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.
Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan
2017-01-01
The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. 
This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the need for a standardized approach to accurately determine what constitutes an active, viable virus infection after detection by molecular based methods.
Birth control vaccine targeting leukemia inhibitory factor.
Lemons, Angela R; Naz, Rajesh K
2012-02-01
The population explosion and unintended pregnancies resulting in elective abortions continue to impose major public health issues. This calls for a better method of contraception. Immunocontraception has been proposed as a valuable alternative that can fulfill most, if not all, of the properties of an ideal contraceptive. There are several targets that are being explored for contraceptive vaccine development. Leukemia inhibitory factor (LIF), a member of interleukin-6 family, is required for embryo development and successful blastocyst implantation in several mammalian species. The present study was conducted to examine if LIF can be a target for the development of a birth control vaccine. Three sequences from LIF and two sequences from LIF-receptor (LIF-R) that span the regions involved in ligand-receptor binding were delineated, and peptides were synthesized based upon these sequences. Antibodies raised against these five peptides reduced LIF bioactivity in an in vitro culture assay using BA/F3 mLIF-R-mpg130 cells. Vaccines were prepared by conjugating these peptides to various carrier proteins. Immunization of female mice with these peptide vaccines induced a long-lasting, circulating as well as local antibody response in various parts of the genital tract, and resulted in a significant (P ≤ 0.05) inhibition in fertility in all the three trials; the LIF-R peptide vaccines proved to be a better vaccine target. The data indicate that LIF/LIF-R is an excellent target for the development of a birth control vaccine. This is the first study, to our knowledge, that examined LIF/LIF-R as a target for immunocontraception. The findings of this study can be easily translated to humans since LIF/LIF-R is also important for implantation and pregnancy in women. Copyright © 2011 Wiley Periodicals, Inc.
Hofmann, Natalie; Mwingira, Felista; Shekalaghe, Seif; Robinson, Leanne J.; Mueller, Ivo; Felger, Ingrid
2015-01-01
Background Planning and evaluating malaria control strategies relies on accurate definition of parasite prevalence in the population. A large proportion of asymptomatic parasite infections can only be identified by surveillance with molecular methods, yet these infections also contribute to onward transmission to mosquitoes. The sensitivity of molecular detection by PCR is limited by the abundance of the target sequence in a DNA sample; thus, detection becomes imperfect at low densities. We aimed to increase PCR diagnostic sensitivity by targeting multi-copy genomic sequences for reliable detection of low-density infections, and investigated the impact of these PCR assays on community prevalence data. Methods and Findings Two quantitative PCR (qPCR) assays were developed for ultra-sensitive detection of Plasmodium falciparum, targeting the high-copy telomere-associated repetitive element 2 (TARE-2, ∼250 copies/genome) and the var gene acidic terminal sequence (varATS, 59 copies/genome). Our assays reached a limit of detection of 0.03 to 0.15 parasites/μl blood and were 10× more sensitive than standard 18S rRNA qPCR. In a population cross-sectional study in Tanzania, 295/498 samples tested positive using ultra-sensitive assays. Light microscopy missed 169 infections (57%). 18S rRNA qPCR failed to identify 48 infections (16%), of which 40% carried gametocytes detected by pfs25 quantitative reverse-transcription PCR. To judge the suitability of the TARE-2 and varATS assays for high-throughput screens, their performance was tested on sample pools. Both ultra-sensitive assays correctly detected all pools containing one low-density P. falciparum–positive sample, which went undetected by 18S rRNA qPCR, among nine negatives. TARE-2 and varATS qPCRs improve estimates of prevalence rates, yet other infections might still remain undetected when absent in the limited blood volume sampled. 
Conclusions Measured malaria prevalence in communities is largely determined by the sensitivity of the diagnostic tool used. Even when applying standard molecular diagnostics, prevalence in our study population was underestimated by 8% compared to the new assays. Our findings highlight the need for highly sensitive tools such as TARE-2 and varATS qPCR in community surveillance and for monitoring interventions to better describe malaria epidemiology and inform malaria elimination efforts. PMID:25734259
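The proportions quoted in the Methods and Findings can be re-derived from the counts given there. The short check below is a sketch that assumes the 57% and 16% figures are expressed relative to the 295 samples positive by the ultra-sensitive assays; the variable and function names are illustrative only.

```python
# Counts taken from the abstract (Tanzanian cross-sectional study).
total_samples = 498
ultra_positive = 295          # positive by TARE-2 / varATS qPCR
missed_by_microscopy = 169    # ultra-sensitive positives missed by light microscopy
missed_by_18s = 48            # ultra-sensitive positives missed by 18S rRNA qPCR


def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


# Prevalence measured with the ultra-sensitive assays.
prevalence_ultra = percent(ultra_positive, total_samples)

# Share of ultra-sensitive positives that each conventional method missed.
frac_missed_microscopy = percent(missed_by_microscopy, ultra_positive)
frac_missed_18s = percent(missed_by_18s, ultra_positive)

print(prevalence_ultra, frac_missed_microscopy, frac_missed_18s)
```

Under that assumption the computed shares match the rounded values stated in the abstract for microscopy and 18S rRNA qPCR.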
Clonal evolution in hematologic malignancies and therapeutic implications
Landau, Dan A.; Carter, Scott L.; Getz, Gad; Wu, Catherine J.
2014-01-01
The ability of cancer to evolve and adapt is a principal challenge to therapy in general, and to the paradigm of targeted therapy in particular. This ability is fueled by the co-existence of multiple, genetically heterogeneous subpopulations within the cancer cell population. Increasing evidence has supported the idea that these subpopulations are selected in a Darwinian fashion, by which the genetic landscape of the tumor is continuously reshaped. Massively parallel sequencing has enabled a recent surge in our ability to study this process, adding to previous efforts using cytogenetic methods and targeted sequencing. Altogether, these studies reveal the complex evolutionary trajectories occurring across individual hematological malignancies. They also suggest that while clonal evolution may contribute to resistance to therapy, treatment may also hasten the evolutionary process. New insights into this process challenge us to understand the impact of treatment on clonal evolution, and inspire the development of novel prognostic and therapeutic strategies. PMID:23979521
smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni
2012-01-01
Background MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing on a day-to-day basis. Stevia rebaudiana Bertoni is an important perennial herb that accumulates high concentrations of diterpene steviol glycosides, which contribute to its high sweetness index with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in the biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of the availability of genome sequence data. Results To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. We also identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has previously been described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. 
Our results provide information on stevia miRNAs and their targets, building a foundation for future studies to understand their roles in key stevia traits. PMID:23116282
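The Results report both a total read count and a count of distinct sequences. The relationship between the two can be sketched as a simple read-collapsing step. This is an illustrative assumption about the kind of processing involved, not the authors' pipeline; the function names and the toy reads are made up.

```python
from collections import Counter
from typing import Iterable, Tuple


def collapse_reads(reads: Iterable[str]) -> Counter:
    """Collapse identical reads into distinct sequences with their read counts."""
    # Normalise case and skip blank entries so identical reads compare equal.
    return Counter(read.strip().upper() for read in reads if read.strip())


def summarize(counts: Counter) -> Tuple[int, int]:
    """Return (total reads, distinct sequences)."""
    return sum(counts.values()), len(counts)


# Toy reads (arbitrary strings, not real miRNA sequences).
toy_reads = ["AAACCCGGGTTT", "AAACCCGGGTTT", "TTTGGGCCCAAA", "aaacccgggttt", "ACGTACGTACGT"]
counts = collapse_reads(toy_reads)
total, distinct = summarize(counts)
print(total, distinct)
```

Collapsing first keeps downstream similarity searches proportional to the number of distinct sequences rather than the number of reads, while the per-sequence counts remain available as an abundance measure.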
smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni.
Mandhan, Vibha; Kaur, Jagdeep; Singh, Kashmir
2012-11-01
MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing on a day-to-day basis. Stevia rebaudiana Bertoni is an important perennial herb that accumulates high concentrations of diterpene steviol glycosides, which contribute to its high sweetness index with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in the biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of the availability of genome sequence data. To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. We also identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has previously been described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes regulating essential plant metabolic and signaling pathways. This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. 
Our results provide information on stevia miRNAs and their targets, building a foundation for future studies to understand their roles in key stevia traits.
Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).
Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E
2017-01-01
Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
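The pooling described above depends on sample-specific barcodes being resolved back to samples after sequencing. A minimal sketch of that demultiplexing step follows. It makes simplifying assumptions that are not from the protocol: the barcode is taken to be a fixed-length prefix of each read and is matched exactly, whereas on Illumina instruments barcodes are typically read as separate index reads. All names and sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(
    reads: Iterable[str],
    barcode_to_sample: Dict[str, str],
    barcode_len: int,
) -> Dict[str, List[str]]:
    """Group reads by sample using an exact-match barcode prefix (assumed layout)."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        # Reads with an unknown barcode are kept apart rather than discarded.
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)
    return dict(bins)


# Toy pool (made-up barcodes and reads).
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTGGGTTA", "NNNNGGGTTT"]
by_sample = demultiplex(pooled, barcodes, barcode_len=4)
```

Exact matching is the simplest choice; a production workflow would usually tolerate a limited number of barcode mismatches.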
NASA Astrophysics Data System (ADS)
Li, Ren; Zhou, Mingxing; Li, Jine; Wang, Zihua; Zhang, Weikai; Yue, Chunyan; Ma, Yan; Peng, Hailin; Wei, Zewen; Hu, Zhiyuan
2018-03-01
Companion diagnostics for EGFR mutations have proved crucial for the efficacy of tyrosine kinase inhibitor targeted cancer therapies. Uncovering multiple mutations that occur in a minority of EGFR-mutated cells, which may be masked by noise from the majority of unmutated cells, is becoming an urgent clinical requirement. Here we present the validation of a microfluidic-chip-based method for detecting EGFR multi-mutations at the single-cell level. By trapping and immunofluorescently imaging single cells in specifically designed silicon microwells, EGFR-expressing cells were easily identified. By lysing single cells in situ, the lysates of EGFR-expressing cells were retrieved without cross-contamination. Because the noise from cells without EGFR expression was excluded, simple and cost-effective Sanger sequencing, rather than expensive deep sequencing of the whole cell population, could be used to discover multi-mutations. We verified the new method by precisely detecting the three most important drug-related EGFR mutations in a sample in which EGFR-mutated cells accounted for only a small percentage of the whole cell population. The microfluidic chip can reveal not only the existence of specific EGFR multi-mutations but also other valuable single-cell-level information: in which specific cells the mutations occurred, and whether different mutations coexist in the same cells. This microfluidic chip is a promising means of making simple and cost-effective Sanger sequencing a routine test before targeted cancer therapy.
Wang, Dan; Liang, Shengyun; Zhang, Zhao; Zhao, Guoru; Hu, Yuan; Liang, Shengran; Zhang, Xipeng; Banerjee, Santasree
2017-03-28
Familial adenomatous polyposis (FAP) is an autosomal dominant precancerous condition, clinically characterized by the presence of multiple colorectal adenomas or polyps. Patients with FAP have a high risk of developing colorectal cancer (CRC) from these colorectal adenomatous polyps, with a mean age at diagnosis of 40 years. Germline mutations of the APC gene cause FAP. Colectomy is recommended for FAP patients with significant polyposis. Here, we present a clinical molecular study of a four-generation Chinese family with FAP. FAP was diagnosed clinically according to phenotype, family history and medical records. Blood samples were collected from the patients and genomic DNA was extracted. To identify the pathogenic mutation underlying the disease phenotype, targeted next-generation sequencing and confirmatory Sanger sequencing were undertaken. Targeted next-generation sequencing identified a novel heterozygous splice-acceptor site mutation [c.1744-1G>A] in intron 14 of the APC gene, which co-segregated with the FAP phenotype in the proband and all affected family members. This mutation was not present in unaffected family members or in healthy controls of the same ethnic origin. According to the LOVD database for Chinese colorectal cancer patients, 60% of the previously reported APC gene mutations causing FAP in the Chinese population are missense mutations. This novel splice-acceptor site mutation causing FAP in this Chinese family expands the germline mutation spectrum of the APC gene in the Chinese population.
Shanks, Orin C.; White, Karen; Kelty, Catherine A.; Hayes, Sam; Sivaganesan, Mano; Jenkins, Michael; Varma, Manju; Haugland, Richard A.
2010-01-01
There are numerous PCR-based assays available to characterize bovine fecal pollution in ambient waters. The determination of which approaches are most suitable for field applications can be difficult because each assay targets a different gene, in many cases from different microorganisms, leading to variation in assay performance. We describe a performance evaluation of seven end-point PCR and real-time quantitative PCR (qPCR) assays reported to be associated with either ruminant or bovine feces. Each assay was tested against a reference collection of DNA extracts from 247 individual bovine fecal samples representing 11 different populations and 175 fecal DNA extracts from 24 different animal species. Bovine-associated genetic markers were broadly distributed among individual bovine samples ranging from 39 to 93%. Specificity levels of the assays spanned 47.4% to 100%. End-point PCR sensitivity also varied between assays and among different bovine populations. For qPCR assays, the abundance of each host-associated genetic marker was measured within each bovine population and compared to results of a qPCR assay targeting 16S rRNA gene sequences from Bacteroidales. Experiments indicate large discrepancies in the performance of bovine-associated assays across different bovine populations. Variability in assay performance between host populations suggests that the use of bovine microbial source-tracking applications will require a priori characterization at each watershed of interest. PMID:20061457
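The sensitivity and specificity figures quoted in this abstract follow from simple detection counts. The sketch below shows that arithmetic in Python; the function names and the example counts are hypothetical and chosen for illustration only, since the abstract reports percentages rather than per-assay counts.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of target (bovine) samples in which the marker was detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-target samples in which the marker was NOT detected."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumption, not data from the study):
# 247 bovine samples, marker detected in 230 of them;
# 175 non-bovine samples, marker absent from 83 of them.
sens = sensitivity(true_pos=230, false_neg=17)
spec = specificity(true_neg=83, false_pos=92)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Computing both values separately for each bovine population, rather than once over the pooled collection, is what would expose the between-population variability the abstract describes.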
Ferme, D; Banjac, M; Calsamiglia, S; Busquet, M; Kamel, C; Avgustin, G
2004-01-01
An in vitro study in dual-flow continuous-culture fermentors was conducted with two different concentrations of monensin, cinnamaldehyde or garlic extract added to 1:1 forage-to-concentrate diet in order to determine their effects on selected rumen bacterial populations. Samples were subjected to total DNA extraction, restriction analysis of PCR amplified parts of 16S rRNA genes (ARDRA) and subsequent analysis of the restriction profiles by lab-on-chip technology with the Agilent's Bioanalyser 2100. Eub338-BacPre primer pair was used to select for the bacteria from the genera Bacteroides, Porphyromonas and Prevotella, especially the latter representing the dominant Gram-negative bacterial population in the rumen. Preliminary results of HaeIII restriction analysis show that the effects of monensin, cinnamaldehyde and garlic extract on the BacPre targeted ruminal bacteria are somewhat different in regard to targeted populations and to the nature of the effect. Garlic extract was found to trigger the most intensive changes in the structure of the BacPre targeted population. Comparison of the in silico restriction analysis of BacPre sequences deposited in different DNA databanks and of the results of performed amplified ribosomal DNA restriction analysis showed differences between the predicted and obtained HaeIII restriction profiles, and suggested the presence of novel, still unknown Prevotella populations in studied samples.
Lynch, David S; Koutsis, Georgios; Tucci, Arianna; Panas, Marios; Baklou, Markella; Breza, Marianthi; Karadima, Georgia; Houlden, Henry
2016-06-01
Hereditary Spastic Paraplegia (HSP) is a syndrome characterised by lower limb spasticity, occurring alone or in association with other neurological manifestations, such as cognitive impairment, seizures, ataxia or neuropathy. HSP occurs worldwide, with different populations having different frequencies of causative genes. The Greek population has not yet been characterised. The purpose of this study was to describe the clinical presentation and molecular epidemiology of the largest cohort of HSP in Greece, comprising 54 patients from 40 families. We used a targeted next-generation sequencing (NGS) approach to genetically assess a proband from each family. We made a genetic diagnosis in >50% of cases and identified 11 novel variants. Variants in SPAST and KIF5A were the most common causes of autosomal dominant HSP, whereas SPG11 and CYP7B1 were the most common cause of autosomal recessive HSP. We identified a novel variant in SPG11, which led to disease with later onset and may be unique to the Greek population and report the first nonsense mutation in KIF5A. Interestingly, the frequency of HSP mutations in the Greek population, which is relatively isolated, was very similar to other European populations. We confirm that NGS approaches are an efficient diagnostic tool and should be employed early in the assessment of HSP patients.
Single molecule targeted sequencing for cancer gene mutation detection.
Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui
2016-05-19
With the rapid decline in cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although there are fast sample preparation methods available in market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation thus no biases and errors are introduced by PCR reaction. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.
Reitzel, A M; Herrera, S; Layden, M J; Martindale, M Q; Shank, T M
2013-06-01
Characterization of large numbers of single-nucleotide polymorphisms (SNPs) throughout a genome has the power to refine the understanding of population demographic history and to identify genomic regions under selection in natural populations. To this end, population genomic approaches that harness the power of next-generation sequencing to understand the ecology and evolution of marine invertebrates represent a boon to test long-standing questions in marine biology and conservation. We employed restriction-site-associated DNA sequencing (RAD-seq) to identify SNPs in natural populations of the sea anemone Nematostella vectensis, an emerging cnidarian model with a broad geographic range in estuarine habitats in North and South America, and portions of England. We identified hundreds of SNP-containing tags in thousands of RAD loci from 30 barcoded individuals inhabiting four locations from Nova Scotia to South Carolina. Population genomic analyses using high-confidence SNPs resulted in a highly-resolved phylogeography, a result not achieved in previous studies using traditional markers. Plots of locus-specific FST against heterozygosity suggest that a majority of polymorphic sites are neutral, with a smaller proportion suggesting evidence for balancing selection. Loci inferred to be under balancing selection were mapped to the genome, where 90% were located in gene bodies, indicating potential targets of selection. The results from analyses with and without a reference genome supported similar conclusions, further highlighting RAD-seq as a method that can be efficiently applied to species lacking existing genomic resources. We discuss the utility of RAD-seq approaches in burgeoning Nematostella research as well as in other cnidarian species, particularly corals and jellyfishes, to determine phylogeographic relationships of populations and identify regions of the genome undergoing selection. © 2013 John Wiley & Sons Ltd.
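The locus-specific FST values mentioned in this abstract can be illustrated with a textbook calculation. The sketch below assumes a biallelic SNP, two populations of equal size, and known allele frequencies, and computes FST as (HT - HS) / HT; it is a simplified illustration, not the estimator used in the study, and the function names are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Wright's FST for one biallelic locus and two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = expected_heterozygosity(p_bar)  # heterozygosity of the pooled population
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation
    return (h_total - h_within) / h_total


# Identical frequencies give FST = 0; fixed differences give FST = 1.
for p1, p2 in [(0.5, 0.5), (0.2, 0.8), (1.0, 0.0)]:
    print(p1, p2, round(fst_two_populations(p1, p2), 3))
```

Plotting such per-locus values against heterozygosity, as the abstract describes, is one way to flag loci whose differentiation departs from the bulk of presumably neutral sites.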
Reitzel, A.M.; Herrera, S.; Layden, M.J.; Martindale, M.Q.; Shank, T.M.
2013-01-01
Characterization of large numbers of single nucleotide polymorphisms (SNPs) throughout a genome has the power to refine the understanding of population demographic history and to identify genomic regions under selection in natural populations. To this end, population genomic approaches that harness the power of next-generation sequencing to understand the ecology and evolution of marine invertebrates represent a boon to test long-standing questions in marine biology and conservation. We employed restriction-site-associated DNA sequencing (RAD-seq) to identify SNPs in natural populations of the sea anemone Nematostella vectensis, an emerging cnidarian model with a broad geographic range in estuarine habitats in North and South America, and portions of England. We identified hundreds of SNP-containing tags in thousands of RAD loci from 30 barcoded individuals inhabiting four locations from Nova Scotia to South Carolina. Population genomic analyses using high-confidence SNPs resulted in a highly-resolved phylogeography, a result not achieved in previous studies using traditional markers. Plots of locus-specific FST against heterozygosity suggest that a majority of polymorphic sites are neutral, with a smaller proportion suggesting evidence for balancing selection. Loci inferred to be under balancing selection were mapped to the genome, where 90% were located in gene bodies, indicating potential targets of selection. Results from analyses with and without a reference genome supported similar conclusions, further supporting RAD-seq as a method that can be efficiently applied to species lacking existing genomic resources. We discuss the utility of RAD-seq approaches in burgeoning Nematostella research as well as in other cnidarian species, particularly corals, to determine phylogeographic relationships of populations and identify regions of the genome undergoing selection. PMID:23473066
Targeted Re-Sequencing Emulsion PCR Panel for Myopathies: Results in 94 Cases.
Punetha, Jaya; Kesari, Akanchha; Uapinyoying, Prech; Giri, Mamta; Clarke, Nigel F; Waddell, Leigh B; North, Kathryn N; Ghaoui, Roula; O'Grady, Gina L; Oates, Emily C; Sandaradura, Sarah A; Bönnemann, Carsten G; Donkervoort, Sandra; Plotz, Paul H; Smith, Edward C; Tesi-Rocha, Carolina; Bertorini, Tulio E; Tarnopolsky, Mark A; Reitter, Bernd; Hausmanowa-Petrusewicz, Irena; Hoffman, Eric P
2016-05-27
Molecular diagnostics in the genetic myopathies often requires testing of the largest and most complex transcript units in the human genome (DMD, TTN, NEB). Iteratively targeting single genes for sequencing has traditionally entailed high costs and long turnaround times. Exome sequencing has begun to supplant single targeted genes, but there are concerns regarding coverage and needed depth of the very large and complex genes that frequently cause myopathies. Our objective was to evaluate the efficiency of next-generation sequencing technologies in providing molecular diagnostics for patients with previously undiagnosed myopathies. We tested a targeted re-sequencing approach, using a 45-gene emulsion PCR myopathy panel, with subsequent sequencing on the Illumina platform in 94 undiagnosed patients. We compared the targeted re-sequencing approach to exome sequencing for 10 of these patients. We detected likely pathogenic mutations in 33 of 94 patients, a molecular diagnostic rate of approximately 35%. The remaining patients showed variants of unknown significance (35/94 patients) or no mutations detected in the 45 genes tested (26/94 patients). Mutation detection rates were similar for targeted re-sequencing and whole-exome sequencing; however, exome sequencing showed better distribution of reads and fewer exon dropouts. Given that costs of highly parallel re-sequencing and whole exome sequencing are similar, and that exome sequencing now takes considerably less laboratory processing time than targeted re-sequencing, we recommend exome sequencing as the standard approach for molecular diagnostics of myopathies.
Sunshine, Justine E.; Larsen, Brendan B.; Maust, Brandon; Casey, Ellie; Deng, Wenje; Chen, Lennie; Westfall, Dylan H.; Kim, Moon; Zhao, Hong; Ghorai, Suvankar; Lanxon-Cookson, Erinn; Rolland, Morgane; Collier, Ann C.; Maenza, Janine; Mullins, James I.
2015-01-01
ABSTRACT To understand the interplay between host cytotoxic T-lymphocyte (CTL) responses and the mechanisms by which HIV-1 evades them, we studied viral evolutionary patterns associated with host CTL responses in six linked transmission pairs. HIV-1 sequences corresponding to full-length p17 and p24 gag were generated by 454 pyrosequencing for all pairs near the time of transmission, and seroconverting partners were followed for a median of 847 days postinfection. T-cell responses were screened by gamma interferon/interleukin-2 (IFN-γ/IL-2) FluoroSpot using autologous peptide sets reflecting any Gag variant present in at least 5% of sequence reads in the individual's viral population. While we found little evidence for the occurrence of CTL reversions, CTL escape processes were found to be highly dynamic, with multiple epitope variants emerging simultaneously. We found a correlation between epitope entropy and the number of epitope variants per response (r = 0.43; P = 0.05). In cases in which multiple escape mutations developed within a targeted epitope, a variant with no fitness cost became fixed in the viral population. When multiple mutations within an epitope achieved fitness-balanced escape, these escape mutants were each maintained in the viral population. Additional mutations found to confer escape but undetected in viral populations incurred high fitness costs, suggesting that functional constraints limit the available sites tolerable to escape mutations. These results further our understanding of the impact of CTL escape and reversion from the founder virus in HIV infection and contribute to the identification of immunogenic Gag regions most vulnerable to a targeted T-cell attack. IMPORTANCE Rapid diversification of the viral population is a hallmark of HIV-1 infection, and understanding the selective forces driving the emergence of viral variants can provide critical insight into the interplay between host immune responses and viral evolution. 
We used deep sequencing to comprehensively follow viral evolution over time in six linked HIV transmission pairs. We then mapped T-cell responses to explore if mutations arose due to adaption to the host and found that escape processes were often highly dynamic, with multiple mutations arising within targeted epitopes. When we explored the impact of these mutations on replicative capacity, we found that dynamic escape processes only resolve with the selection of mutations that conferred escape with no fitness cost to the virus. These results provide further understanding of the complicated viral-host interactions that occur during early HIV-1 infection and may help inform the design of future vaccine immunogens. PMID:26223634
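Two quantities in this abstract lend themselves to a small worked example: the 5% read-frequency cutoff used to decide which Gag variants enter the peptide sets, and epitope entropy. The sketch below assumes each read has already been reduced to the amino acid string of one epitope, and uses Shannon entropy in bits; the function names and the example reads are hypothetical, and the study's own entropy definition may differ.

```python
import math
from collections import Counter


def variants_above_threshold(epitope_reads, min_freq=0.05):
    """Return {variant: frequency} for variants present in at least min_freq of reads."""
    total = len(epitope_reads)
    counts = Counter(epitope_reads)
    return {v: c / total for v, c in counts.items() if c / total >= min_freq}


def shannon_entropy(epitope_reads):
    """Shannon entropy (bits) of the variant distribution at one epitope."""
    total = len(epitope_reads)
    counts = Counter(epitope_reads)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical reads covering a single epitope (illustrative strings only).
reads = ["VARIANTA"] * 90 + ["VARIANTB"] * 7 + ["VARIANTC"] * 3
print(variants_above_threshold(reads))  # VARIANTC falls below the 5% cutoff
print(round(shannon_entropy(reads), 3))
```

An epitope with a single dominant variant has entropy near zero, while one with several co-circulating escape variants scores higher, which is the kind of relationship the reported correlation describes.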
Diagnostic application of clinical exome sequencing in Leber congenital amaurosis.
Han, Jinu; Rim, John Hoon; Hwang, In Sik; Kim, Jieun; Shin, Saeam; Lee, Seung-Tae; Choi, Jong Rak
2017-01-01
Leber congenital amaurosis (LCA) is a hereditary retinal dystrophy with wide genetic heterogeneity. Next-generation sequencing (NGS) targeting multiple genes can be a good option for the diagnosis of LCA, and we tested a clinical exome panel in patients with LCA. A total of nine unrelated Korean patients with LCA were sequenced using the Illumina TruSight One panel, which targets 4,813 clinically associated genes, followed by confirmation using Sanger sequencing. Patients' clinical information and familial study results were obtained and used for comprehensive interpretation. In all nine patients, we identified pathogenic variations in LCA-associated genes: NMNAT1 (n=3), GUCY2D (n=2), RPGRIP1 (n=2), CRX (n=1), and CEP290 or SPATA7. Six patients had one or two mutations in accordance with inheritance patterns, all consistent with clinical phenotypes. Two patients had only one pathogenic mutation in recessive genes (NMNAT1 and RPGRIP1), and the clinical features were specific to disorders associated with those genes. With the clinical exome panel, the genetic cause was resolved in six patients and remained unclear in three. With subsequent targeted panel sequencing of 113 genes associated with infantile nystagmus syndrome, a likely pathogenic allele in CEP290 was detected in one patient. Interestingly, one pathogenic variant (p.Arg237Cys) in NMNAT1 was present in three patients, and it had a high allele frequency (0.24%) in the general Korean population, suggesting that NMNAT1 could be a major gene responsible for LCA in Koreans. We confirmed that a commercial clinical exome panel can be effectively used in the diagnosis of LCA. Careful interpretation and clinical correlation could promote the successful implementation of clinical exome panels in routine diagnoses of retinal dystrophies, including LCA.
Shinkai, Yoichi; Kuramochi, Masahiro; Doi, Motomichi
2018-05-03
Recently, advances in next-generation sequencing technologies have enabled genome-wide analyses of epigenetic modifications; however, it remains difficult to analyze the states of histone modifications at a single-cell resolution in living multicellular organisms because of the heterogeneity within cellular populations. Here we describe a simple method to visualize histone modifications on the specific sequence of target locus at a single-cell resolution in living Caenorhabditis elegans , by combining the LacO/LacI system and a genetically-encoded H4K20me1-specific probe, "mintbody". We demonstrate that Venus-labeled mintbody and mTurquoise2-labeled LacI can co-localize on an artificial chromosome carrying both the target locus and LacO sequences, where H4K20me1 marks the target locus. We demonstrate that our visualization method can precisely detect H4K20me1 depositions on the her-1 gene sequences on the artificial chromosome, to which the dosage compensation complex binds to regulate sex determination. The degree of H4K20me1 deposition on the her-1 sequences on the artificial chromosome correlated strongly with sex, suggesting that, using the artificial chromosome, this method can reflect context-dependent changes of H4K20me1 on endogenous genomes. Furthermore, we demonstrate live imaging of H4K20me1 depositions on the artificial chromosome. Combined with ChIP assays, this mintbody-LacO/LacI visualization method will enable analysis of developmental and context-dependent alterations of locus-specific histone modifications in specific cells and elucidation of the underlying molecular mechanisms. Copyright © 2018, G3: Genes, Genomes, Genetics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mortensen, Holly M., E-mail: mortensen.holly@epa.gov; Euling, Susan Y.
Response to environmental chemicals can vary widely among individuals and between population groups. In human health risk assessment, data on susceptibility can be utilized by deriving risk levels based on a study of a susceptible population and/or an uncertainty factor may be applied to account for the lack of information about susceptibility. Defining genetic susceptibility in response to environmental chemicals across human populations is an area of interest in the NAS' new paradigm of toxicity pathway-based risk assessment. Data from high-throughput/high content (HT/HC), including -omics (e.g., genomics, transcriptomics, proteomics, metabolomics) technologies, have been integral to the identification and characterization of drug target and disease loci, and have been successfully utilized to inform the mechanism of action for numerous environmental chemicals. Large-scale population genotyping studies may help to characterize levels of variability across human populations at identified target loci implicated in response to environmental chemicals. By combining mechanistic data for a given environmental chemical with next generation sequencing data that provides human population variation information, one can begin to characterize differential susceptibility due to genetic variability to environmental chemicals within and across genetically heterogeneous human populations. The integration of such data sources will be informative to human health risk assessment.
Genomics and museum specimens.
Nachman, Michael W
2013-12-01
Nearly 25 years ago, Allan Wilson and colleagues isolated DNA sequences from museum specimens of kangaroo rats (Dipodomys panamintinus) and compared these sequences with those from freshly collected animals (Thomas et al. 1990). The museum specimens had been collected up to 78 years earlier, so the two samples provided a direct temporal comparison of patterns of genetic variation. This was not the first time DNA sequences had been isolated from preserved material, but it was the first time it had been carried out with a population sample. Population geneticists often try to make inferences about the influence of historical processes such as selection, drift, mutation and migration on patterns of genetic variation in the present. The work of Wilson and colleagues was important in part because it suggested a way in which population geneticists could actually study genetic change in natural populations through time, much the same way that experimentalists can do with artificial populations in the laboratory. Indeed, the work of Thomas et al. (1990) spawned dozens of studies in which museum specimens were used to compare historical and present-day genetic diversity (reviewed in Wandeler et al. 2007). All of these studies, however, were limited by the same fundamental problem: old DNA is degraded into short fragments. As a consequence, these studies mostly involved PCR amplification of short templates, usually short stretches of mitochondrial DNA or microsatellites. In this issue, Bi et al. (2013) report a breakthrough that should open the door to studies of genomic variation in museum specimens. They used target enrichment (exon capture) and next-generation (Illumina) sequencing to compare patterns of genetic variation in historic and present-day population samples of alpine chipmunks (Tamias alpinus) (Fig. 1). The historic samples came from specimens collected in 1915, so the temporal span of this comparison is nearly 100 years. © 2013 John Wiley & Sons Ltd.
Regan, P. H.; Wheldon, C.; Yamamoto, A. D.; ...
2005-04-01
The near-yrast states of 101Mo (Z = 42, N = 59) and 103,104Ru (Z = 44, N = 59, 60) have been studied following their population via heavy-ion multinucleon transfer reactions between a 136Xe beam and a thin, self-supporting 100Mo target. The ground state sequence in 104Ru can be understood as demonstrating a simple evolution from a quasi-vibrational structure at lower spins to statically deformed, quasi-rotational excitation involving the population of a pair of low-Ω h11/2 neutron orbitals. The effect of the decoupled h11/2 orbital on this vibration-to-rotational evolution is demonstrated by an extension of the "E-GOS" prescription to include odd-A nuclei. The experimental results are also compared with self-consistent Total Routhian Surface calculations which also highlight the polarising role of the highly aligned neutron h11/2 orbital in these nuclei.
Pure Perceptual-Based Sequence Learning: A Role for Visuospatial Attention
ERIC Educational Resources Information Center
Remillard, Gilbert
2009-01-01
Learning the structure of a sequence of target locations when target location is not the response dimension and the sequence of target locations is uncorrelated with the sequence of responses is called pure perceptual-based sequence learning. The paradigm introduced by G. Remillard (2003) was used to determine whether orienting of visuospatial…
Accurate and exact CNV identification from targeted high-throughput sequence data.
Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom
2011-04-12
Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
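The first stage described in this abstract, scanning for gains and losses by comparing normalized coverage between samples, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: depths are normalized by each sample's median, the 0.75 and 1.25 ratio cutoffs are hypothetical, all depths are assumed to be non-zero, and the second stage (confirming calls with reads that span the breakpoint) is not shown.

```python
from statistics import median


def normalize(depths):
    """Scale per-target read depths by the sample's median depth."""
    m = median(depths)
    return [d / m for d in depths]


def call_gains_and_losses(sample_depths, reference_depths, loss=0.75, gain=1.25):
    """Label each targeted region by its normalized sample/reference coverage ratio.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    sample = normalize(sample_depths)
    reference = normalize(reference_depths)
    calls = []
    for s, r in zip(sample, reference):
        ratio = s / r
        if ratio <= loss:
            calls.append("loss")
        elif ratio >= gain:
            calls.append("gain")
        else:
            calls.append("normal")
    return calls


# Hypothetical per-target depths for one test sample and one reference sample.
test_sample = [100, 100, 50, 100, 150]
reference = [100, 100, 100, 100, 100]
print(call_gains_and_losses(test_sample, reference))
```

Normalizing within each sample first means that differences in overall sequencing yield between barcoded samples do not by themselves produce apparent gains or losses.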
Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam
2010-04-15
High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Everts-van der Wind, Annelie; Kata, Srinivas R.; Band, Mark R.; Rebeiz, Mark; Larkin, Denis M.; Everts, Robin E.; Green, Cheryl A.; Liu, Lei; Natarajan, Shreedhar; Goldammer, Tom; Lee, Jun Heon; McKay, Stephanie; Womack, James E.; Lewin, Harris A.
2004-01-01
A second-generation 5000 rad radiation hybrid (RH) map of the cattle genome was constructed primarily using cattle ESTs that were targeted to gaps in the existing cattle–human comparative map, as well as to sparsely populated map intervals. A total of 870 targeted markers were added, bringing the number of markers mapped on the RH5000 panel to 1913. Of these, 1463 have significant BLASTN hits (E < e–5) against the human genome sequence. A cattle–human comparative map was created using human genome sequence coordinates of the paired orthologs. One-hundred and ninety-five conserved segments (defined by two or more genes) were identified between the cattle and human genomes, of which 31 are newly discovered and 34 were extended singletons on the first-generation map. The new map represents an improvement of 20% genome-wide comparative coverage compared with the first-generation map. Analysis of gene content within human genome regions where there are gaps in the comparative map revealed gaps with both significantly greater and significantly lower gene content. The new, more detailed cattle–human comparative map provides an improved resource for the analysis of mammalian chromosome evolution, the identification of candidate genes for economically important traits, and for proper alignment of sequence contigs on cattle chromosomes. PMID:15231756
Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H
2015-08-19
Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
2014-01-01
Background: Economically, Leuconostoc lactis is one of the most important species in the genus Leuconostoc. It plays an important role in the food industry including the production of dextrans and bacteriocins. Currently, traditional molecular typing approaches for characterisation of this species at the isolate level are either unavailable or are not sufficiently reliable for practical use. Multilocus sequence typing (MLST) is a robust and reliable method for characterising bacterial and fungal species at the molecular level. In this study, a novel MLST protocol was developed for 50 L. lactis isolates from Mongolia and China. Results: Sequences from eight targeted genes (groEL, carB, recA, pheS, murC, pyrG, rpoB and uvrC) were obtained. Sequence analysis indicated 20 different sequence types (STs), with 13 of them being represented by a single isolate. Phylogenetic analysis based on the sequences of eight MLST loci indicated that the isolates belonged to two major groups, A (34 isolates) and B (16 isolates). Linkage disequilibrium analyses indicated that recombination occurred at a low frequency in L. lactis, indicating a clonal population structure. Split-decomposition analysis indicated that intraspecies recombination played a role in generating genotypic diversity amongst isolates. Conclusions: Our results indicated that MLST is a valuable tool for typing L. lactis isolates that can be used for further monitoring of evolutionary changes and population genetics. PMID:24912963
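As a rough illustration of how sequence types follow from allele profiles in an MLST scheme, the sketch below gives one ST number to each distinct combination of allele numbers across the eight loci named in the abstract. The isolate names, allele numbers and the numbering convention (order of first appearance) are invented for illustration and are not data or conventions taken from the study.

```python
from collections import Counter

# The eight loci listed in the abstract, in a fixed order.
LOCI = ("groEL", "carB", "recA", "pheS", "murC", "pyrG", "rpoB", "uvrC")


def assign_sequence_types(profiles):
    """Map each isolate to an ST number; identical allele profiles share an ST.

    profiles: dict of isolate name -> tuple of allele numbers ordered as LOCI.
    ST numbers are given in order of first appearance (an arbitrary choice).
    """
    st_of_profile = {}
    st_of_isolate = {}
    for isolate, alleles in profiles.items():
        if len(alleles) != len(LOCI):
            raise ValueError(f"{isolate}: expected {len(LOCI)} allele numbers")
        # setdefault returns the existing ST, or registers the next free number.
        st = st_of_profile.setdefault(tuple(alleles), len(st_of_profile) + 1)
        st_of_isolate[isolate] = st
    return st_of_isolate


# Hypothetical isolates (not from the study).
example_profiles = {
    "iso1": (1, 1, 1, 1, 1, 1, 1, 1),
    "iso2": (1, 1, 2, 1, 1, 1, 1, 1),
    "iso3": (1, 1, 1, 1, 1, 1, 1, 1),
    "iso4": (2, 1, 2, 1, 3, 1, 1, 1),
}
sts = assign_sequence_types(example_profiles)

# STs represented by a single isolate, as counted in the abstract.
singleton_sts = sorted(st for st, n in Counter(sts.values()).items() if n == 1)
```

In this toy input, iso1 and iso3 share a profile and therefore an ST, while iso2 and iso4 are singletons.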
Microfluidic droplet enrichment for targeted sequencing
Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.
2015-01-01
Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629
Global sequence diversity of the lactate dehydrogenase gene in Plasmodium falciparum.
Simpalipan, Phumin; Pattaradilokrat, Sittiporn; Harnyuttanakorn, Pongchai
2018-01-09
Antigen-detecting rapid diagnostic tests (RDTs) have been recommended by the World Health Organization for use in remote areas to improve malaria case management. Lactate dehydrogenase (LDH) of Plasmodium falciparum is one of the main parasite antigens employed by various commercial RDTs. It has been hypothesized that the poor detection of LDH-based RDTs is attributed in part to the sequence diversity of the gene. To test this, the present study aimed to investigate the genetic diversity of the P. falciparum ldh gene in Thailand and to construct the map of LDH sequence diversity in P. falciparum populations worldwide. The ldh gene was sequenced for 50 P. falciparum isolates in Thailand and compared with hundreds of sequences from P. falciparum populations worldwide. Several indices of molecular variation were calculated, including the proportion of polymorphic sites, the average nucleotide diversity index (π), and the haplotype diversity index (H). Tests of positive selection and neutrality tests were performed to determine signatures of natural selection on the gene. Mean genetic distance within and between species of Plasmodium ldh was analysed to infer evolutionary relationships. Nucleotide sequences of P. falciparum ldh could be classified into 9 alleles, encoding 5 isoforms of LDH. L1a was the most common allelic type and was distributed in P. falciparum populations worldwide. Plasmodium falciparum ldh sequences were highly conserved, with haplotype and nucleotide diversity values of 0.203 and 0.0004, respectively. The extremely low genetic diversity was maintained by purifying selection, likely due to functional constraints. Phylogenetic analysis inferred the close genetic relationship of P. falciparum to malaria parasites of great apes, rather than to other human malaria parasites. This study revealed the global genetic variation of the ldh gene in P. falciparum, providing knowledge for improving detection of LDH-based RDTs and supporting the candidacy of LDH as a therapeutic drug target.
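The two diversity indices named in the abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions (π as the mean number of pairwise differences per site, H as Nei's unbiased haplotype diversity); it assumes gap-free, equal-length aligned sequences, and the toy alignment is invented. It is not a reconstruction of the software or settings the authors used.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i squared)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy alignment (hypothetical): three identical sequences and one variant.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC"]
pi = nucleotide_diversity(toy)
h = haplotype_diversity(toy)
```

For the toy alignment, three of the six pairs differ at one of ten sites, so π is 3/60 = 0.05, and with haplotype frequencies 0.75 and 0.25 the value of H is 0.5.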
Su, Jiao; Zhang, Haijie; Jiang, Bingying; Zheng, Huzhi; Chai, Yaqin; Yuan, Ruo; Xiang, Yun
2011-11-15
We report an ultrasensitive electrochemical approach for the detection of uropathogen sequence-specific DNA target. The sensing strategy involves a dual signal amplification process, which combines the signal enhancement by the enzymatic target recycling technique with the sensitivity improvement by the quantum dot (QD) layer-by-layer (LBL) assembled labels. The enzyme-based catalytic target DNA recycling process results in the use of each target DNA sequence for multiple times and leads to direct amplification of the analytical signal. Moreover, the LBL assembled QD labels can further enhance the sensitivity of the sensing system. The coupling of these two effective signal amplification strategies thus leads to low femtomolar (5fM) detection of the target DNA sequences. The proposed strategy also shows excellent discrimination between the target DNA and the single-base mismatch sequences. The advantageous intrinsic sequence-independent property of exonuclease III over other sequence-dependent enzymes makes our new dual signal amplification system a general sensing platform for monitoring ultralow level of various types of target DNA sequences. Copyright © 2011 Elsevier B.V. All rights reserved.
Attentional awakening: gradual modulation of temporal attention in rapid serial visual presentation.
Ariga, Atsunori; Yokosawa, Kazuhiko
2008-03-01
Orienting attention to a point in time facilitates processing of an item within rapidly changing surroundings. We used a one-target RSVP task to look for differences in accuracy in reporting a target related to when the target temporally appeared in the sequence. The results show that observers correctly report a target early in the sequence less frequently than later in the sequence. Previous RSVP studies predicted equivalently accurate performances for one target wherever it appeared in the sequence. We named this new phenomenon attentional awakening, which reflects a gradual modulation of temporal attention in a rapid sequence.
Clinical utility of genetic testing in pediatric drug-resistant epilepsy: a pilot study.
Ream, Margie A; Mikati, Mohamad A
2014-08-01
The utility of genetic testing in pediatric drug-resistant epilepsy (PDRE), its yield in "real life" clinical practice, and the practical implications of such testing are yet to be determined. To start to address the above gaps in our knowledge as they apply to a patient population seen in a tertiary care center. We retrospectively reviewed our experience with the use of clinically available genetic tests in the diagnosis and management of PDRE in one clinic over one year. Genetic testing included, depending on clinical judgment, one or more of the following: karyotype, chromosomal microarray, single gene sequencing, gene sequencing panels, and/or whole exome sequencing (WES). We were more likely to perform genetic testing in patients with developmental delay, epileptic encephalopathy, and generalized epilepsy. In our unique population, the yield of specific genetic diagnosis was relatively high: karyotype 14.3%, microarray 16.7%, targeted single gene sequencing 15.4%, gene panels 46.2%, and WES 16.7%. Overall yield of diagnosis from at least one of the above tests was 34.5%. Disease-causing mutations that were not clinically suspected based on the patients' phenotypes and representing novel phenotypes were found in 6.9% (2/29), with an additional 17.2% (5/29) demonstrating pharmacologic variants. Three patients were incidentally found to be carriers of recessive neurologic diseases (10.3%). Variants of unknown significance (VUSs) were identified in 34.5% (10/29). We conclude that genetic testing had at least some utility in our patient population of PDRE, that future similar larger studies in various populations are warranted, and that clinics offering such tests must be prepared to address the complicated questions raised by the results of such testing. Copyright © 2014. Published by Elsevier Inc.
Ion Torrent sequencing as a tool for mutation discovery in the flax (Linum usitatissimum L.) genome.
Galindo-González, Leonardo; Pinzón-Latorre, David; Bergen, Erik A; Jensen, Dustin C; Deyholos, Michael K
2015-01-01
Detection of induced mutations is valuable for inferring gene function and for developing novel germplasm for crop improvement. Many reverse genetics approaches have been developed to identify mutations in genes of interest within a mutagenized population, including some approaches that rely on next-generation sequencing (e.g. exome capture, whole genome resequencing). As an alternative to these genome or exome-scale methods, we sought to develop a scalable and efficient method for detection of induced mutations that could be applied to a small number of target genes, using Ion Torrent technology. We developed this method in flax (Linum usitatissimum), to demonstrate its utility in a crop species. We used an amplicon-based approach in which DNA samples from an ethyl methanesulfonate (EMS)-mutagenized population were pooled and used as template in PCR reactions to amplify a region of each gene of interest. Barcodes were incorporated during PCR, and the pooled amplicons were sequenced using an Ion Torrent PGM. A pilot experiment with known SNPs showed that they could be detected at a frequency > 0.3% within the pools. We then selected eight genes for which we wanted to discover novel mutations, and applied our approach to screen 768 individuals from the EMS population, using either the Ion 314 or Ion 316 chips. Out of 29 potential mutations identified after processing the NGS reads, 16 mutations were confirmed using Sanger sequencing. The methodology presented here demonstrates the utility of Ion Torrent technology in detecting mutation variants in specific genome regions for large populations of a species such as flax. The methodology could be scaled-up to test >100 genes using the higher capacity chips now available from Ion Torrent.
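To put the reported detection limit (> 0.3% within pools) in context, the arithmetic below relates pool size to the expected allele fraction of a single mutant copy. It is only a sketch under stated assumptions: diploid individuals, equal DNA contribution from each individual, one heterozygous mutant per pool and unbiased amplification. The abstract does not state the pool sizes used, so the pool size in the example is hypothetical.

```python
import math


def mutant_allele_fraction(pool_size, ploidy=2, mutant_copies=1):
    """Expected fraction of mutant alleles in a pool of equally mixed genomes."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    return mutant_copies / (ploidy * pool_size)


def max_pool_size(min_detectable_fraction, ploidy=2, mutant_copies=1):
    """Largest pool in which one mutant individual still reaches the threshold."""
    if not 0 < min_detectable_fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return math.floor(mutant_copies / (ploidy * min_detectable_fraction))


# 0.3% is the detection frequency quoted in the abstract.
largest_pool = max_pool_size(0.003)

# Hypothetical pool of 96 individuals (pool size is an assumption).
fraction_in_96 = mutant_allele_fraction(96)
```

Under these assumptions a single heterozygous mutant stays above a 0.3% allele fraction in pools of up to 166 individuals.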
Cirera, S; Clop, A; Jacobsen, M J; Guerin, M; Lesnik, P; Jørgensen, C B; Fredholm, M; Karlskov-Mortensen, P
2018-04-01
Taste receptors (TASRs) and appetite and reward (AR) mechanisms influence eating behaviour, which in turn affects food intake and risk of obesity. In a previous study, we used next generation sequencing to identify potentially functional mutations in TASR and AR genes and found indications for genetic associations between identified variants and growth and fat deposition in a subgroup of animals (n = 38) from the UNIK resource pig population. This population was created for studying obesity and obesity-related diseases. In the present study we validated results from our previous study by investigating genetic associations between 24 selected single nucleotide variants in TASR and AR gene variants and 35 phenotypes describing obesity and metabolism in the entire UNIK population (n = 564). Fifteen variants showed significant association with specific obesity-related phenotypes after Bonferroni correction. Six of the 15 genes, namely SIM1, FOS, TAS2R4, TAS2R9, MCHR2 and LEPR, showed good correlation between known biological function and associated phenotype. We verified a genetic association between potentially functional variants in TASR/AR genes and growth/obesity and conclude that the combination of identification of potentially functional variants by next generation sequencing followed by targeted genotyping and association studies is a powerful and cost-effective approach for increasing the power of genetic association studies. © 2018 Stichting International Foundation for Animal Genetics.
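The Bonferroni correction mentioned in the abstract can be illustrated with a short calculation. The abstract does not report how many tests were corrected for or which significance level was used; the sketch assumes every one of the 24 variants was tested against every one of the 35 phenotypes (840 tests) at a family-wise level of 0.05, and the p-values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha, n_tests):
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p <= threshold for p in p_values]


N_VARIANTS = 24      # from the abstract
N_PHENOTYPES = 35    # from the abstract
N_TESTS = N_VARIANTS * N_PHENOTYPES  # assumption: all pairs were tested
ALPHA = 0.05         # assumption: conventional family-wise error rate

threshold = bonferroni_threshold(ALPHA, N_TESTS)

# Hypothetical p-values for two variant-phenotype pairs.
flags = bonferroni_significant([1e-6, 1e-4], ALPHA, N_TESTS)
```

With these assumed numbers the per-test threshold is 0.05/840, roughly 6 × 10⁻⁵, so a nominal p-value of 1e-4 would not survive correction.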
Genetic Misdiagnoses and the Potential for Health Disparities.
Manrai, Arjun K; Funke, Birgit H; Rehm, Heidi L; Olesen, Morten S; Maron, Bradley A; Szolovits, Peter; Margulies, David M; Loscalzo, Joseph; Kohane, Isaac S
2016-08-18
For more than a decade, risk stratification for hypertrophic cardiomyopathy has been enhanced by targeted genetic testing. Using sequencing results, clinicians routinely assess the risk of hypertrophic cardiomyopathy in a patient's relatives and diagnose the condition in patients who have ambiguous clinical presentations. However, the benefits of genetic testing come with the risk that variants may be misclassified. Using publicly accessible exome data, we identified variants that have previously been considered causal in hypertrophic cardiomyopathy and that are overrepresented in the general population. We studied these variants in diverse populations and reevaluated their initial ascertainments in the medical literature. We reviewed patient records at a leading genetic-testing laboratory for occurrences of these variants during the near-decade-long history of the laboratory. Multiple patients, all of whom were of African or unspecified ancestry, received positive reports, with variants misclassified as pathogenic on the basis of the understanding at the time of testing. Subsequently, all reported variants were recategorized as benign. The mutations that were most common in the general population were significantly more common among black Americans than among white Americans (P<0.001). Simulations showed that the inclusion of even small numbers of black Americans in control cohorts probably would have prevented these misclassifications. We identified methodologic shortcomings that contributed to these errors in the medical literature. The misclassification of benign variants as pathogenic that we found in our study shows the need for sequencing the genomes of diverse populations, both in asymptomatic controls and the tested patient population. These results expand on current guidelines, which recommend the use of ancestry-matched controls to interpret variants. 
As additional populations of different ancestry backgrounds are sequenced, we expect variant reclassifications to increase, particularly for ancestry groups that have historically been less well studied. (Funded by the National Institutes of Health.).
DNA sequencing using polymerase substrate-binding kinetics
Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min
2015-01-01
Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848
N-terminal dual lipidation-coupled molecular targeting into the primary cilium.
Kumeta, Masahiro; Panina, Yulia; Yamazaki, Hiroya; Takeyasu, Kunio; Yoshimura, Shige H
2018-06-13
The primary cilium functions as an "antenna" for cell signaling, studded with characteristic transmembrane receptors and soluble protein factors, raised above the cell surface. In contrast to the transmembrane proteins, targeting mechanisms of nontransmembrane ciliary proteins are poorly understood. We focused on a pathogenic mutation that abolishes ciliary localization of retinitis pigmentosa 2 protein and revealed a dual acylation-dependent ciliary targeting pathway. Short N-terminal sequences which contain myristoylation and palmitoylation sites are sufficient to target a marker protein into the cilium in a palmitoylation-dependent manner. A Golgi-localized palmitoyltransferase DHHC-21 was identified as the key enzyme controlling this targeting pathway. Rapid turnover of the targeted protein was ensured by cholesterol-dependent membrane fluidity, which balances highly and less-mobile populations of the molecules within the cilium. This targeting signal was found in a set of signal transduction molecules, suggesting a general role of this pathway in proper ciliary organization, and dysfunction in ciliary disorders. © 2018 Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.
The siRNA Non-seed Region and Its Target Sequences Are Auxiliary Determinants of Off-Target Effects.
Kamola, Piotr J; Nakano, Yuko; Takahashi, Tomoko; Wilson, Paul A; Ui-Tei, Kumiko
2015-12-01
RNA interference (RNAi) is a powerful tool for post-transcriptional gene silencing. However, the siRNA guide strand may bind unintended off-target transcripts via partial sequence complementarity by a mechanism closely mirroring micro RNA (miRNA) silencing. To better understand these off-target effects, we investigated the correlation between sequence features within various subsections of siRNA guide strands, and its corresponding target sequences, with off-target activities. Our results confirm previous reports that strength of base-pairing in the siRNA seed region is the primary factor determining the efficiency of off-target silencing. However, the degree of downregulation of off-target transcripts with shared seed sequence is not necessarily similar, suggesting that there are additional auxiliary factors that influence the silencing potential. Here, we demonstrate that both the melting temperature (Tm) in a subsection of siRNA non-seed region, and the GC contents of its corresponding target sequences, are negatively correlated with the efficiency of off-target effect. Analysis of experimentally validated miRNA targets demonstrated a similar trend, indicating a putative conserved mechanistic feature of seed region-dependent targeting mechanism. These observations may prove useful as parameters for off-target prediction algorithms and improve siRNA 'specificity' design rules.
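The two sequence features discussed in the abstract, GC content and melting temperature of a subsection of the guide strand, can be computed along the following lines. This is a hedged sketch: the abstract does not say how Tm was calculated or which positions define the subsection, so the Wallace rule (a crude approximation intended for short DNA oligonucleotides) stands in for the authors' method, and the guide sequence and the positions 9 to 14 are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C).

    U is treated as T; this is only a coarse stand-in for a
    nearest-neighbour calculation.
    """
    seq = seq.upper().replace("U", "T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def subsection(guide, start, end):
    """Return positions start..end (1-based, inclusive, from the 5' end)."""
    if not 1 <= start <= end <= len(guide):
        raise ValueError("positions out of range")
    return guide[start - 1:end]


# Hypothetical 21-nt guide strand and a hypothetical non-seed window.
guide = "AUGGCUACGUAGCUAGCUAUU"
window = subsection(guide, 9, 14)
window_gc = gc_content(window)
window_tm = wallace_tm(window)
```

A real analysis would more likely use a nearest-neighbour thermodynamic model for RNA duplexes; the point here is only to show what the two quantities measure.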
Muñoz-Alía, Miguel Ángel; Fernández-Muñoz, Rafael; Casasnovas, José María; Porras-Mansilla, Rebeca; Serrano-Pardo, Ángela; Pagán, Israel; Ordobás, María; Ramírez, Rosa; Celma, María Luisa
2015-01-22
Measles virus circulates endemically in large urban populations in Africa and Asia, causing outbreaks worldwide in populations with up to 95% immune protection. We studied the natural genetic variability of genotype B3.1 in a population with 95% vaccine coverage throughout an imported six-month measles outbreak. From first-pass viral isolates of 47 patients we performed direct sequencing of genomic cDNA. Whilst no variation from the index case sequence occurred in the hyper-variable carboxy end of the Nucleocapsid gene, in the Hemagglutinin gene, the main target for neutralizing antibodies, we observed gradual nucleotide divergence from the index case along the outbreak (0% to 0.380%, average 0.138%) with the emergence of transient and persistent non-synonymous and synonymous mutations. Little or no variation was observed between the index and last outbreak cases in the Phosphoprotein, Nucleocapsid, Matrix and Fusion genes. Most of the H non-synonymous mutations were mapped on the protein surface near antigenic and receptor-binding sites. We estimated a MV-Hemagglutinin nucleotide substitution rate of 7.28 × 10⁻⁶ substitutions/site/day by a Bayesian phylogenetic analysis. The dN/dS analysis did not suggest significant immune or other selective pressures on the H gene during the outbreak. These results emphasize the usefulness of MV-H sequence analysis in measles epidemiological surveillance and elimination programs, and in detection of the potential emergence of measles virus neutralization-resistant mutants. Copyright © 2014 Elsevier B.V. All rights reserved.
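The reported substitution rate is easier to compare with other estimates once it is rescaled. The arithmetic below is a simple sketch that assumes 365 days per year, takes the six-month outbreak as roughly 182 days, and treats divergence as accumulating linearly (ignoring repeated substitutions at the same site, which is reasonable at such low divergence). Only the rate itself comes from the abstract.

```python
RATE_PER_SITE_PER_DAY = 7.28e-6  # from the abstract


def rate_per_year(rate_per_day, days_per_year=365.0):
    """Rescale a per-day substitution rate to substitutions/site/year."""
    return rate_per_day * days_per_year


def expected_divergence_percent(rate_per_day, days):
    """Expected per-site divergence (in percent) after a number of days."""
    return rate_per_day * days * 100.0


yearly_rate = rate_per_year(RATE_PER_SITE_PER_DAY)

# Assumption: a six-month outbreak lasts about 182 days.
six_month_divergence = expected_divergence_percent(RATE_PER_SITE_PER_DAY, 182)
```

Under these assumptions the rate corresponds to about 2.7 × 10⁻³ substitutions/site/year, and about 0.13% divergence would be expected over 182 days, which is of the same order as the divergence range quoted in the abstract.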
Huys, Geert; Vanhoutte, Tom; Vandamme, Peter
2008-01-01
Sequence-dependent electrophoresis (SDE) fingerprinting techniques such as denaturing gradient gel electrophoresis (DGGE) have become commonplace in the field of molecular microbial ecology. The success of the SDE technology lies in the fact that it allows visualization of the predominant members of complex microbial ecosystems independent of their culturability and without prior knowledge of the complexity and diversity of the ecosystem. Mainly using the prokaryotic 16S rRNA gene as PCR amplification target, SDE-based community fingerprinting has become one of the leading molecular tools to unravel the diversity and population dynamics of human intestinal microbiota. The first part of this review covers the methodological concept of SDE fingerprinting and the technical hurdles for analyzing intestinal samples. Subsequently, the current state of the art of DGGE and related techniques to analyze human intestinal microbiota from healthy individuals and from patients with intestinal disorders is surveyed. In addition, the applicability of SDE analysis to monitor intestinal population changes upon nutritional or therapeutic interventions is critically evaluated. PMID:19277102
2011-01-01
Background: High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results: We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions: This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434
A programmable method for massively parallel targeted sequencing
Hopmans, Erik S.; Natsoulis, Georges; Bell, John M.; Grimes, Susan M.; Sieh, Weiva; Ji, Hanlee P.
2014-01-01
We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy. PMID:24782526
Therese, K Lily; Gayathri, R; Balasubramanian, S; Natrajan, S; Madhavan, H N
2012-01-01
Multidrug-resistant TB (MDR-TB) has been reported in almost all parts of the world. Childhood TB is accorded low priority by national TB control programs. Probable reasons include diagnostic difficulties, limited resources, misplaced faith in BCG and lack of data on treatment. Good data on the burden of all forms of TB among children in India are not available. To study the drug sensitivity pattern of tuberculosis in children aged from 3 months to 18 years and the outcome of drug-resistant tuberculosis by BACTEC culture system and PCR-based DNA sequencing technique. This is a retrospective study. One hundred and fifty-nine clinical specimens were processed for Ziehl-Neelsen stain, Mycobacterial culture by BACTEC method, phenotypic DST for first-line drugs for Mycobacterium tuberculosis (M. tuberculosis) isolates and PCR-based DNA sequencing was performed for the M. tuberculosis isolates targeting rpoB, katG, inhA, oxyR-ahpC, rpsL, rrs and pncA. Out of the 159 Mycobacterial cultures performed during the study period, 17 clinical specimens (10.7%) were culture positive for M. tuberculosis. Among the 17 M. tuberculosis isolates, 2 were multidrug-resistant TB. PCR-based DNA sequencing revealed the presence of many novel mutations targeting katG, inhA, oxyR-ahpC and pncA and the most commonly reported mutation Ser531Leu in the rpoB gene. This study underlines the urgent need to take efforts to develop methods for rapid detection and drug susceptibility of tubercle bacilli in the pediatric population.
Lynch, David S; Koutsis, Georgios; Tucci, Arianna; Panas, Marios; Baklou, Markella; Breza, Marianthi; Karadima, Georgia; Houlden, Henry
2016-01-01
Hereditary Spastic Paraplegia (HSP) is a syndrome characterised by lower limb spasticity, occurring alone or in association with other neurological manifestations, such as cognitive impairment, seizures, ataxia or neuropathy. HSP occurs worldwide, with different populations having different frequencies of causative genes. The Greek population has not yet been characterised. The purpose of this study was to describe the clinical presentation and molecular epidemiology of the largest cohort of HSP in Greece, comprising 54 patients from 40 families. We used a targeted next-generation sequencing (NGS) approach to genetically assess a proband from each family. We made a genetic diagnosis in >50% of cases and identified 11 novel variants. Variants in SPAST and KIF5A were the most common causes of autosomal dominant HSP, whereas SPG11 and CYP7B1 were the most common cause of autosomal recessive HSP. We identified a novel variant in SPG11, which led to disease with later onset and may be unique to the Greek population and report the first nonsense mutation in KIF5A. Interestingly, the frequency of HSP mutations in the Greek population, which is relatively isolated, was very similar to other European populations. We confirm that NGS approaches are an efficient diagnostic tool and should be employed early in the assessment of HSP patients. PMID:26374131
Kashiwagi, Tom; Maxwell, Elisabeth A; Marshall, Andrea D; Christensen, Ana B
2015-01-01
Sharks and rays are increasingly being identified as high-risk species for extinction, prompting urgent assessments of their local or regional populations. Advanced genetic analyses can contribute relevant information on effective population size and connectivity among populations although acquiring sufficient regional sample sizes can be challenging. DNA is typically amplified from tissue samples which are collected by hand spears with modified biopsy punch tips. This technique is not always popular due mainly to a perception that invasive sampling might harm the rays, change their behaviour, or have a negative impact on tourism. To explore alternative methods, we evaluated the yields and PCR success of DNA template prepared from the manta ray mucus collected underwater and captured and stored on a Whatman FTA™ Elute card. The pilot study demonstrated that mucus can be effectively collected underwater using toothbrush. DNA stored on cards was found to be reliable for PCR-based population genetics studies. We successfully amplified mtDNA ND5, nuclear DNA RAG1, and microsatellite loci for all samples and confirmed sequences and genotypes being those of target species. As the yields of DNA with the tested method were low, further improvements are desirable for assays that may require larger amounts of DNA, such as population genomic studies using emerging next-gen sequencing. PMID:26413431
Faucon, Frederic; Gaude, Thierry; Dusfour, Isabelle; Navratil, Vincent; Corbel, Vincent; Juntarajumnong, Waraporn; Girod, Romain; Poupardin, Rodolphe; Boyer, Frederic; Reynaud, Stephane; David, Jean-Philippe
2017-04-01
The capacity of Aedes mosquitoes to resist chemical insecticides threatens the control of major arbovirus diseases worldwide. Until alternative control tools are widely deployed, monitoring insecticide resistance levels and identifying resistance mechanisms in field mosquito populations is crucial for implementing appropriate management strategies. Metabolic resistance to pyrethroids is common in Aedes aegypti but the monitoring of the dynamics of resistant alleles is impeded by the lack of robust genomic markers. In an attempt to identify the genomic bases of metabolic resistance to deltamethrin, multiple resistant and susceptible populations originating from various continents were compared using both RNA-seq and a targeted DNA-seq approach focused on the upstream regions of detoxification genes. Multiple detoxification enzymes were over-transcribed in resistant populations, frequently in association with an increase in their gene copy number. Targeted sequencing identified potential promoter variations associated with their over-transcription. Non-synonymous variations affecting detoxification enzymes were also identified in resistant populations. This study not only confirmed the role of gene copy number variations as a frequent cause of the over-expression of detoxification enzymes associated with insecticide resistance in Aedes aegypti but also identified novel genomic resistance markers potentially associated with their cis-regulation and modifications of their protein structure conformation. As for gene transcription data, polymorphism patterns were frequently conserved within regions but differed among continents, confirming the selection of different resistance factors worldwide. Overall, this study paves the way for the identification of a comprehensive set of genomic markers for monitoring the spatio-temporal dynamics of the variety of insecticide resistance mechanisms in Aedes aegypti.
Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H
2016-03-01
To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
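The classification step described above (k-nearest neighbors applied to principal component coordinates, followed by comparison with self-report) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `knn_predict` and `flag_mismatch`, the two-dimensional coordinates, the labels "A" and "B", and the choice of k = 3 are all hypothetical, and the principal components are assumed to have been computed beforehand.

```python
import math
from collections import Counter


def knn_predict(reference, query, k=3):
    """Return the majority label among the k reference points nearest to query.

    reference: list of (pc_coordinates, ancestry_label) pairs (hypothetical data).
    query: principal component coordinates of the sample to classify.
    """
    ranked = sorted(reference, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def flag_mismatch(predicted, self_report):
    """A disagreement with self-report marks the sample for identity review."""
    return predicted != self_report


# Hypothetical reference panel: two well-separated clusters in PC space.
reference = [
    ((0.0, 0.1), "A"), ((0.1, 0.0), "A"), ((0.2, 0.1), "A"),
    ((5.0, 5.1), "B"), ((5.2, 4.9), "B"), ((4.9, 5.0), "B"),
]

predicted = knn_predict(reference, (0.1, 0.1))
needs_review = flag_mismatch(predicted, self_report="B")
```

In this toy setup a sample that clusters with group "A" but is recorded as "B" would be flagged, which is the kind of discrepancy the abstract proposes as a signal of a possible sample swap.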
Ferreira, Keila Adriana Magalhães; Fajardo, Emanuella Francisco; Baptista, Rodrigo P; Macedo, Andrea Mara; Lages-Silva, Eliane; Ramírez, Luis Eduardo; Pedrosa, André Luiz
2014-06-01
Trypanosoma cruzi and Trypanosoma rangeli are kinetoplastid parasites which are able to infect humans in Central and South America. Misdiagnosis between these trypanosomes can be avoided by targeting barcoding sequences or genes of each organism. This work aims to analyze the feasibility of using species-specific markers for identification of intraspecific polymorphisms and as target for diagnostic methods by PCR. Accordingly, primers which are able to specifically detect T. cruzi or T. rangeli genomic DNA were characterized. The use of intergenic regions, generally divergent in the trypanosomatids, and the serine carboxypeptidase gene were successful. Using T. rangeli genomic sequences for the identification of group-specific polymorphisms and a polymorphic AT(n) dinucleotide repeat permitted the classification of the strains into two groups, which are entirely coincident with T. rangeli main lineages, KP1 (+) and KP1 (-), previously determined by kinetoplast DNA (kDNA) characterization. The sequences analyzed totalize 622 bp (382 bp represent a hypothetical protein sequence, and 240 bp represent an anonymous sequence), and of these, 581 (93.3%) are conserved sites and 41 bp (6.7%) are polymorphic, with 9 transitions (21.9%), 2 transversions (4.9%), and 30 (73.2%) insertion/deletion events. Taken together, the species-specific markers analyzed may be useful for the development of new strategies for the accurate diagnosis of infections. Furthermore, the identification of T. rangeli polymorphisms has a direct impact in the understanding of the population structure of this parasite.
2017-01-01
Abstract Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355
Methylation-sensitive enrichment of minor DNA alleles using a double-strand DNA-specific nuclease.
Liu, Yibin; Song, Chen; Ladas, Ioannis; Fitarelli-Kiehl, Mariana; Makrigiorgos, G Mike
2017-04-07
Aberrant methylation changes, often present in a minor allelic fraction in clinical samples such as plasma-circulating DNA (cfDNA), are potentially powerful prognostic and predictive biomarkers in human disease including cancer. We report on a novel, highly-multiplexed approach to facilitate analysis of clinically useful methylation changes in minor DNA populations. Methylation Specific Nuclease-assisted Minor-allele Enrichment (MS-NaME) employs a double-strand-specific DNA nuclease (DSN) to remove excess DNA with normal methylation patterns. The technique utilizes oligonucleotide probes that direct DSN activity to multiple targets in bisulfite-treated DNA simultaneously. Oligonucleotide probes targeting unmethylated sequences generate local double-stranded regions, resulting in digestion of unmethylated targets and leaving methylated targets intact; and vice versa. Subsequent amplification of the targeted regions results in enrichment of the targeted methylated or unmethylated minority-epigenetic-alleles. We validate MS-NaME by demonstrating enrichment of RARb2, ATM, MGMT and GSTP1 promoters in multiplexed MS-NaME reactions (177-plex) using dilutions of methylated/unmethylated DNA and in DNA from clinical lung cancer samples and matched normal tissue. MS-NaME is a highly scalable single-step approach performed at the genomic DNA level in solution that combines with most downstream detection technologies including Sanger sequencing, methylation-sensitive-high-resolution melting (MS-HRM) and methylation-specific-Taqman-based-digital-PCR (digital Methylight) to boost detection of low-level aberrant methylation-changes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Draft versus finished sequence data for DNA and protein diagnostic signature development
Gardner, Shea N.; Lam, Marisa W.; Smith, Jason R.; Torres, Clinton L.; Slezak, Tom R.
2005-01-01
Sequencing pathogen genomes is costly, demanding careful allocation of limited sequencing resources. We built a computational Sequencing Analysis Pipeline (SAP) to guide decisions regarding the amount of genomic sequencing necessary to develop high-quality diagnostic DNA and protein signatures. SAP uses simulations to estimate the number of target genomes and close phylogenetic relatives (near neighbors or NNs) to sequence. We use SAP to assess whether draft data are sufficient or finished sequencing is required using Marburg and variola virus sequences. Simulations indicate that intermediate to high-quality draft with error rates of 10^-3 to 10^-5 (∼8× coverage) of target organisms is suitable for DNA signature prediction. Low-quality draft with error rates of ∼1% (3× to 6× coverage) of target isolates is inadequate for DNA signature prediction, although low-quality draft of NNs is sufficient, as long as the target genomes are of high quality. For protein signature prediction, sequencing errors in target genomes substantially reduce the detection of amino acid sequence conservation, even if the draft is of high quality. In summary, high-quality draft of target and low-quality draft of NNs appears to be a cost-effective investment for DNA signature prediction, but may lead to underestimation of predicted protein signatures. PMID:16243783
Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua
2017-02-01
In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 miRNAs with Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. A total of 55 miRNA sequences were verified with a poly (A) tailing test, a success rate of 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression level of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 groups of miRNAs and targets were found to be consistent with miRNAs negatively regulating the expression of their target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Zhang, Wensheng; Edwards, Andrea; Zhu, Dongxiao; Flemington, Erik K.; Deininger, Prescott; Zhang, Kun
2012-01-01
In metazoans, miRNAs regulate gene expression primarily through binding to target sites in the 3′ UTRs (untranslated regions) of messenger RNAs (mRNAs). Cis-acting variants within, or close to, a gene are crucial in explaining the variability of gene expression measures. Single nucleotide polymorphisms (SNPs) in the 3′ UTRs of genes can affect the base-pairing between miRNAs and mRNAs, and hence disrupt existing target sites (in the reference sequence) or create novel target sites, suggesting a possible mechanism for cis regulation of gene expression. Moreover, because the alleles of different SNPs within a DNA sequence of limited length tend to be in strong linkage disequilibrium (LD), we hypothesize the variants of miRNA target sites caused by SNPs potentially function as bridges linking the documented cis-SNP markers to the expression of the associated genes. A large-scale analysis was herein performed to test this hypothesis. By systematically integrating multiple latest information sources, we found 21 significant gene-level SNP-involved miRNA-mediated post-transcriptional regulation modules (SNP-MPRMs) in the form of SNP-miRNA-mRNA triplets in lymphocyte cell lines for the CEU and YRI populations. Among the cognate genes, six including ALG8, DGKE, GNA12, KLF11, LRPAP1, and MMAB are related to multiple genetic diseases such as depressive disorder and Type-II diabetes. Furthermore, we found that ∼35% of the documented transcript intensity-related cis-SNPs (∼950) in a recent publication are identical to, or in significant linkage disequilibrium (LD) (p<0.01) with, one or multiple SNPs located in miRNA target sites. Based on these associations (or identities), 69 significant exon-level SNP-MPRMs and 12 disease genes were further determined for two populations. These results provide concrete in silico evidence for the proposed hypothesis. The discovered modules warrant additional follow-up in independent laboratory studies. PMID:22348086
Polyclonality of Concurrent Natural Populations of Alteromonas macleodii
Gonzaga, Aitor; Martin-Cuadrado, Ana-Belen; López-Pérez, Mario; Megumi Mizuno, Carolina; García-Heredia, Inmaculada; Kimes, Nikole E.; Lopez-García, Purificación; Moreira, David; Ussery, David; Zaballos, Mila; Ghai, Rohit; Rodriguez-Valera, Francisco
2012-01-01
We have analyzed a natural population of the marine bacterium, Alteromonas macleodii, from a single sample of seawater to evaluate the genomic diversity present. We performed full genome sequencing of four isolates and 161 metagenomic fosmid clones, all of which were assigned to A. macleodii by sequence similarity. Out of the four strain genomes, A. macleodii deep ecotype (AltDE1) represented a different genome, whereas AltDE2 and AltDE3 were identical to the previously described AltDE. Although the core genome (∼80%) had an average nucleotide identity of 98.51%, both AltDE and AltDE1 contained flexible genomic islands (fGIs), that is, genomic islands present in both genomes in the same genomic context but having different gene content. Some of the fGIs encode cell surface receptors known to be phage recognition targets, such as the O-chain of the lipopolysaccharide, whereas others have genes involved in physiological traits (e.g., nutrient transport, degradation, and metal resistance) denoting microniche specialization. The presence in metagenomic fosmids of genomic fragments differing from the sequenced strain genomes, together with the presence of new fGIs, indicates that there are at least two more A. macleodii clones present. The availability of three or more sequences overlapping the same genomic region also allowed us to estimate the frequency and distribution of recombination events among these different clones, indicating that these clustered near the genomic islands. The results indicate that this natural A. macleodii population has multiple clones with a potential for different phage susceptibility and exploitation of resources, within a seemingly unstructured habitat. PMID:23212172
High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.
Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory
2017-12-01
Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.
Evaluation of the genetic diversity of Plum pox virus in a single plum tree.
Predajňa, Lukáš; Šubr, Zdeno; Candresse, Thierry; Glasa, Miroslav
2012-07-01
Genetic diversity of Plum pox virus (PPV) and its distribution within a single perennial woody host (plum, Prunus domestica) has been evaluated. A plum tree was triply infected by chip-budding with PPV-M, PPV-D and PPV-Rec isolates in 2003 and left to develop untreated under open field conditions. In September 2010 leaf and fruit samples were collected from different parts of the tree canopy. A 745-bp NIb-CP fragment of PPV genome, containing the hypervariable region encoding the CP N-terminal end was amplified by RT-PCR from each sample and directly sequenced to determine the dominant sequence. In parallel, the PCR products were cloned and a total of 105 individual clones were sequenced. Sequence analysis revealed that after 7 years of infection, only PPV-M was still detectable in the tree and that the two other isolates (PPV-Rec and PPV-D) had been displaced. Despite the fact that the analysis targeted a relatively short portion of the genome, a substantial amount of intra-isolate variability was observed for PPV-M. A total of 51 different haplotypes could be identified from the 105 individual sequences, two of which were largely dominant. However, no clear-cut structuration of the viral population by the tree architecture could be highlighted although the results obtained suggest the possibility of intra-leaf/fruit differentiation of the viral population. Comparison of the consensus sequence with the original source isolate showed no difference, suggesting within-plant stability of this original isolate under open field conditions. Copyright © 2012 Elsevier B.V. All rights reserved.
Landscape of Insertion Polymorphisms in the Human Genome
Onozawa, Masahiro; Goldberg, Liat; Aplan, Peter D.
2015-01-01
Nucleotide substitutions, small (<50 bp) insertions or deletions (indels), and large (>50 bp) deletions are well-known causes of genetic variation within the human genome. We recently reported a previously unrecognized form of polymorphic insertions, termed templated sequence insertion polymorphism (TSIP), in which the inserted sequence was templated from a distant genomic region, and was inserted in the genome through reverse transcription of an RNA intermediate. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; class 1 TSIPs show target site duplication, polyadenylation, and preference for insertion at a 5′-TTTT/A-3′ sequence, suggesting a LINE-1 based insertion mechanism, whereas class 2 TSIPs show features consistent with repair of a DNA double strand break by nonhomologous end joining. To gain a more complete picture of TSIPs throughout the human population, we evaluated whole-genome sequence from 52 individuals, and identified 171 TSIPs. Most individuals had 25–30 TSIPs, and common (present in >20% of individuals) TSIPs were found in individuals throughout the world, whereas rare TSIPs tended to cluster in specific geographic regions. The number of rare TSIPs was greater than the number of common TSIPs, suggesting that TSIP generation is an ongoing process. Intriguingly, mitochondrial sequences were a frequent template for class 2 insertions, used more commonly than any nuclear chromosome. Similar to single nucleotide polymorphisms and indels, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases, and can be useful in tracking historical migration of populations. PMID:25745018
The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery
2010-01-01
Background: Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results: To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methylerythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions: Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict plastid. Further, although P. marinus sequences display significant similarity to those from both apicomplexans and dinoflagellates, the presence of trans-spliced transcripts confirms the previously established affinities with the latter. The EST analysis reported herein, together with the recently completed sequence of the P. marinus genome and the development of transfection methodology, should result in improved intervention strategies against dermo disease. PMID:20374649
The genomic landscape of rapid repeated evolutionary ...
Atlantic killifish populations have rapidly adapted to normally lethal levels of pollution in four urban estuaries. Through analysis of 384 whole killifish genome sequences and comparative transcriptomics in four pairs of sensitive and tolerant populations, we identify the aryl hydrocarbon receptor–based signaling pathway as a shared target of selection. This suggests evolutionary constraint on adaptive solutions to complex toxicant mixtures at each site. However, distinct molecular variants apparently contribute to adaptive pathway modification among tolerant populations. Selection also targets other toxicity-mediating genes and genes of connected signaling pathways; this indicates complex tolerance phenotypes and potentially compensatory adaptations. Molecular changes are consistent with selection on standing genetic variation. In killifish, high nucleotide diversity has likely been a crucial substrate for selective sweeps to propel rapid adaptation. This manuscript describes genomic evaluations that contribute to our understanding of the ecological and evolutionary risks associated with chronic contaminant exposures to wildlife populations. Here, we assessed genetic patterns associated with long-term response to an important class of highly toxic environmental pollutants. Specifically, chemical-specific tolerance has rapidly and repeatedly evolved in an estuarine fish species resident to estuaries of the Atlantic U.S. coast. We used laboratory studies to ch
Li, Jian; Li, Mei; Gao, Xingxiang; Fang, Feng
2017-12-01
Crabgrass (Digitaria sanguinalis) is an annual monocotyledonous weed. In recent years, field applications of nicosulfuron have been ineffective in controlling crabgrass populations in Shandong Province, China. To investigate the mechanisms of resistance to nicosulfuron in crabgrass populations, the acetolactate synthase (ALS) gene fragment covering known resistance-conferring mutation sites was amplified and sequenced. Dose-response experiments suggested that the resistant population SD13 (R) was highly resistant to nicosulfuron (resistance index R/S = 43.7) compared with the sensitive population SD22 (S). ALS gene sequencing revealed a Trp574Arg substitution in the SD13 population, and no other known resistance-conferring mutations were found. In vitro ALS enzyme assays further confirmed that the SD13 population was resistant to all tested ALS-inhibiting herbicides. The resistance pattern experiments revealed that, compared with SD22, the SD13 population exhibited broad-spectrum resistance to nicosulfuron (43.7-fold), imazethapyr (11.4-fold) and flumetsulam (16.1-fold); however, it did not develop resistance to atrazine, mesotrione and topramezone. This study demonstrated that Trp574Arg substitution was the main reason for crabgrass resistance to ALS-inhibiting herbicides. To our knowledge, this is the first report of Trp574Arg substitution in a weed species, and is the first report of target-site mechanisms of herbicide resistance for crabgrass. © 2017 Society of Chemical Industry.
A dominant variant in the PDE1C gene is associated with nonsyndromic hearing loss.
Wang, Li; Feng, Yong; Yan, Denise; Qin, Litao; Grati, M'hamed; Mittal, Rahul; Li, Tao; Sundhari, Abhiraami Kannan; Liu, Yalan; Chapagain, Prem; Blanton, Susan H; Liao, Shixiu; Liu, Xuezhong
2018-06-02
Identification of genes with variants causing non-syndromic hearing loss (NSHL) is challenging due to genetic heterogeneity. The difficulty is compounded by technical limitations that in the past prevented comprehensive gene identification. Recent advances in technology, using targeted capture and next-generation sequencing (NGS), are changing the face of gene identification and making it possible to rapidly and cost-effectively sequence the whole human exome. Here, we characterize a five-generation Chinese family with progressive, postlingual autosomal dominant nonsyndromic hearing loss (ADNSHL). By combining population-specific mutation arrays, a targeted deafness gene panel, and whole exome sequencing (WES), we identified PDE1C (Phosphodiesterase 1C) c.958G>T (p.A320S) as the disease-associated variant. Structural modeling of p.A320S strongly suggests that the sequence alteration is likely to affect the substrate-binding pocket of PDE1C. By whole-mount immunofluorescence on postnatal day 3 mouse cochlea, we show its expression in the cytosol of outer (OHC) and inner (IHC) hair cells, co-localizing with Lamp-1 in lysosomes. Furthermore, we provide evidence that the variant alters the PDE1C hydrolytic activity for both cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP). Collectively, our findings indicate that the c.958G>T variant in PDE1C may disrupt the cross talk between cGMP-signaling and cAMP pathways in Ca2+ homeostasis.
Exploring Nitrilase Sequence Space for Enantioselective Catalysis
Robertson, Dan E.; Chaplin, Jennifer A.; DeSantis, Grace; Podar, Mircea; Madden, Mark; Chi, Ellen; Richardson, Toby; Milan, Aileen; Miller, Mark; Weiner, David P.; Wong, Kelvin; McQuaid, Jeff; Farwell, Bob; Preston, Lori A.; Tan, Xuqiu; Snead, Marjory A.; Keller, Martin; Mathur, Eric; Kretz, Patricia L.; Burk, Mark J.; Short, Jay M.
2004-01-01
Nitrilases are important in the biosphere as participants in synthesis and degradation pathways for naturally occurring, as well as xenobiotically derived, nitriles. Because of their inherent enantioselectivity, nitrilases are also attractive as mild, selective catalysts for setting chiral centers in fine chemical synthesis. Unfortunately, <20 nitrilases have been reported in the scientific and patent literature, and because of stability or specificity shortcomings, their utility has been largely unrealized. In this study, 137 unique nitrilases, discovered from screening of >600 biotope-specific environmental DNA (eDNA) libraries, were characterized. Using culture-independent means, phylogenetically diverse genomes were captured from entire biotopes, and their genes were expressed heterologously in a common cloning host. Nitrilase genes were targeted in a selection-based expression assay of clonal populations numbering 10^6 to 10^10 members per eDNA library. A phylogenetic analysis of the novel sequences discovered revealed the presence of at least five major sequence clades within the nitrilase subfamily. Using three nitrile substrates targeted for their potential in chiral pharmaceutical synthesis, the enzymes were characterized for substrate specificity and stereospecificity. A number of important correlations were found between sequence clades and the selective properties of these nitrilases. These enzymes, discovered using a high-throughput, culture-independent method, provide a catalytic toolbox for enantiospecific synthesis of a variety of carboxylic acid derivatives, as well as an intriguing library for evolutionary and structural analyses. PMID:15066841
Quantifying Selection with Pool-Seq Time Series Data.
Taus, Thomas; Futschik, Andreas; Schlötterer, Christian
2017-11-01
Allele frequency time series data constitute a powerful resource for unraveling mechanisms of adaptation, because the temporal dimension captures important information about evolutionary forces. In particular, Evolve and Resequence (E&R), the whole-genome sequencing of replicated experimentally evolving populations, is becoming increasingly popular. Based on computer simulations several studies proposed experimental parameters to optimize the identification of the selection targets. No such recommendations are available for the underlying parameters selection strength and dominance. Here, we introduce a highly accurate method to estimate selection parameters from replicated time series data, which is fast enough to be applied on a genome scale. Using this new method, we evaluate how experimental parameters can be optimized to obtain the most reliable estimates for selection parameters. We show that the effective population size (Ne) and the number of replicates have the largest impact. Because the number of time points and sequencing coverage had only a minor effect, we suggest that time series analysis is feasible without major increase in sequencing costs. We anticipate that time series analysis will become routine in E&R studies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Lin, Jing; Pramono, Zacharias Aloysius Dwi; Maurer-Stroh, Sebastian
2016-01-01
The multiple circulating human influenza A virus subtypes, coupled with perpetual genomic mutations and segment reassortment events, challenge the development of effective therapeutics. The capacity to drug most RNAs motivates the investigation of viral RNA targets. A total of 123,060 segment sequences from 35,938 strains of the most prevalent subtypes also infecting humans (H1N1, 2009 pandemic H1N1, H3N2, H5N1 and H7N9) were used to identify 1,183 conserved RNA target sequences (≥15-mer) in the internal segments. A theoretical coverage of 100% in simultaneous heterosubtypic targeting is achieved by pairing specific sequences from the same segment (“Duals”) or from two segments (“Doubles”); 1,662 Duals and 28,463 Doubles were identified. By combining specific Duals and/or Doubles to form a target graph, wherein an edge connecting two vertices (target sequences) represents a Dual or Double, it is possible to hedge against antiviral resistance while maintaining 100% heterosubtypic coverage. To evaluate the hedging potential, we define the hedge-factor as the minimum number of resistant target sequences that will render the graph resistant, i.e., eliminate all the edges therein; a target sequence or a graph is considered resistant when it cannot achieve 100% heterosubtypic coverage. In an n-vertex graph (n ≥ 3), the hedge-factor is maximal (= n − 1) when it is a complete graph, i.e., every distinct pair in the graph is either a Dual or a Double. Computational analyses uncover an extensive number of complete graphs of different sizes. Monte Carlo simulations show that the mutation counts and time elapsed for a target graph to become resistant increase with the hedge-factor. Incidentally, target sequences which were reported to reduce virus titre in experiments are included in our target graphs. The identity of target sequence pairs for heterosubtypic targeting and their combinations for hedging antiviral resistance are useful toolkits to construct target graphs for different therapeutic objectives. PMID:26771381
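The hedge-factor defined in the abstract above (the minimum number of resistant target sequences needed to eliminate every edge of a target graph) can be read as the size of a minimum vertex cover. The following is a minimal sketch under that reading, not the authors' implementation: the function names and the vertex labels `t1` to `t4` are hypothetical, and the brute-force search is only practical for small graphs.

```python
from itertools import combinations


def hedge_factor(vertices, edges):
    """Return the smallest number of vertices whose loss removes every edge.

    Assumption: a resistant vertex (target sequence) eliminates all edges
    (Duals/Doubles) incident to it, so the hedge-factor equals the size of
    a minimum vertex cover. Brute force, exponential in the vertex count.
    """
    vertices = list(vertices)
    edge_list = [tuple(edge) for edge in edges]
    for k in range(len(vertices) + 1):
        for resistant in combinations(vertices, k):
            lost = set(resistant)
            # Every edge must touch at least one resistant vertex.
            if all(u in lost or v in lost for u, v in edge_list):
                return k
    return len(vertices)


def complete_graph_edges(vertices):
    """All distinct vertex pairs, i.e. the edge set of a complete graph."""
    return list(combinations(vertices, 2))


# Hypothetical target sequences, used only for illustration.
targets = ["t1", "t2", "t3", "t4"]
print(hedge_factor(targets, complete_graph_edges(targets)))
print(hedge_factor(targets, [("t1", "t2"), ("t2", "t3"), ("t3", "t4")]))
```

For the complete graph on four vertices the sketch yields 3, matching the n − 1 maximum stated in the abstract, whereas a simple path on the same vertices yields 2.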
Hubert, Jan; Erban, Tomas; Kopecky, Jan; Sopko, Bruno; Nesvorna, Marta; Lichovnikova, Martina; Schicht, Sabine; Strube, Christina; Sparagano, Olivier
2017-11-01
Blood-feeding red poultry mites (RPM) serve as vectors of pathogenic bacteria and viruses among vertebrate hosts including wild birds, poultry hens, mammals, and humans. The microbiome of RPM has not yet been studied by high-throughput sequencing. RPM eggs, larvae, and engorged adult/nymph samples obtained in four poultry houses in Czechia were used for microbiome analyses by Illumina amplicon sequencing of the 16S ribosomal RNA (rRNA) gene V4 region. A laboratory RPM population was used as positive control for transcriptome analysis by pyrosequencing with identification of sequences originating from bacteria. The samples of engorged adult/nymph stages had 100-fold more 16S rRNA gene copies than the samples of eggs and larvae. The microbiome composition showed differences among the four poultry houses and among observed developmental stadia. In the adults' microbiome, 10 OTUs comprised 90 to 99% of all sequences. Bartonella-like bacteria accounted for between 30 and 70% of sequences in the RPM microbiome and 25% of bacterial sequences in the transcriptome. The phylogenetic analyses of 16S rRNA gene sequences revealed two distinct groups of Bartonella-like bacteria forming sister groups: (i) symbionts of ants; (ii) Bartonella genus. Cardinium, Wolbachia, and Rickettsiella sp. were found in the microbiomes of all tested stadia, while Spiroplasma eriocheiris and Wolbachia were identified in the laboratory RPM transcriptome. The microbiomes from eggs, larvae, and engorged adults/nymphs differed. Bartonella-like symbionts were found in all stadia and sampling sites. Bartonella-like bacteria were the most diversified group within the RPM microbiome. The presence of identified putative pathogenic bacteria is relevant with respect to human and animal health issues, while the identification of symbiotic bacteria can lead to new control methods targeting them to destabilize the arthropod host.
The study of human Y chromosome variation through ancient DNA.
Kivisild, Toomas
2017-05-01
High-throughput sequencing methods have completely transformed the study of human Y chromosome variation by offering a genome-scale view of genetic variation retrieved from ancient human remains, in the context of a growing number of high-coverage whole Y chromosome sequences from living populations across the world. The ancient Y chromosome sequences are providing us with the first exciting glimpses into the past variation of the male-specific compartment of the genome and the opportunity to evaluate models based on previously made inferences from patterns of genetic variation in living populations. Analyses of the ancient Y chromosome sequences are challenging not only because of issues generally related to ancient DNA work, such as DNA damage-induced mutations and low content of endogenous DNA in most human remains, but also because of specific properties of the Y chromosome, such as its highly repetitive nature and high homology with the X chromosome. Shotgun sequencing of uniquely mapping regions of the Y chromosome to sufficiently high coverage is still challenging and costly in poorly preserved samples. To increase the coverage of specific target SNPs, capture-based methods have been developed and used in recent years to generate Y chromosome sequence data from hundreds of prehistoric skeletal remains. Besides the prospect of directly testing how much genetic change in a given time period has accompanied changes in material culture, the sequencing of ancient Y chromosomes also allows us to better understand the rate at which mutations accumulate and become fixed over time. This review considers genome-scale evidence on ancient Y chromosome diversity that has recently started to accumulate in geographic areas favourable to DNA preservation. More specifically, the review focuses on examples of regional continuity and change of the Y chromosome haplogroups in North Eurasia and in the New World.
Wang, Chunxiao; García-Fernández, David; Mas, Albert; Esteve-Zarzoso, Braulio
2015-01-01
The diversity of fungi in grape must and during wine fermentation was investigated in this study by culture-dependent and culture-independent techniques. Carignan and Grenache grapes were harvested from three vineyards in the Priorat region (Spain) in 2012, and nine samples were selected from the grape must after crushing and during wine fermentation. From culture-dependent techniques, 362 isolates were randomly selected and identified by 5.8S-ITS-RFLP and 26S-D1/D2 sequencing. Meanwhile, genomic DNA was extracted directly from the nine samples and analyzed by qPCR, DGGE and massive sequencing. The results indicated that grape must after crushing harbored a high species richness of fungi with Aspergillus tubingensis, Aureobasidium pullulans, or Starmerella bacillaris as the dominant species. As fermentation proceeded, the species richness decreased, and yeasts such as Hanseniaspora uvarum, Starmerella bacillaris and Saccharomyces cerevisiae successively occupied the must samples. The “terroir” characteristics of the fungus population are more related to the location of the vineyard than to grape variety. Sulfur dioxide treatment caused a low effect on yeast diversity by similarity analysis. Because of the existence of large population of fungi on grape berries, massive sequencing was more appropriate to understand the fungal community in grape must after crushing than the other techniques used in this study. Suitable target sequences and databases were necessary for accurate evaluation of the community and the identification of species by the 454 pyrosequencing of amplicons. PMID:26557110
Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C
2016-07-01
The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Experimental design and quantitative analysis of microbial community multiomics.
Mallick, Himel; Ma, Siyuan; Franzosa, Eric A; Vatanen, Tommi; Morgan, Xochitl C; Huttenhower, Curtis
2017-11-30
Studies of the microbiome have become increasingly sophisticated, and multiple sequence-based, molecular methods as well as culture-based methods exist for population-scale microbiome profiles. To link the resulting host and microbial data types to human health, several experimental design considerations, data analysis challenges, and statistical epidemiological approaches must be addressed. Here, we survey current best practices for experimental design in microbiome molecular epidemiology, including technologies for generating, analyzing, and integrating microbiome multiomics data. We highlight studies that have identified molecular bioactives that influence human health, and we suggest steps for scaling translational microbiome research to high-throughput target discovery across large populations.
Isolation and characterization of microsatellite loci in the intertidal sponge Halichondria panicea
Knowlton, Anne L.; Pierson, Barbara J.; Talbot, S.L.; Highsmith, Ray C.
2003-01-01
GA- and CA-enriched genomic libraries were constructed for the intertidal sponge Halichondria panicea. Unique repeat motifs identified varied from the expected simple dinucleotide repeats to more complex repeat units. All sequences tended to be highly repetitive but did not necessarily contain the targeted motifs. Seven microsatellite loci were evaluated on sponges from the clone source population. All seven were polymorphic with 5.43 ± 0.92 mean number of alleles. Six of the seven loci that could be resolved had mean heterozygosities of 0.14–0.68. The loci identified here will be useful for population studies.
Fresco, Jacques R.; Johnson, Marion D.
2002-01-01
Disclosed are methods for detecting in situ the presence of a target sequence in a substantially double-stranded nucleic acid segment, which comprises: a) contacting in situ under conditions suitable for hybridization a substantially double-stranded nucleic acid segment with a detectable third strand, said third strand being capable of hybridizing to at least a portion of the target sequence to form a triple-stranded structure, if said target sequence is present; and b) detecting whether hybridization between the third strand and the target sequence has occurred.
Barnes, Kayla G; Weedall, Gareth D; Ndula, Miranda; Irving, Helen; Mzihalowa, Themba; Hemingway, Janet; Wondji, Charles S
2017-02-01
Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.
Glenn, Travis C; Lance, Stacey L; McKee, Anna M; Webster, Bonnie L; Emery, Aidan M; Zerlotini, Adhemar; Oliveira, Guilherme; Rollinson, David; Faircloth, Brant C
2013-10-17
Urogenital schistosomiasis caused by Schistosoma haematobium is widely distributed across Africa and is increasingly being targeted for control. Genome sequences and population genetic parameters can give insight into the potential for population- or species-level drug resistance. Microsatellite DNA loci are genetic markers in wide use by Schistosoma researchers, but there are few primers available for S. haematobium. We sequenced 1,058,114 random DNA fragments from clonal cercariae collected from a snail infected with a single Schistosoma haematobium miracidium. We assembled and aligned the S. haematobium sequences to the genomes of S. mansoni and S. japonicum, identifying microsatellite DNA loci across all three species and designing primers to amplify the loci in S. haematobium. To validate our primers, we screened 32 randomly selected primer pairs with population samples of S. haematobium. We designed >13,790 primer pairs to amplify unique microsatellite loci in S. haematobium (available at http://www.cebio.org/projetos/schistosoma-haematobium-genome). The three Schistosoma genomes contained similar overall frequencies of microsatellites, but the frequency and length distributions of specific motifs differed among species. We identified 15 primer pairs that amplified consistently and were easily scored. We genotyped these 15 loci in S. haematobium individuals from six locations: Zanzibar had the highest levels of diversity; Malawi, Mauritius, Nigeria, and Senegal were nearly as diverse; but the sample from South Africa was much less diverse. About half of the primers in the database of Schistosoma haematobium microsatellite DNA loci should yield amplifiable and easily scored polymorphic markers, thus providing thousands of potential markers. Sequence conservation among S. haematobium, S. japonicum, and S. mansoni is relatively high, thus it should now be possible to identify markers that are universal among Schistosoma species (i.e., using DNA sequences conserved among species), as well as other markers that are specific to species or species-groups (i.e., using DNA sequences that differ among species). Full genome-sequencing of additional species and specimens of S. haematobium, S. japonicum, and S. mansoni is desirable to better characterize differences within and among these species, to develop additional genetic markers, and to examine genes as well as conserved non-coding elements associated with drug resistance.
Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.
Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P
2016-01-01
Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative to traditional Southern blot analysis for molecular characterization. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions.
Identification of MicroRNA Targets of Capsicum spp. Using MiRTrans—a Trans-Omics Approach
Zhang, Lu; Qin, Cheng; Mei, Junpu; Chen, Xiaocui; Wu, Zhiming; Luo, Xirong; Cheng, Jiaowen; Tang, Xiangqun; Hu, Kailin; Li, Shuai C.
2017-01-01
MicroRNAs (miRNAs) can regulate the transcripts that are involved in eukaryotic cell proliferation, differentiation, and metabolism. Especially for plants, our understanding of miRNA targets is still limited. Early attempts at prediction based on sequence alignments have been plagued by enormous numbers of false positives. It is helpful to improve target prediction specificity by incorporating other data sources, such as the dependency between miRNA and transcript expression or even transcripts cleaved by miRNA regulation, which are referred to as trans-omics data. In this paper, we developed MiRTrans (Prediction of MiRNA targets by Trans-omics data) to explore miRNA targets by incorporating miRNA sequencing, transcriptome sequencing, and degradome sequencing. MiRTrans consisted of three major steps. First, the target transcripts of miRNAs were predicted by scrutinizing their sequence characteristics and collected as an initial pool of potential targets. Second, false positive targets were eliminated if the expression of a miRNA and its targets was weakly correlated by lasso regression. Third, degradome sequencing was utilized to capture the miRNA targets by examining the cleaved transcripts that are regulated by miRNAs. Finally, the predicted targets from the second and third steps were combined by Fisher's combination test. MiRTrans was applied to identify the miRNA targets for Capsicum spp. (i.e., pepper). As evaluated by functional enrichment, it can generate more functional miRNA targets than sequence-based predictions. MiRTrans identified 58 miRNA-transcript pairs with high confidence from 18 miRNA families conserved in eudicots. Most of these targets were transcription factors; this lent support to the role of miRNA as a key regulator in pepper. To the best of our knowledge, this work is the first attempt to investigate the miRNA targets of pepper, as well as their regulatory networks. Surprisingly, only a small proportion of miRNA-transcript pairs were shared between degradome sequencing and expression dependency predictions, suggesting that miRNA targets predicted by a single technology alone may be prone to report false negatives. PMID:28443105
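The abstract above states that the predictions from the expression-dependency and degradome steps are combined by Fisher's combination test but does not say how each step's p-value is derived. The following is a minimal sketch of Fisher's method in general, not MiRTrans code; the function name and the example p-values are hypothetical. It uses the fact that the statistic X = -2 Σ ln p follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis of k independent tests, and that the survival function has a closed form when the degrees of freedom are even.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    With k p-values, X = -2 * sum(ln p) is chi-squared with 2k degrees of
    freedom under the null. For even degrees of freedom the upper tail is
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no statistics library is
    needed.
    """
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values for one miRNA-transcript pair: one from the
# expression-dependency step and one from the degradome step.
print(fisher_combined_pvalue([0.05, 0.05]))
```

Two individually borderline p-values of 0.05 combine to roughly 0.017, which illustrates why pairs supported by both data types can be ranked with higher confidence than pairs supported by only one.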
Nucleic Acid Detection Methods
Smith, Cassandra L.; Yaar, Ron; Szafranski, Przemyslaw; Cantor, Charles R.
1998-05-19
The invention relates to methods for rapidly determining the sequence and/or length of a target sequence. The target sequence may be a series of known or unknown repeat sequences which are hybridized to an array of probes. The hybridized array is digested with a single-strand nuclease and free 3'-hydroxyl groups extended with a nucleic acid polymerase. Nuclease cleaved heteroduplexes can be easily distinguished from nuclease uncleaved heteroduplexes by differential labeling. Probes and target can be differentially labeled with detectable labels. Matched target can be detected by cleaving resulting loops from the hybridized target and creating free 3'-hydroxyl groups. These groups are recognized and extended by polymerases added into the reaction system which also adds or releases one label into solution. The resulting products can be analyzed in either solid phase or solution. These methods can be used to detect characteristic nucleic acid sequences, to determine target sequence and to screen for genetic defects and disorders. Assays can be conducted on solid surfaces allowing for multiple reactions to be conducted in parallel and, if desired, automated.
Solid phase sequencing of biopolymers
Cantor, Charles; Koster, Hubert
2010-09-28
This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.
DeLeon, Orlando; Hodis, Hagit; O’Malley, Yunxia; Johnson, Jacklyn; Salimi, Hamid; Zhai, Yinjie; Winter, Elizabeth; Remec, Claire; Eichelberger, Noah; Van Cleave, Brandon; Puliadi, Ramya; Harrington, Robert D.; Stapleton, Jack T.; Haim, Hillel
2017-01-01
The envelope glycoproteins (Envs) of HIV-1 continuously evolve in the host by random mutations and recombination events. The resulting diversity of Env variants circulating in the population and their continuing diversification process limit the efficacy of AIDS vaccines. We examined the historic changes in Env sequence and structural features (measured by integrity of epitopes on the Env trimer) in a geographically defined population in the United States. As expected, many Env features were relatively conserved during the 1980s. From this state, some features diversified whereas others remained conserved across the years. We sought to identify “clues” to predict the observed historic diversification patterns. Comparison of viruses that cocirculate in patients at any given time revealed that each feature of Env (sequence or structural) exists at a defined level of variance. The in-host variance of each feature is highly conserved among individuals but can vary between different HIV-1 clades. We designate this property “volatility” and apply it to model evolution of features as a linear diffusion process that progresses with increasing genetic distance. Volatilities of different features are highly correlated with their divergence in longitudinally monitored patients. Volatilities of features also correlate highly with their population-level diversification. Using volatility indices measured from a small number of patient samples, we accurately predict the population diversity that developed for each feature over the course of 30 years. Amino acid variants that evolved at key antigenic sites are also predicted well. Therefore, small “fluctuations” in feature values measured in isolated patient samples accurately describe their potential for population-level diversification. 
These tools will likely contribute to the design of population-targeted AIDS vaccines by effectively capturing the diversity of currently circulating strains and addressing properties of variants expected to appear in the future. PMID:28384158
Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers
Pabinger, Stephan; Ernst, Karina; Pulverer, Walter; Kallmeyer, Rainer; Valdes, Ana M.; Metrustry, Sarah; Katic, Denis; Nuzzo, Angelo; Kriegner, Albert; Vierlinger, Klemens; Weinhaeusel, Andreas
2016-01-01
Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM). Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage. 
TABSAT is freely available under a GNU General Public License version 3.0 (GPLv3) at https://github.com/tadkeys/tabsat/ and http://demo.platomics.com/. PMID:27467908
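The abstract above validates sequencing-derived methylation levels against array values using the Pearson correlation coefficient. The following is a minimal sketch of that statistic, not TABSAT code; the function name and the methylation values are hypothetical illustrations, and the study's actual per-target data are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical methylation fractions for the same CpG sites measured by
# amplicon bisulfite sequencing and by a methylation array.
sequencing = [0.10, 0.35, 0.60, 0.80, 0.95]
array = [0.12, 0.30, 0.65, 0.78, 0.90]
print(round(pearson_r(sequencing, array), 3))
```

In a per-sample comparison such as the one the abstract describes, a coefficient like this would presumably be computed for each sample and then averaged.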
Wang, Fangquan; Li, Wenqi; Zhu, Jinyan; Fan, Fangjun; Wang, Jun; Zhong, Weigong; Wang, Ming-Bo; Liu, Qing; Zhu, Qian-Hao; Zhou, Tong; Lan, Ying; Zhou, Yijun; Yang, Jie
2016-05-11
Rice black-streaked dwarf virus (RBSDV) belongs to the genus Fijivirus in the family of Reoviridae and causes severe yield loss in rice-producing areas in Asia. RNA silencing, as a natural defence mechanism against plant viruses, has been successfully exploited for engineering virus resistance in plants, including rice. In this study, we generated transgenic rice lines harbouring a hairpin RNA (hpRNA) construct targeting four RBSDV genes, S1, S2, S6 and S10, encoding the RNA-dependent RNA polymerase, the putative core protein, the RNA silencing suppressor and the outer capsid protein, respectively. Both field nursery and artificial inoculation assays of three generations of the transgenic lines showed that they had strong resistance to RBSDV infection. The RBSDV resistance in the segregating transgenic populations correlated perfectly with the presence of the hpRNA transgene. Furthermore, the hpRNA transgene was expressed in the highly resistant transgenic lines, giving rise to abundant levels of 21-24 nt small interfering RNA (siRNA). By small RNA deep sequencing, the RBSDV-resistant transgenic lines detected siRNAs from all four viral gene sequences in the hpRNA transgene, indicating that the whole chimeric fusion sequence can be efficiently processed by Dicer into siRNAs. Taken together, our results suggest that long hpRNA targeting multiple viral genes can be used to generate stable and durable virus resistance in rice, as well as other plant species.
Intravenous phage display identifies peptide sequences that target the burn-injured intestine.
Costantini, Todd W; Eliceiri, Brian P; Putnam, James G; Bansal, Vishal; Baird, Andrew; Coimbra, Raul
2012-11-01
The injured intestine is responsible for significant morbidity and mortality after severe trauma and burn; however, targeting the intestine with therapeutics aimed at decreasing injury has proven difficult. We hypothesized that we could use intravenous phage display technology to identify peptide sequences that target the injured intestinal mucosa in a murine model, and then confirm the cross-reactivity of this peptide sequence with ex vivo human gut. Four hours following 30% TBSA burn we performed an in vivo, intravenous systemic administration of a phage library containing 10^12 phage in BALB/c mice to biopan for gut-targeting peptides. In vivo assessment of the candidate peptide sequences identified after 4 rounds of internalization was performed by injecting 1×10^12 copies of each selected phage clone into sham or burned animals. Internalization into the gut was assessed using quantitative polymerase chain reaction. We then incubated this gut-targeting peptide sequence with human intestine and visualized fluorescence using confocal microscopy. We identified 3 gut-targeting peptide sequences which caused collapse of the phage library (4-1: SGHQLLLNKMP, 4-5: ILANDLTAPGPR, 4-11: SFKPSGLPAQSL). Sequence 4-5 was internalized into the intestinal mucosa of burned animals 9.3-fold higher than sham animals injected with the same sequence (2.9×10^5 vs. 3.1×10^4 particles per mg tissue). Sequences 4-1 and 4-11 were both internalized into the gut, but did not demonstrate specificity for the injured mucosa. Phage sequence 4-11 demonstrated cross-reactivity with human intestine. In the future, this gut-targeting peptide sequence could serve as a platform for the delivery of biotherapeutics. Copyright © 2012 Elsevier Inc. All rights reserved.
Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai
2017-11-23
The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
Gardiner, Laura-Jayne; Gawroński, Piotr; Olohan, Lisa; Schnurbusch, Thorsten; Hall, Neil; Hall, Anthony
2014-12-01
Mapping-by-sequencing analyses have largely required a complete reference sequence and employed whole genome re-sequencing. In species such as wheat, no finished genome reference sequence is available. Additionally, because of its large genome size (17 Gb), re-sequencing at sufficient depth of coverage is not practical. Here, we extend the utility of mapping by sequencing, developing a bespoke pipeline and algorithm to map an early-flowering locus in einkorn wheat (Triticum monococcum L.) that is closely related to the bread wheat genome A progenitor. We have developed a genomic enrichment approach using the gene-rich regions of hexaploid bread wheat to design a 110-Mbp NimbleGen SeqCap EZ in-solution capture probe set, representing the majority of genes in wheat. Here, we use the capture probe set to enrich and sequence an F2 mapping population of the mutant. The mutant locus was identified in T. monococcum, which lacks a complete genome reference sequence, by mapping the enriched data set onto pseudo-chromosomes derived from the capture probe target sequence, with a long-range order of genes based on synteny of wheat with Brachypodium distachyon. Using this approach we are able to map the region and identify a set of deleted genes within the interval. © 2014 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Glaser-Schmitt, Amanda; Duchen, Pablo; Parsch, John
2016-01-01
Insertions and deletions (indels) are a major source of genetic variation within species and may result in functional changes to coding or regulatory sequences. In this study we report that an indel polymorphism in the 3' untranslated region (UTR) of the metallothionein gene MtnA is associated with gene expression variation in natural populations of Drosophila melanogaster. A derived allele of MtnA with a 49-bp deletion in the 3' UTR segregates at high frequency in populations outside of sub-Saharan Africa. The frequency of the deletion increases with latitude across multiple continents and approaches 100% in northern Europe. Flies with the deletion have more than 4-fold higher MtnA expression than flies with the ancestral sequence. Using reporter gene constructs in transgenic flies, we show that the 3' UTR deletion significantly contributes to the observed expression difference. Population genetic analyses uncovered signatures of a selective sweep in the MtnA region within populations from northern Europe. We also find that the 3' UTR deletion is associated with increased oxidative stress tolerance. These results suggest that the 3' UTR deletion has been a target of selection for its ability to confer increased levels of MtnA expression in northern European populations, likely due to a local adaptive advantage of increased oxidative stress tolerance. PMID:27120580
2014-01-01
The Bactrian camel (Camelus bactrianus) and the dromedary (Camelus dromedarius) are among the last species to have been domesticated, around 3000–6000 years ago. During domestication, strong artificial (anthropogenic) selection has shaped the livestock, creating a large number of phenotypes and breeds. Hence, domestic animals represent a unique resource to understand the genetic basis of phenotypic variation and adaptation. In keeping with its late domestication history, the Bactrian camel is also among the last livestock animals to have its genome sequenced and deciphered. As no genomic data have been available until recently, we generated a de novo assembly by shotgun sequencing of a single male Bactrian camel. We obtained 1.6 Gb of genomic sequence, which corresponds to more than half of the Bactrian camel’s genome. The aim of this study was to identify heterozygous single-nucleotide polymorphisms (SNPs) and to estimate population parameters and nucleotide diversity based on an individual camel. With an average 6.6-fold coverage, we detected over 116 000 heterozygous SNPs and recorded a genome-wide nucleotide diversity similar to that of other domesticated ungulates. More than 20 000 (85%) dromedary expressed sequence tags successfully aligned to our genomic draft. Our results provide a template for future association studies targeting economically relevant traits and for identifying changes underlying the process of camel domestication and environmental adaptation. PMID:23454912
Elisa, Mwega; Hasan, Salih Dia; Moses, Njahira; Elpidius, Rukambile; Skilton, Robert; Gwakisa, Paul
2015-04-01
This study investigated the genetic and antigenic diversity of Theileria parva in cattle from the Eastern and Southern zones of Tanzania. Thirty-nine (62%) positive samples were genotyped using 14 mini- and microsatellite markers with coverage of all four T. parva chromosomes. Wright's F index (F(ST) = 0.094) indicated a high level of panmixis. Linkage equilibrium was observed in the two zones studied, suggesting the existence of a panmictic population. In addition, sequence analysis of the CD8+ T-cell target antigen gene Tp1 revealed a single protein sequence in all samples analysed, which is also present in the T. parva Muguga strain, which is a component of the FAO1 vaccine. All Tp2 epitope sequences were identical to those in the T. parva Muguga strain, except for one variant of a Tp2 epitope, which is found in the T. parva Kiambu 5 strain, also a component of the FAO1 vaccine. A neighbour-joining tree of the nucleotide sequences of Tp2 showed clustering according to geographical origin. Our results show low genetic and antigenic diversity of T. parva within the populations analysed. This has very important implications for the development of sustainable control measures for T. parva in the Eastern and Southern zones of Tanzania, where East Coast fever is endemic.
DNA barcoding of tuberous Orchidoideae: a resource for identification of orchids used in Salep.
Ghorbani, Abdolbaset; Gravendeel, Barbara; Selliah, Sugirthini; Zarré, Shahin; de Boer, Hugo
2017-03-01
Tubers of terrestrial orchids are harvested and traded from the eastern Mediterranean to the Caspian Sea for the traditional product Salep. Overexploitation of wild populations and increased middle-class prosperity have escalated prices for Salep, causing overharvesting, depletion of native populations and providing an incentive to expand harvesting to untapped areas in Iran. Limited morphological distinctiveness among traded Salep tubers renders species identification impossible, making it difficult to establish which species are targeted and affected the most. In this study, a reference database of 490 nrITS, trnL-F spacer and matK sequences of 133 taxa was used to identify 150 individual tubers from 31 batches purchased in 12 cities in Iran to assess species diversity in commerce. The sequence reference database consisted of 211 nrITS, 158 trnL-F and 121 matK sequences, including 238 new sequences from collections made for this study. The markers enabled unambiguous species identification with tree-based methods for nrITS in 67% of the tested tubers, 58% for trnL-F and 59% for matK. Species in the genera Orchis (34%), Anacamptis (27%) and Dactylorhiza (19%) were the most common in Salep. Our study shows that all tuberous orchid species in this area are threatened by this trade, and further stresses the urgency of controlling illegal harvesting and cross-border trade of Salep tubers. © 2016 John Wiley & Sons Ltd.
Guidelines for integrating population education into primary education and literacy programmes.
1989-01-01
In recent seminars and workshops in the Asia and Pacific region, the integration of population education into primary schools and literacy programs was the main topic. In most of the countries in this area separate courses in population education appear to be unfeasible for primary and secondary schools. In the nonformal area experience has indicated that population education acquires more meaning and relevance if it is integrated into an ongoing development program. The integration approach requires knowledge of the contents of the accommodating subjects or programs and knowledge of the contents of population education. Guidelines suggested include the following steps in developing an integrated curriculum and instructional materials. First determine the needs, characteristics and other background information needed on the target group. Next prioritize the problems and needs of the target group, and formulate educational objectives from the identified needs and problems. Next determine and sequence the curriculum contents and then determine specific population education objectives and contents for integration, and what specific materials have to be developed. Then identify the specific type and format of materials to be developed, and write the first draft of the material. Also prepare illustrations and other art and graphic materials. Then the draft material should be reviewed and translated into the language of the target audience if needed. The materials should then be pretested, or field tested, using a sample of the intended users. To make sure the materials are reaching the target groups and being used effectively, a user's guide should be prepared and teachers and facilitators, as well as supervisors, should be trained in the use of the material. In addition, a distribution and utilization plan should be prepared.
Nonformal education materials can be distributed through libraries, reading centers, residences of village leaders, neighborhood stores, and direct mail. The material distribution and utilization should be monitored and evaluated.
Program Synthesizes UML Sequence Diagrams
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2006-01-01
A computer program called "Rational Sequence" generates Unified Modeling Language (UML) sequence diagrams of a target Java program running on a Java virtual machine (JVM). Rational Sequence thereby performs a reverse engineering function that aids in the design documentation of the target Java program. Whereas the construction of sequence diagrams was previously a tedious manual process, Rational Sequence generates UML sequence diagrams automatically from the running Java code.
Quantifying Genome Editing Outcomes at Endogenous Loci using SMRT Sequencing
Clark, Joseph; Punjya, Niraj; Sebastiano, Vittorio; Bao, Gang; Porteus, Matthew H
2014-01-01
SUMMARY Targeted genome editing with engineered nucleases has transformed the ability to introduce precise sequence modifications at almost any site within the genome. A major obstacle to probing the efficiency and consequences of genome editing is that no existing method enables the frequency of different editing events to be simultaneously measured across a cell population at any endogenous genomic locus. We have developed a novel method for quantifying individual genome editing outcomes at any site of interest using single molecule real time (SMRT) DNA sequencing. We show that this approach can be applied at various loci, using multiple engineered nuclease platforms including TALENs, RNA guided endonucleases (CRISPR/Cas9), and ZFNs, and in different cell lines to identify conditions and strategies in which the desired engineering outcome has occurred. This approach facilitates the evaluation of new gene editing technologies and permits sensitive quantification of editing outcomes in almost every experimental system used. PMID:24685129
Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.
2016-01-01
Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175
Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J
2005-01-01
Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
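The presence/absence strategy described in this abstract can be illustrated with a minimal sketch. It is not the authors' method, whose search procedure is not given here; it assumes a simple greedy heuristic, and the function names and toy sequences are hypothetical. At each step it picks the sub-sequence that separates the most still-confused pairs of sequences, so the chosen set acts like the questions of a dichotomous key.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k sub-sequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def distinguishing_kmers(seqs, k):
    """Greedily choose k-mers until every pair of sequences differs in the
    presence/absence of at least one chosen k-mer (heuristic, not optimal)."""
    profiles = [kmers(s, k) for s in seqs]
    candidates = sorted(set().union(*profiles))  # sorted for determinism
    unresolved = set(combinations(range(len(seqs)), 2))
    chosen = []
    while unresolved:
        def split_count(kmer):
            # Number of unresolved pairs this k-mer would separate.
            return sum((kmer in profiles[a]) != (kmer in profiles[b])
                       for a, b in unresolved)

        best = max(candidates, key=split_count, default=None)
        if best is None or split_count(best) == 0:
            raise ValueError("sequences cannot be distinguished at this k")
        chosen.append(best)
        unresolved = {(a, b) for a, b in unresolved
                      if (best in profiles[a]) == (best in profiles[b])}
    return chosen


def signature(seq, chosen):
    """Presence/absence pattern of the chosen k-mers in seq."""
    return tuple(kmer in seq for kmer in chosen)


# Toy, made-up sequences (not real CO1 data).
toy = ["ACGTAC", "ACGTTT", "TTGTAC", "ACCCCC"]
probes = distinguishing_kmers(toy, 3)
signatures = [signature(s, probes) for s in toy]
```

Each chosen sub-sequence splits at least one group of still-identical signatures, so n sequences never need more than n - 1 sub-sequences; a well-chosen set can approach the log2(n) lower bound of a perfectly binary key.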
Chan, Philip A.; Hogan, Joseph W.; Huang, Austin; DeLong, Allison; Salemi, Marco; Mayer, Kenneth H.; Kantor, Rami
2015-01-01
Background Molecular epidemiologic evaluation of HIV-1 transmission networks can elucidate behavioral components of transmission that can be targets for intervention. Methods We combined phylogenetic and statistical approaches using pol sequences from patients diagnosed 2004-2011 at a large HIV center in Rhode Island, following 75% of the state’s HIV population. Phylogenetic trees were constructed using maximum likelihood and putative transmission clusters were evaluated using latent class analyses (LCA) to determine association of cluster size with underlying demographic/behavioral characteristics. A logistic growth model was used to assess intra-cluster dynamics over time and predict “active” clusters that were more likely to harbor undiagnosed infections. Results Of 1,166 HIV-1 subtype B sequences, 31% were distributed among 114 statistically-supported, monophyletic clusters (range: 2-15 sequences/cluster). Sequences from men who have sex with men (MSM) formed 52% of clusters. LCA demonstrated that sequences from recently diagnosed (2008-2011) MSM with primary HIV infection (PHI) and other sexually transmitted infections (STIs) were more likely to form larger clusters (Odds Ratio 1.62-11.25, p<0.01). MSM in clusters were more likely to have anonymous partners and meet partners at sex clubs and pornographic stores. Four large clusters with 38 sequences (100% male, 89% MSM) had a high-probability of harboring undiagnosed infections and included younger MSM with PHI and STIs. Conclusions In this first large-scale molecular epidemiologic investigation of HIV-1 transmission in New England, sexual networks among recently diagnosed MSM with PHI and concomitant STIs contributed to ongoing transmission. Characterization of transmission dynamics revealed actively growing clusters which may be targets for intervention. PMID:26258569
Lee, Tzuu-fen; Gurazada, Sai Guna Ranjan; Zhai, Jixian; Li, Shengben; Simon, Stacey A; Matzke, Marjori A; Chen, Xuemei; Meyers, Blake C
2012-07-01
In plants, heterochromatin is maintained by a small RNA-based gene silencing mechanism known as RNA-directed DNA methylation (RdDM). RdDM requires the non-redundant functions of two plant-specific DNA-dependent RNA polymerases (RNAP), RNAP IV and RNAP V. RNAP IV plays a major role in siRNA biogenesis, while RNAP V may recruit DNA methylation machinery to target endogenous loci for silencing. Although small RNA-generating regions that are dependent on both RNAP IV and RNAP V have been identified previously, the genomic loci targeted by RNAP V for siRNA accumulation and silencing have not been described extensively. To characterize the RNAP V-dependent, heterochromatic siRNA-generating regions in the Arabidopsis genome, we deeply sequenced the small RNA populations of wild-type and RNAP V null mutant (nrpe1) plants. Our results showed that RNAP V-dependent siRNA-generating loci are associated predominantly with short repetitive sequences in intergenic regions. Suppression of small RNA production from short repetitive sequences was also prominent in RdDM mutants including dms4, drd1, dms3 and rdm1, reflecting the known association of these RdDM effectors with RNAP V. The genomic regions targeted by RNAP V were small, with an estimated average length of 238 bp. Our results suggest that RNAP V affects siRNA production from genomic loci with features dissimilar to known RNAP IV-dependent loci. RNAP V, along with RNAP IV and DRM1/2, may target and silence a set of small, intergenic transposable elements located in dispersed genomic regions. Silencing at these loci may be actively reinforced by RdDM.
Stanhope, Michael J.; Walsh, Stacey L.; Becker, Julie A.; Italia, Michael J.; Ingraham, Karen A.; Gwynn, Michael N.; Mathie, Tom; Poupard, James A.; Miller, Linda A.; Brown, James R.; Amrine-Madsen, Heather
2005-01-01
Fluoroquinolones are an important class of antibiotics for the treatment of infections arising from the gram-positive respiratory pathogen Streptococcus pneumoniae. Although there is evidence supporting interspecific lateral DNA transfer of fluoroquinolone target loci, no studies have specifically been designed to assess the role of intraspecific lateral transfer of these genes in the spread of fluoroquinolone resistance. This study involves a comparative evolutionary perspective, in which the evolutionary history of a diverse set of S. pneumoniae clinical isolates is reconstructed from an expanded multilocus sequence typing data set, with putative recombinants excluded. This control history is then assessed against networks of each of the four fluoroquinolone target loci from the same isolates. The results indicate that although the majority of fluoroquinolone target loci from this set of 60 isolates are consistent with a clonal dissemination hypothesis, 3 to 10% of the sequences are consistent with an intraspecific lateral transfer hypothesis. Also evident were examples of interspecific transfer, with two isolates possessing a parE-parC gene region arising from viridans group streptococci. The Spain 23F-1 clone is the most dominant fluoroquinolone-nonsusceptible clone in this set of isolates, and the analysis suggests that its members act as frequent donors of fluoroquinolone-nonsusceptible loci. Although the majority of fluoroquinolone target gene sequences in this set of isolates can be explained on the basis of clonal dissemination, a significant number are more parsimoniously explained by intraspecific lateral DNA transfer, and in situations of high S. pneumoniae population density, such events could be an important means of resistance spread. PMID:16189113
Sulfur-oxidizing bacterial populations within cyanobacterial dominated coral disease lesions.
Bourne, David G; van der Zee, Marc J J; Botté, Emmanuelle S; Sato, Yui
2013-08-01
This study investigated the diversity and quantitative shifts of sulfur-oxidizing bacteria (SOB) during the onset of black band disease (BBD) in corals using quantitative PCR (qPCR) and cloning approaches targeting the soxB gene, involved in sulfur oxidation. Four Montipora sp. coral colonies identified with lesions previously termed cyanobacterial patches (CP) (comprising microbial communities different from those of BBD lesions) were monitored in situ as CP developed into BBD. The overall abundance of SOB in both CP and BBD lesions was very low and near the detection limit of the qPCR assay, although the assay consistently indicated that SOB populations decreased as the lesions transitioned from CP to BBD. Phylogenetic assessment of retrieved soxB genes showed that SOB in both CP and BBD lesions were dominated by one sequence type, representing > 70% of all soxB gene sequences and affiliated with members of the Rhodobacteraceae within the α-Proteobacteria. This study represents the first assessment targeting SOB within BBD lesions and clearly shows that SOB are not highly diverse or abundant in this complex microbial mat. The lack of oxidation of reduced sulfur compounds by SOB likely aids the accumulation of high levels of sulfide at the base of the BBD mat, a compound contributing to the pathogenicity of BBD lesions. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.
Keshri, Jitendra; Mishra, Avinash; Jha, Bhavanath
2013-03-30
Population indices of bacteria and archaea were investigated from saline-alkaline soil and a possible microbe-environment pattern was established using gene-targeted metagenomics. Clone libraries were constructed using 16S rRNA and functional gene(s) involved in carbon fixation (cbbL), nitrogen fixation (nifH), ammonia oxidation (amoA) and sulfur metabolism (apsA). Molecular phylogeny revealed the dominance of Actinobacteria, Firmicutes and Proteobacteria along with archaeal members of Halobacteraceae. The library consisted of novel bacterial (20%) and archaeal (38%) genera showing ≤95% similarity to previously retrieved sequences. Phylogenetic analysis indicated the ability of the inhabitants to survive under stress conditions. The 16S rRNA gene libraries contained novel gene sequences and were distantly homologous with cultured bacteria. Functional gene libraries were found to be unique and most of the clones were distantly related to Proteobacteria, while clones of the nifH gene library also showed homology with Cyanobacteria and Firmicutes. Quantitative real-time PCR showed that bacterial abundance was two orders of magnitude higher than archaeal abundance. The gene(s) quantification indicated the size of the functional guilds harboring relevant key genes. The study provides insights into microbial ecology and the different metabolic interactions occurring in saline-alkaline soil, which possesses phylogenetically diverse groups of bacteria and archaea that may be explored further for gene cataloging and metabolic profiling. Copyright © 2012 Elsevier GmbH. All rights reserved.
Davison, Michelle; Treangen, Todd J; Koren, Sergey; Pop, Mihai; Bhaya, Devaki
2016-01-01
The polymicrobial biofilm communities in Mushroom and Octopus Spring in Yellowstone National Park (YNP) are well characterized, yet little is known about the phage populations. Dominant species, Synechococcus sp. JA-2-3B'a(2-13), Synechococcus sp. JA-3-3Ab, Chloroflexus sp. Y-400-fl, and Roseiflexus sp. RS-1, contain multiple CRISPR-Cas arrays, suggesting complex interactions with phage predators. To analyze phage populations from Octopus Spring biofilms, we sequenced a viral enriched fraction. To assemble and analyze phage metagenomic data, we developed a custom module, VIRITAS, implemented within the MetAMOS framework. This module bins contigs into groups based on tetranucleotide frequencies, CRISPR spacer-protospacer matching and ORF calling. Using this pipeline we were able to assemble phage sequences into contigs and bin them into three clusters that were consistent with their potential host range. The virome contained 52,348 predicted ORFs; some were clearly phage-like; 9319 ORFs had a recognizable Pfam domain while the rest were hypothetical. Among the recognized domains with CRISPR spacer matches was the phage endolysin, which lytic phage use to disrupt cells. Analysis of the endolysins present in the thermophilic cyanophage contigs revealed a subset of characterized endolysins as well as a Glyco_hydro_108 (PF05838) domain not previously associated with sequenced cyanophages. A search for CRISPR spacer matches to all identified phage endolysins demonstrated that a majority of endolysin domains were targets. This strategy provides a general way to link host and phage as endolysins are known to be widely distributed in bacteriophage. Endolysins can also provide information about host cell wall composition and have the additional potential to be used as targets for novel therapeutics.
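Two of the signals this abstract uses for binning, tetranucleotide frequencies and CRISPR spacer matches, can be sketched in a few lines. This is only an illustration under stated assumptions, not the VIRITAS implementation: the function names are hypothetical, spacer matching is reduced to exact substring search on one strand, and reverse complements are not merged as production binning tools often do.

```python
import math
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed order, so that vectors are comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freq(contig):
    """Return the 256-element tetranucleotide frequency vector of a contig.
    Windows containing characters other than A/C/G/T are ignored."""
    contig = contig.upper()
    counts = Counter(contig[i:i + 4] for i in range(len(contig) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def euclidean(a, b):
    """Distance between two frequency vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def spacer_hits(spacers, contig):
    """Return the host CRISPR spacers found verbatim in a phage contig
    (simplifying assumption: exact match, forward strand only)."""
    contig = contig.upper()
    return [s for s in spacers if s.upper() in contig]


# Toy, made-up contigs and spacers.
at_rich = tetra_freq("ATATTAATATTTAATA")
gc_rich = tetra_freq("GCGGCCGCGGGCCGCG")
hits = spacer_hits(["TTAATA", "CCCCCC"], "ATATTAATATTTAATA")
```

Contigs whose vectors lie close together can then be grouped with any standard clustering method, and a spacer hit links a contig to the host genome carrying that spacer.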
RISC RNA sequencing for context-specific identification of in vivo miR targets
Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W
2010-01-01
Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. 
To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex (RISC)-associated RNAs (the RISCome), called RISC sequencing. We developed methods that did not require cross-linking of RNAs to RISCs or amplification of mRNA prior to sequencing, making it possible to rapidly perform RISC sequencing from intact tissue while avoiding amplification bias. Comparison of RISCome with transcriptome expression defined the degree of RISC enrichment for each mRNA. The majority of the mRNAs enriched in wild-type cardiac RISComes compared to transcriptomes were bioinformatically predicted to be targets of at least 1 of 139 cardiac-expressed miRs. Programming cardiomyocyte RISCs via transgenic overexpression in adult hearts of miR-133a or miR-499, two miRs that contain entirely different ‘seed’ sequences, elicited differing profiles of RISC-targeted mRNAs. Thus, RISC sequencing represents a highly sensitive method for general RISC profiling and individual miR target identification in biological context. PMID:21030712
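The comparison of RISCome with transcriptome expression can be illustrated with a minimal sketch. The abstract does not state the exact statistic, so the log2 ratio with a pseudocount used here is an assumption, and the function name, gene labels and expression values are hypothetical.

```python
import math


def risc_enrichment(risc, transcriptome, pseudocount=1.0):
    """Return the log2 ratio of RISC-associated to whole-transcriptome
    expression for every mRNA in the transcriptome.

    Both inputs map an mRNA identifier to a normalized expression value.
    The pseudocount (an assumption) keeps the ratio finite when an mRNA
    is absent from one of the two libraries."""
    return {
        gene: math.log2((risc.get(gene, 0.0) + pseudocount)
                        / (expr + pseudocount))
        for gene, expr in transcriptome.items()
    }


# Made-up values for two placeholder mRNAs from the same heart.
risc_expr = {"mRNA_A": 7.0, "mRNA_B": 1.0}
total_expr = {"mRNA_A": 1.0, "mRNA_B": 7.0}
scores = risc_enrichment(risc_expr, total_expr)
```

A positive score marks an mRNA that is over-represented in the RISC fraction relative to its overall abundance; where to set a cut-off for calling an mRNA RISC-targeted is a separate statistical choice that this sketch does not address.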
Development and evaluation of a multi-locus sequence typing scheme for Mycoplasma synoviae.
Dijkman, R; Feberwee, A; Landman, W J M
2016-08-01
Reproducible molecular Mycoplasma synoviae typing techniques with sufficient discriminatory power may help to expand knowledge on its epidemiology and contribute to the improvement of control and eradication programmes of this mycoplasma species. The present study describes the development and validation of a novel multi-locus sequence typing (MLST) scheme for M. synoviae. Thirteen M. synoviae isolates originating from different poultry categories, farms and lesions, were subjected to whole genome sequencing. Their sequences were compared to that of M. synoviae reference strain MS53. A high number of single nucleotide polymorphisms (SNPs) indicating considerable genetic diversity were identified. SNPs were present in over 40 putative target genes for MLST of which five target genes were selected (nanA, uvrA, lepA, ruvB and ugpA) for the MLST scheme. This scheme was evaluated analysing 209 M. synoviae samples from different countries, categories of poultry, farms and lesions. Eleven clonal clusters and 76 different sequence types (STs) were obtained. Clustering occurred following geographical origin, supporting the hypothesis of regional population evolution. M. synoviae samples obtained from epidemiologically linked outbreaks often harboured the same ST. In contrast, multiple M. synoviae lineages were found in samples originating from swollen joints or oviducts from hens that produce eggs with eggshell apex abnormalities indicating that further research is needed to identify the genetic factors of M. synoviae that may explain its variations in tissue tropism and disease inducing potential. Furthermore, MLST proved to have a higher discriminatory power compared to variable lipoprotein and haemagglutinin A typing, which generated 50 different genotypes on the same database.
2012-01-01
Background: In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results: Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene.
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. Conclusions: We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587
Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G
2012-07-17
In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species.
Identifying bacterial predictors of honey bee health.
Budge, Giles E; Adams, Ian; Thwaites, Richard; Pietravalle, Stéphane; Drew, Georgia C; Hurst, Gregory D D; Tomkies, Victoria; Boonham, Neil; Brown, Mike
2016-11-01
Non-targeted approaches are useful tools to identify new or emerging issues in bee health. Here, we utilise next generation sequencing to highlight bacteria associated with healthy and unhealthy honey bee colonies, and then use targeted methods to screen a wider pool of colonies with known health status. Our results provide the first evidence that bacteria from the genus Arsenophonus are associated with poor health in honey bee colonies. We also discovered Lactobacillus and Leuconostoc spp. were associated with healthier honey bee colonies. Our results highlight the importance of understanding how the wider microbial population relates to honey bee colony health. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
BAC sequencing using pooled methods.
Saski, Christopher A; Feltus, F Alex; Parida, Laxmi; Haiminen, Niina
2015-01-01
Shotgun sequencing and assembly of a large, complex genome can be expensive, and accurately reconstructing the true genome sequence is challenging. Repetitive DNA arrays, paralogous sequences, polyploidy, and heterozygosity are the main factors that plague de novo genome sequencing projects, which typically yield highly fragmented assemblies from which it is difficult to extract biological meaning. Targeted, sub-genomic sequencing offers complexity reduction by removing distal segments of the genome and a systematic mechanism for exploring prioritized genomic content through BAC sequencing. If one isolates and sequences the genome fraction that encodes the relevant biological information, then it is possible to reduce the overall cost and effort of sequencing a targeted genomic segment. This chapter describes the sub-genome assembly protocol for an organism based upon a BAC tiling path derived from a genome-scale physical map or from fine mapping using BACs to target sub-genomic regions. Methods that are described include BAC isolation and mapping, DNA sequencing, and sequence assembly.
Advanced surface-enhanced Raman gene probe systems and methods thereof
Vo-Dinh, Tuan
2001-01-01
The subject invention is a series of methods and systems for using the Surface-Enhanced Raman (SER)-labeled Gene Probe for hybridization, detection and identification of SER-labeled hybridized target oligonucleotide material comprising the steps of immobilizing SER-labeled hybridized target oligonucleotide material on a support means, wherein the SER-labeled hybridized target oligonucleotide material comprise a SER label attached either to a target oligonucleotide of unknown sequence or to a gene probe of known sequence complementary to the target oligonucleotide sequence, the SER label is unique for the target oligonucleotide strands of a particular sequence wherein the SER-labeled oligonucleotide is hybridized to its complementary oligonucleotide strand, then the support means having the SER-labeled hybridized target oligonucleotide material adsorbed thereon is SERS activated with a SERS activating means, then the support means is analyzed.
Nucleic acid detection methods
Smith, C.L.; Yaar, R.; Szafranski, P.; Cantor, C.R.
1998-05-19
The invention relates to methods for rapidly determining the sequence and/or length of a target sequence. The target sequence may be a series of known or unknown repeat sequences which are hybridized to an array of probes. The hybridized array is digested with a single-strand nuclease and free 3′-hydroxyl groups are extended with a nucleic acid polymerase. Nuclease-cleaved heteroduplexes can be easily distinguished from nuclease-uncleaved heteroduplexes by differential labeling. Probes and target can be differentially labeled with detectable labels. Matched target can be detected by cleaving resulting loops from the hybridized target and creating free 3′-hydroxyl groups. These groups are recognized and extended by polymerases added into the reaction system, which also adds or releases one label into solution. The resulting products can be analyzed in either solid phase or solution. These methods can be used to detect characteristic nucleic acid sequences, to determine target sequence and to screen for genetic defects and disorders. Assays can be conducted on solid surfaces allowing for multiple reactions to be conducted in parallel and, if desired, automated. 18 figs.
Xia, Shu; Kohli, Manish; Du, Meijun; Dittmar, Rachel L; Lee, Adam; Nandy, Debashis; Yuan, Tiezheng; Guo, Yongchen; Wang, Yuan; Tschannen, Michael R; Worthey, Elizabeth; Jacob, Howard; See, William; Kilari, Deepak; Wang, Xuexia; Hovey, Raymond L; Huang, Chiang-Ching; Wang, Liang
2015-06-30
Liquid biopsies, examinations of tumor components in body fluids, have shown promise for predicting clinical outcomes. To evaluate tumor-associated genomic and genetic variations in plasma cell-free DNA (cfDNA) and their associations with treatment response and overall survival, we applied whole genome and targeted sequencing to examine the plasma cfDNAs derived from 20 patients with advanced prostate cancer. Sequencing-based genomic abnormality analysis revealed locus-specific gains or losses that were common in prostate cancer, such as 8q gains, AR amplifications, PTEN losses and TMPRSS2-ERG fusions. To estimate tumor burden in cfDNA, we developed a Plasma Genomic Abnormality (PGA) score by summing the most significant copy number variations. Cox regression analysis showed that PGA scores were significantly associated with overall survival (p < 0.04). After androgen deprivation therapy or chemotherapy, targeted sequencing showed significant mutational profile changes in genes involved in androgen biosynthesis, AR activation, DNA repair, and chemotherapy resistance. These changes may reflect the dynamic evolution of heterozygous tumor populations in response to these treatments. These results strongly support the feasibility of using non-invasive liquid biopsies as potential tools to study biological mechanisms underlying therapy-specific resistance and to predict disease progression in advanced prostate cancer.
Solid phase sequencing of double-stranded nucleic acids
Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.
2002-01-01
This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.
Kretova, Olga V; Chechetkin, Vladimir R; Fedoseeva, Daria M; Kravatsky, Yuri V; Sosin, Dmitri V; Alembekov, Ildar R; Gorbacheva, Maria A; Gashnikova, Natalya M; Tchurikov, Nickolai A
2017-02-01
Any method for silencing the activity of the HIV-1 retrovirus should tackle the extremely high variability of HIV-1 sequences and mutational escape. We studied sequence variability in the vicinity of selected RNA interference (RNAi) targets from isolates of HIV-1 subtype A in Russia, and we propose that using artificial RNAi is a potential alternative to traditional antiretroviral therapy. We prove that using multiple RNAi targets overcomes the variability in HIV-1 isolates. The optimal number of targets critically depends on the conservation of the target sequences. The total number of targets that are conserved with a probability of 0.7-0.8 should exceed at least 2. Combining deep sequencing and multitarget RNAi may provide an efficient approach to cure HIV/AIDS.
Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis
2014-01-01
Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. PMID:25078912
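The discriminatory power quoted above as Simpson's index (D) is commonly computed for typing schemes in the Hunter-Gaston form, D = 1 - [sum of n_j(n_j - 1)] / [N(N - 1)], where n_j is the number of isolates assigned to type j and N is the total number of isolates. The sketch below assumes that form; the function name and the toy type labels are illustrative and do not reproduce the study's data.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form, assumed here).

    type_labels holds one type assignment per isolate, for example a
    virulence type (VT) or a sequence type (ST).
    """
    n_isolates = len(type_labels)
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Toy labels, invented for illustration only.
labels = ["VT1", "VT1", "VT2", "VT3"]
d = simpsons_index(labels)
```

D is the probability that two isolates drawn at random without replacement fall into different types, so a scheme that splits the same collection into more, smaller types scores closer to 1.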
Shortt, Jonathan A; Card, Daren C; Schield, Drew R; Liu, Yang; Zhong, Bo; Castoe, Todd A; Carlton, Elizabeth J; Pollock, David D
2017-01-01
In areas where schistosomiasis control programs have been implemented, morbidity and prevalence have been greatly reduced. However, to sustain these reductions and move towards interruption of transmission, new tools for disease surveillance are needed. Genomic methods have the potential to help trace the sources of new infections, and allow us to monitor drug resistance. Large-scale genotyping efforts for schistosome species have been hindered by cost, limited numbers of established target loci, and the small amount of DNA obtained from miracidia, the life stage most readily acquired from humans. Here, we present a method using next generation sequencing to provide high-resolution genomic data from S. japonicum for population-based studies. We applied whole genome amplification followed by double digest restriction site associated DNA sequencing (ddRADseq) to individual S. japonicum miracidia preserved on Whatman FTA cards. We found that we could effectively and consistently survey hundreds of thousands of variants from 10,000 to 30,000 loci from archived miracidia as old as six years. An analysis of variation from eight miracidia obtained from three hosts in two villages in Sichuan showed clear population structuring by village and host even within this limited sample. This high-resolution sequencing approach yields three orders of magnitude more information than microsatellite genotyping methods that have been employed over the last decade, creating the potential to answer detailed questions about the sources of human infections and to monitor drug resistance. Costs per sample range from $50 to $200, depending on the amount of sequence information desired, and we expect these costs can be reduced further given continued reductions in sequencing costs, improvement of protocols, and parallelization. This approach provides new promise for bringing modern genome-scale sampling to S. japonicum surveillance, and could be applied to other schistosome species and other parasitic helminths.
2014-01-01
Background: Hypervariable region 1 (HVR1) contained within the envelope protein 2 (E2) gene is the most variable part of the HCV genome and its translation product is a major target for the host immune response. Variability within HVR1 may facilitate evasion of the immune response and could affect treatment outcome. The aim of the study was to analyze the impact of HVR1 heterogeneity, employing sensitive ultra-deep sequencing, on the outcome of PEG-IFN-α (pegylated interferon α) and ribavirin treatment. Methods: HVR1 sequences were amplified from pretreatment serum samples of 25 patients infected with genotype 1b HCV (12 responders and 13 non-responders) and were subjected to pyrosequencing (GS Junior, 454/Roche). Reads were corrected for sequencing error using ShoRAH software, while population reconstruction was done using three different minimal variant frequency cut-offs of 1%, 2% and 5%. Statistical analysis was done using Mann–Whitney and Fisher’s exact tests. Results: Complexity, Shannon entropy, nucleotide diversity per site, genetic distance and the number of genetic substitutions were not significantly different between responders and non-responders, when analyzing viral populations at any of the three frequencies (≥1%, ≥2% and ≥5%). When a clonal sample was used to determine pyrosequencing error, 4% of reads were found to be incorrect and the most abundant variant was present at a frequency of 1.48%. Use of ShoRAH reduced the sequencing error to 1%, with the most abundant erroneous variant present at a frequency of 0.5%. Conclusions: While deep sequencing revealed complex genetic heterogeneity of HVR1 in chronic hepatitis C patients, there was no correlation between treatment outcome and any of the analyzed quasispecies parameters. PMID:25016390
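To make the role of the minimal variant frequency cut-offs concrete, the sketch below computes Shannon entropy for a viral population after discarding variants below a chosen cut-off. It is an illustration under stated assumptions, not the study's analysis code: the function name, the use of the natural logarithm, the renormalization of the retained frequencies, and the toy frequencies are all hypothetical.

```python
import math


def shannon_entropy(variant_freqs, min_freq=0.01):
    """Shannon entropy of variants at or above min_freq.

    variant_freqs lists the relative frequency of each reconstructed
    variant. Variants below the cut-off are dropped and the rest are
    renormalized to sum to 1 (an assumption of this sketch).
    """
    kept = [f for f in variant_freqs if f >= min_freq]
    total = sum(kept)
    if total == 0:
        return 0.0
    probs = [f / total for f in kept]
    return abs(-sum(p * math.log(p) for p in probs))


# Toy quasispecies, invented for illustration only.
freqs = [0.60, 0.30, 0.07, 0.03]
entropy_by_cutoff = {c: shannon_entropy(freqs, c) for c in (0.01, 0.02, 0.05)}
```

With these toy frequencies the 1% and 2% cut-offs keep all four variants, while the 5% cut-off drops the rarest one and lowers the entropy, which is why a diversity measure is only comparable across samples when the same cut-off is used.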
RNA therapeutics targeting osteoclast-mediated excessive bone resorption
Wang, Yuwei; Grainger, David W
2011-01-01
RNA interference (RNAi) is a sequence-specific post-transcriptional gene silencing technique developed with dramatically increasing utility for both scientific and therapeutic purposes. Short interfering RNA (siRNA) is currently exploited to regulate protein expression relevant to many therapeutic applications, and is commonly used as a tool for elucidating disease-associated genes. Osteoporosis and its associated fragility fractures in both men and women are rapidly becoming a global healthcare crisis as average life expectancy increases worldwide. New therapeutics are needed for this increasing patient population. This review describes the diversity of molecular targets suitable for RNAi-based gene knock-down in osteoclasts to control osteoclast-mediated excessive bone resorption. We identify strategies for developing targeted siRNA delivery and efficient gene silencing, and describe opportunities and challenges of introducing siRNA as a therapeutic approach to hard and connective tissue disorders. PMID:21945356
Cysteine-containing peptide tag for site-specific conjugation of proteins
Backer, Marina V.; Backer, Joseph M.
2008-04-08
The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.
Cysteine-containing peptide tag for site-specific conjugation of proteins
Backer, Marina V.; Backer, Joseph M.
2010-10-05
The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety bound to the targeting moiety; the biological conjugate having a covalent bond between the thiol group of SEQ ID NO:2 and a functional group in the binding moiety. The present invention is directed to a biological conjugate, comprising: (a) a targeting moiety comprising a polypeptide having an amino acid sequence comprising the polypeptide sequence of SEQ ID NO:2 and the polypeptide sequence of a selected targeting protein; and (b) a binding moiety that comprises an adapter protein, the adapter protein having a thiol group; the biological conjugate having a disulfide bond between the thiol group of SEQ ID NO:2 and the thiol group of the adapter protein. The present invention is also directed to biological sequences employed in the above biological conjugates, as well as pharmaceutical preparations and methods using the above biological conjugates.
Targeted therapy according to next generation sequencing-based panel sequencing.
Saito, Motonobu; Momma, Tomoyuki; Kono, Koji
2018-04-17
Targeted therapy against actionable gene mutations shows a significantly higher response rate as well as longer survival compared to conventional chemotherapy, and has become a standard therapy for many cancers. Recent progress in next-generation sequencing (NGS) has enabled the identification of a huge number of genetic aberrations. Based on sequencing results, patients are recommended to undergo targeted therapy or immunotherapy. In cases where there are no approved drugs available for the genetic mutations detected in a patient, it is recommended to facilitate registration for clinical trials. For that purpose, an NGS-based sequencing panel that can simultaneously target multiple genes in a single investigation has been used in daily clinical practice. To date, various types of sequencing panels have been developed to investigate genetic aberrations with tumor somatic genome variants (gain-of-function or loss-of-function mutations, high-level copy number alterations, and gene fusions) through comprehensive bioinformatics. Because sequencing panels are efficient and cost-effective, they are quickly being adopted outside the lab, in hospitals and clinics, in order to identify personal targeted therapy for individual cancer patients.
Jagtap, Soham; Shivaprasad, Padubidri V
2014-12-02
MicroRNAs (miRNAs) are important regulators of plant development. Across plant lineages, Dicer-like 1 (DCL1) proteins process long ds-like structures to produce miRNA duplexes in a stepwise manner. These miRNAs are incorporated into Argonaute (AGO) proteins and influence expression of RNAs that have sequence complementarity with miRNAs. Expression levels of AGOs are greatly regulated by plants in order to minimize unwarranted perturbations using miRNAs to target mRNAs coding for AGOs. AGOs may also have high promoter specificity: sometimes expression of an AGO can be limited to just a few cells in a plant. Viral pathogens utilize various means to counter antiviral roles of AGOs including hijacking the host encoded miRNAs to target AGOs. Two host encoded miRNAs namely miR168 and miR403 that target AGOs have been described in the model plant Arabidopsis and such a mechanism is thought to be well conserved across plants because AGO sequences are well conserved. We show that the interaction between AGO mRNAs and miRNAs is species-specific due to the diversity in sequences of two miRNAs that target AGOs, sequence diversity among corresponding target regions in AGO mRNAs and variable expression levels of these miRNAs among vascular plants. We used miRNA sequences from 68 plant species representing 31 plant families for this analysis. Sequences of miR168 and miR403 are not conserved among plant lineages, but surprisingly they differ drastically in their sequence diversity and expression levels even among closely related plants. Variation in miR168 expression among plants correlates well with secondary structures/length of loop sequences of their precursors. Our data indicate a complex AGO targeting interaction among plant lineages due to miRNA sequence diversity and sequences of miRNA targeting regions among AGO mRNAs, thus leading to the assumption that the perturbations by viruses that use host miRNAs to target antiviral AGOs can only be species-specific.
We also show that rapid evolution and likely loss of expression of miR168 isoforms in tobacco is related to the insertion of MITE-like transposons between miRNA and miRNA* sequences, a possible mechanism showing how miRNAs are lost in a few plant lineages even though other close relatives have abundantly expressed miRNAs.
Method and apparatus for biological sequence comparison
Marr, T.G.; Chang, W.I.
1997-12-23
A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.
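The block-division step described above can be illustrated with a short sketch. The abstract does not give block sizes, so the choice made here (blocks of twice the minimum alignment length, advanced by one minimum alignment length) is purely an assumption, picked because it guarantees that every window of the minimum length lies wholly inside at least one block. The function name and the toy query are hypothetical.

```python
def overlapping_blocks(query, min_align_len):
    """Split query into overlapping blocks as (start, block) pairs.

    Assumed scheme: each block spans 2 * min_align_len residues and
    consecutive blocks start min_align_len residues apart, so any
    window of min_align_len residues fits inside some block.
    """
    if min_align_len <= 0:
        raise ValueError("min_align_len must be positive")
    size = 2 * min_align_len
    blocks = []
    start = 0
    while True:
        blocks.append((start, query[start:start + size]))
        if start + size >= len(query):
            break  # this block already reaches the end of the query
        start += min_align_len
    return blocks


# Toy query sequence, invented for illustration only.
query = "ACGTACGTAC"
blocks = overlapping_blocks(query, 3)
```

Each block could then be scored against short fragments of the known sequences, as the abstract outlines; that scoring step is not sketched here because the abstract does not specify it.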
Method and apparatus for biological sequence comparison
Marr, Thomas G.; Chang, William I-Wei
1997-01-01
A method and apparatus for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence.
Lokki, A Inkeri; Daly, Emma; Triebwasser, Michael; Kurki, Mitja I; Roberson, Elisha D O; Häppölä, Paavo; Auro, Kirsi; Perola, Markus; Heinonen, Seppo; Kajantie, Eero; Kere, Juha; Kivinen, Katja; Pouta, Anneli; Salmon, Jane E; Meri, Seppo; Daly, Mark; Atkinson, John P; Laivuori, Hannele
2017-08-01
Preeclampsia is a common pregnancy-specific vascular disorder characterized by new-onset hypertension and proteinuria during the second half of pregnancy. Predisposition to preeclampsia is in part heritable. It is associated with an increased risk of cardiovascular disease later in life. We have sequenced 124 candidate genes implicated in preeclampsia to pinpoint genetic variants contributing to predisposition to or protection from preeclampsia. First, targeted exomic sequencing was performed in 500 preeclamptic women and 190 controls from the FINNPEC cohort (Finnish Genetics of Preeclampsia Consortium). Then 122 women with a history of preeclampsia and 1905 parous women with no such history from the National FINRISK Study (a large Finnish population survey on risk factors of chronic, noncommunicable diseases) were included in the analyses. We tested 146 rare and low-frequency variants and found an excess (observed 13 versus expected 7.3) nominally associated with preeclampsia (P < 0.05). The most significantly associated sequence variants were protective variants rs35832528 (E982A; P = 2.49E-4; odds ratio = 0.387) and rs141440705 (R54S; P = 0.003; odds ratio = 0.442) in Fms related tyrosine kinase 1. These variants are enriched in the Finnish population with minor allele frequencies 0.026 and 0.017, respectively. They may also be associated with a lower risk of heart failure in 11,257 FINRISK women. This study provides the first evidence of maternal protective genetic variants in preeclampsia. © 2017 American Heart Association, Inc.
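The expected count of 7.3 quoted above is what one obtains by multiplying the 146 tested variants by the nominal significance threshold of 0.05. The sketch below reproduces that arithmetic and, as a rough gauge of the excess, computes a binomial upper-tail probability for observing 13 or more nominal hits. Treating the 146 tests as independent is an assumption of this sketch (variants in the same gene may be correlated), and the function names are hypothetical; this is not the authors' statistical procedure.

```python
from math import comb


def expected_hits(n_tests, alpha):
    """Expected number of tests with P < alpha when no variant is associated."""
    return n_tests * alpha


def binomial_tail(n_tests, alpha, observed):
    """P(X >= observed) for X ~ Binomial(n_tests, alpha).

    Assumes the tests are independent, which is a simplification.
    """
    return sum(
        comb(n_tests, k) * alpha ** k * (1 - alpha) ** (n_tests - k)
        for k in range(observed, n_tests + 1)
    )


expected = expected_hits(146, 0.05)   # 146 variants at the 0.05 threshold
tail = binomial_tail(146, 0.05, 13)   # chance of 13 or more nominal hits
```

A small tail probability would suggest that the observed excess is unlikely under the no-association model, subject to the independence caveat above.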
Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C. J.; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H.; Cui, Helen; Markotter, Wanda
2018-01-01
Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. 
Conventional PCR analysis detected diverse adenovirus and herpesvirus sequences that were widespread throughout Neoromicia populations in South Africa. Furthermore, similar adenovirus sequences were detected within these populations over several years. With the exception of the coronaviruses, the study represents the first report of sequence data from several viral families within a Southern African insectivorous bat genus, highlighting the need for continued investigations in this regard. PMID:29579103
Lindström, Ida; Kjellin, Midori; Palanisamy, Navaneethan; Bondeson, Kåre; Wesslén, Lars; Lannergard, Anders; Lennerstrand, Johan
2015-08-01
The future treatment of hepatitis C virus (HCV) infection will be combinations of direct-acting antivirals (DAAs) that not only target multiple viral targets, but are also effective against different HCV genotypes. Of the many drug targets in HCV, one promising target is the non-structural 5A protein (NS5A), against which inhibitors, namely daclatasvir, ledipasvir and ombitasvir, have shown potent efficacy. However, since HCV is known to have very high sequence diversity, development of resistance is a problem against but not limited to NS5A inhibitors (i.e. resistance is also found against NS3-protease and NS5B non-nucleoside inhibitors), when used in suboptimal combinations. Furthermore, it has been shown that natural resistance against DAAs is present in treatment-naïve patients and such baseline resistance will potentially complicate future treatment strategies. A pan-genotypic population-sequencing method with degenerate primers targeting the NS5A region was developed. We have investigated the prevalence of baseline resistant variants in 127 treatment-naïve patients of HCV genotypes 1a, 1b, 2b and 3a. The method could successfully sequence more than 95% of genotype 1a, 1b and 3a samples. Interpretation of fold resistance data against the NS5A inhibitors was done with the help of earlier published phenotypic data. Baseline resistance variants associated with high resistance (1000-50,000-fold) were found in three patients: Q30H or Y93N in genotype 1a patients and further Y93H in a genotype 3a patient. Using this method, baseline resistance can be examined and the data could have a potential role in selecting the optimal and cost-efficient treatment for the patient.
Eimeria genomics: Where are we now and where are we going?
Blake, Damer P
2015-08-15
The evolution of sequencing technologies, from Sanger to next generation (NGS) and now the emerging third generation, has prompted a radical frameshift moving genomics from the specialist to the mainstream. For parasitology, genomics has moved fastest for the protozoa with sequence assemblies becoming available for multiple genera including Babesia, Cryptosporidium, Eimeria, Giardia, Leishmania, Neospora, Plasmodium, Theileria, Toxoplasma and Trypanosoma. Progress has commonly been slower for parasites of animals which lack zoonotic potential, but the deficit is now being redressed with impact likely in the areas of drug and vaccine development, molecular diagnostics and population biology. Genomics studies with the apicomplexan Eimeria species clearly illustrate the approaches and opportunities available. Specifically, more than ten years after initiation of a genome sequencing project a sequence assembly was published for Eimeria tenella in 2014, complemented by assemblies for all other Eimeria species which infect the chicken and Eimeria falciformis, a parasite of the mouse. Public access to these and other coccidian genome assemblies through resources such as GeneDB and ToxoDB now promotes comparative analysis, encouraging better use of shared resources and enhancing opportunities for development of novel diagnostic and control strategies. In the short term genomics resources support development of targeted and genome-wide genetic markers such as single nucleotide polymorphisms (SNPs), with whole genome re-sequencing becoming viable in the near future. Experimental power will develop rapidly as additional species, strains and isolates are sampled with particular emphasis on population structure and allelic diversity. Copyright © 2015 Elsevier B.V. All rights reserved.
Cammen, Kristina M; Schultz, Thomas F; Rosel, Patricia E; Wells, Randall S; Read, Andrew J
2015-09-01
Harmful algal blooms (HABs), which can be lethal in marine species and cause illness in humans, are increasing worldwide. In the Gulf of Mexico, HABs of Karenia brevis produce neurotoxic brevetoxins that cause large-scale marine mortality events. The long history of such blooms, combined with the potentially severe effects of exposure, may have produced a strong selective pressure for evolved resistance. Advances in next-generation sequencing, in particular genotyping-by-sequencing, greatly enable the genomic study of such adaptation in natural populations. We used restriction site-associated DNA (RAD) sequencing to investigate brevetoxicosis resistance in common bottlenose dolphins (Tursiops truncatus). To improve our understanding of the epidemiology and aetiology of brevetoxicosis and the potential for evolved resistance in an upper trophic level predator, we sequenced pools of genomic DNA from dolphins sampled from both coastal and estuarine populations in Florida and during multiple HAB-associated mortality events. We sequenced 129 594 RAD loci and analysed 7431 single nucleotide polymorphisms (SNPs). The allele frequencies of many of these polymorphic loci differed significantly between live and dead dolphins. Some loci associated with survival showed patterns suggesting a common genetic-based mechanism of resistance to brevetoxins in bottlenose dolphins along the Gulf coast of Florida, but others suggested regionally specific mechanisms of resistance or reflected differences among HABs. We identified candidate genes that may be the evolutionary target for brevetoxin resistance by searching the dolphin genome for genes adjacent to survival-associated SNPs. © 2015 John Wiley & Sons Ltd.
Epidemics of panic during a bioterrorist attack--a mathematical model.
Radosavljevic, Vladan; Radunovic, Desanka; Belojevic, Goran
2009-09-01
Bioterrorist attacks usually cause epidemics of panic in a targeted population. We have presented the epidemiologic aspect of this phenomenon as a three-component model: host, information on an attack, and social network. We have proposed a mathematical model of panic and counter-measures as a function of time in a population exposed to a bioterrorist attack. The model comprises ordinary differential equations and graphically presented combinations of the equation parameters. Clinically, we have presented a model through a sequence of psychic conditions and disorders initiated by an act of bioterrorism. This model might help an attacked community to apply counter-measures properly and in good time, and to minimize human mental suffering during a bioterrorist attack.
Bozinovic, Goran; Oleksiak, Marjorie F.
2010-01-01
Transcriptomics and population genomics are two complementary genomic approaches that can be used to gain insight into pollutant effects in natural populations. Transcriptomics identifies altered gene expression pathways while population genomics approaches more directly target the causative genomic polymorphisms. Neither approach is restricted to a pre-determined set of genes or loci. Instead, both approaches allow a broad overview of genomic processes. Transcriptomics and population genomic approaches have been used to explore genomic responses in populations of fish from polluted environments and have identified sets of candidate genes and loci that appear biologically important in response to pollution. Often differences in gene expression or loci between polluted and reference populations are not conserved among polluted populations, suggesting a biological complexity that we do not yet fully understand. As genomic approaches become less expensive with the advent of new sequencing and genotyping technologies, they will be more widely used in complementary studies. However, while these genomic approaches are immensely powerful for identifying candidate genes and loci, the challenge of determining biological mechanisms that link genotypes and phenotypes remains. PMID:21072843
Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
2000-01-01
A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.
Southwell, Amber L; Skotte, Niels H; Villanueva, Erika B; Østergaard, Michael E; Gu, Xiaofeng; Kordasiewicz, Holly B; Kay, Chris; Cheung, Daphne; Xie, Yuanyun; Waltl, Sabine; Dal Cengio, Louisa; Findlay-Black, Hailey; Doty, Crystal N; Petoukhov, Eugenia; Iworima, Diepiriye; Slama, Ramy; Ooi, Jolene; Pouladi, Mahmoud A; Yang, X William; Swayze, Eric E; Seth, Punit P; Hayden, Michael R
2017-03-15
Huntington disease (HD) is a neurodegenerative disease caused by a mutation in the huntingtin (HTT) gene. HTT is a large protein, interacts with many partners and is involved in many cellular pathways, which are perturbed in HD. Therapies targeting HTT directly are likely to provide the most global benefit. Thus there is a need for preclinical models of HD recapitulating human HTT genetics. We previously generated a humanized mouse model of HD, Hu97/18, by intercrossing BACHD and YAC18 mice with knockout of the endogenous mouse HD homolog (Hdh). Hu97/18 mice recapitulate the genetics of HD, having two full-length, genomic human HTT transgenes heterozygous for the HD mutation and polymorphisms associated with HD in populations of Caucasian descent. We have now generated a companion model, Hu128/21, by intercrossing YAC128 and BAC21 mice on the Hdh-/- background. Hu128/21 mice have two full-length, genomic human HTT transgenes heterozygous for the HD mutation and polymorphisms associated with HD in populations of East Asian descent and in a minority of patients from other ethnic groups. Hu128/21 mice display a wide variety of HD-like phenotypes that are similar to YAC128 mice. Additionally, both transgenes in Hu128/21 mice match the human HTT exon 1 reference sequence. Conversely, the BACHD transgene carries a floxed, synthetic exon 1 sequence. Hu128/21 mice will be useful for investigations of human HTT that cannot be addressed in Hu97/18 mice, for developing therapies targeted to exon 1, and for preclinical screening of personalized HTT lowering therapies in HD patients of East Asian descent. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Sabatini, Linda M; Mathews, Charles; Ptak, Devon; Doshi, Shivang; Tynan, Katherine; Hegde, Madhuri R; Burke, Tara L; Bossler, Aaron D
2016-05-01
The increasing use of advanced nucleic acid sequencing technologies for clinical diagnostics and therapeutics has made vital understanding the costs of performing these procedures and their value to patients, providers, and payers. The Association for Molecular Pathology invested in a cost and value analysis of specific genomic sequencing procedures (GSPs) newly coded by the American Medical Association Current Procedural Terminology Editorial Panel. Cost data and work effort, including the development and use of data analysis pipelines, were gathered from representative laboratories currently performing these GSPs. Results were aggregated to generate representative cost ranges given the complexity and variability of performing the tests. Cost-impact models for three clinical scenarios were generated with assistance from key opinion leaders: impact of using a targeted gene panel in optimizing care for patients with advanced non-small-cell lung cancer, use of a targeted gene panel in the diagnosis and management of patients with sensorineural hearing loss, and exome sequencing in the diagnosis and management of children with neurodevelopmental disorders of unknown genetic etiology. Each model demonstrated value by either reducing health care costs or identifying appropriate care pathways. The templates generated will aid laboratories in assessing their individual costs, considering the value structure in their own patient populations, and contributing their data to the ongoing dialogue regarding the impact of GSPs on improving patient care. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.
Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing
2015-08-05
To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit together make up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validated rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of the SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validated rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, a combination of the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. 
However, for the sequencing of platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, particularly for InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, in particular, InDels in Ion Proton exome sequencing.
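The two summary figures in the abstract above (the rate of shared variant loci and the concordance at co-detected loci) can be illustrated with a small Python sketch. The abstract does not define the denominators, so this sketch assumes that the shared rate is the number of loci called by both platforms divided by the number called by either, and that concordance is the fraction of shared loci with identical genotype calls. The function name and the toy call sets are hypothetical.

```python
def compare_calls(calls_a, calls_b):
    """Compare two variant call sets.

    Each argument maps a locus (chrom, pos) to a genotype string.
    Returns (shared_rate, concordance) under the assumed definitions:
      shared_rate = |A and B| / |A or B|
      concordance = identical genotypes at shared loci / |A and B|
    """
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b
    union = loci_a | loci_b
    shared_rate = len(shared) / len(union) if union else 0.0
    same = sum(1 for locus in shared if calls_a[locus] == calls_b[locus])
    concordance = same / len(shared) if shared else 0.0
    return shared_rate, concordance


# Toy data, invented for illustration only.
platform_a = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "G/G"}
platform_b = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr3", 5): "T/T"}

shared_rate, concordance = compare_calls(platform_a, platform_b)
print(f"shared loci: {shared_rate:.0%}, concordance: {concordance:.0%}")
```

In practice the call sets would first be restricted to the regions targeted by both capture kits, as the abstract describes for the 33.6 Mb overlap.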
Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions
2014-01-01
Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920
Detection and isolation of nucleic acid sequences using competitive hybridization probes
Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.
1997-01-01
A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.
Dasa, Siva Sai Krishna; Kelly, Kimberly A.
2016-01-01
Next-generation sequencing has enhanced the phage display process, allowing for the quantification of millions of sequences resulting from the biopanning process. In response, many valuable analysis programs focused on specificity and finding targeted motifs or consensus sequences were developed. For targeted drug delivery and molecular imaging, it is also necessary to find peptides that are selective—targeting only the cell type or tissue of interest. We present a new analysis strategy and accompanying software, PHage Analysis for Selective Targeted PEPtides (PHASTpep), which identifies highly specific and selective peptides. Using this process, we discovered and validated, both in vitro and in vivo in mice, two sequences (HTTIPKV and APPIMSV) targeted to pancreatic cancer-associated fibroblasts that escaped identification using previously existing software. Our selectivity analysis makes it possible to discover peptides that target a specific cell type and avoid other cell types, enhancing clinical translatability by circumventing complications with systemic use. PMID:27186887
Detection and isolation of nucleic acid sequences using competitive hybridization probes
Lucas, J.N.; Straume, T.; Bogen, K.T.
1997-04-01
A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.
A tale of two sequences: microRNA-target chimeric reads.
Broughton, James P; Pasquinelli, Amy E
2016-04-04
In animals, a functional interaction between a microRNA (miRNA) and its target RNA requires only partial base pairing. The limited number of base pair interactions required for miRNA targeting provides miRNAs with broad regulatory potential and also makes target prediction challenging. Computational approaches to target prediction have focused on identifying miRNA target sites based on known sequence features that are important for canonical targeting and may miss non-canonical targets. Current state-of-the-art experimental approaches, such as CLIP-seq (cross-linking immunoprecipitation with sequencing), PAR-CLIP (photoactivatable-ribonucleoside-enhanced CLIP), and iCLIP (individual-nucleotide resolution CLIP), require inference of which miRNA is bound at each site. Recently, the development of methods to ligate miRNAs to their target RNAs during the preparation of sequencing libraries has provided a new tool for the identification of miRNA target sites. The chimeric, or hybrid, miRNA-target reads that are produced by these methods unambiguously identify the miRNA bound at a specific target site. The information provided by these chimeric reads has revealed extensive non-canonical interactions between miRNAs and their target mRNAs, and identified many novel interactions between miRNAs and noncoding RNAs.
Targeted RNA-Sequencing with Competitive Multiplex-PCR Amplicon Libraries
Blomquist, Thomas M.; Crawford, Erin L.; Lovett, Jennie L.; Yeo, Jiyoun; Stanoszek, Lauren M.; Levin, Albert; Li, Jia; Lu, Mei; Shi, Leming; Muldrew, Kenneth; Willey, James C.
2013-01-01
Whole transcriptome RNA-sequencing is a powerful tool, but is costly and yields complex data sets that limit its utility in molecular diagnostic testing. A targeted quantitative RNA-sequencing method that is reproducible and reduces the number of sequencing reads required to measure transcripts over the full range of expression would be better suited to diagnostic testing. Toward this goal, we developed a competitive multiplex PCR-based amplicon sequencing library preparation method that a) targets only the sequences of interest and b) controls for inter-target variation in PCR amplification during library preparation by measuring each transcript native template relative to a known number of synthetic competitive template internal standard copies. To determine the utility of this method, we intentionally selected PCR conditions that would cause transcript amplification products (amplicons) to converge toward equimolar concentrations (normalization) during library preparation. We then tested whether this approach would enable accurate and reproducible quantification of each transcript across multiple library preparations, and at the same time reduce (through normalization) total sequencing reads required for quantification of transcript targets across a large range of expression. We demonstrate excellent reproducibility (R² = 0.997) with 97% accuracy to detect 2-fold change using External RNA Controls Consortium (ERCC) reference materials; high inter-day, inter-site and inter-library concordance (R² = 0.97–0.99) using FDA Sequencing Quality Control (SEQC) reference materials; and cross-platform concordance with both TaqMan qPCR (R² = 0.96) and whole transcriptome RNA-sequencing following “traditional” library preparation using Illumina NGS kits (R² = 0.94). Using this method, the sequencing reads required to accurately quantify more than 100 targeted transcripts expressed over a 10⁷-fold range were reduced more than 10,000-fold, from 2.3×10⁹ to 1.4×10⁵ sequencing reads. 
These studies demonstrate that the competitive multiplex-PCR amplicon library preparation method presented here provides the quality control, reproducibility, and reduced sequencing reads necessary for development and implementation of targeted quantitative RNA-sequencing biomarkers in molecular diagnostic testing. PMID:24236095
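The principle of measuring a native template relative to a known number of internal standard copies can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: the function name and the example read counts are hypothetical, and it assumes the native and competitive templates amplify with equal efficiency so that their read ratio reflects their starting copy ratio. The final lines check the fold reduction in reads, taking the two read totals in the abstract as 2.3×10^9 and 1.4×10^5.

```python
def native_copies(native_reads, standard_reads, standard_copies):
    """Estimate native template copies from a competitive internal standard.

    Assumes equal amplification efficiency of native and standard templates,
    so that native_copies / standard_copies == native_reads / standard_reads.
    """
    if standard_reads <= 0:
        raise ValueError("internal standard must have at least one read")
    return native_reads / standard_reads * standard_copies


# Hypothetical example: 300 native reads, 100 standard reads,
# 1000 internal standard copies spiked into the reaction.
estimate = native_copies(native_reads=300, standard_reads=100,
                         standard_copies=1000)
print(f"estimated native copies: {estimate:.0f}")

# Fold reduction in reads, using the totals quoted in the abstract.
fold_reduction = 2.3e9 / 1.4e5
print(f"fold reduction in reads: {fold_reduction:,.0f}")
```

Because the estimate depends only on a ratio within each target, it stays usable when amplicons are driven toward equimolar concentrations during library preparation, which is the normalization step the abstract describes.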
Transcriptome analyses based on genetic screens for Pax3 myogenic targets in the mouse embryo
2010-01-01
Background: Pax3 is a key upstream regulator of the onset of myogenesis, controlling progenitor cell survival and behaviour as well as entry into the myogenic programme. It functions in the dermomyotome of the somite from which skeletal muscle derives and in progenitor cell populations that migrate from the somite such as those of the limbs. Few Pax3 target genes have been identified. Identifying genes that lie genetically downstream of Pax3 is therefore an important endeavour in elucidating the myogenic gene regulatory network. Results: We have undertaken a screen in the mouse embryo which employs a Pax3GFP allele that permits isolation of Pax3 expressing cells by flow cytometry and a Pax3PAX3-FKHR allele that encodes PAX3-FKHR in which the DNA binding domain of Pax3 is fused to the strong transcriptional activation domain of FKHR. This constitutes a gain of function allele that rescues the Pax3 mutant phenotype. Microarray comparisons were carried out between Pax3GFP/+ and Pax3GFP/PAX3-FKHR preparations from the hypaxial dermomyotome of somites at E9.5 and forelimb buds at E10.5. A further transcriptome comparison between Pax3-GFP positive and negative cells identified sequences specific to myogenic progenitors in the forelimb buds. Potential Pax3 targets, based on changes in transcript levels on the gain of function genetic background, were validated by analysis on loss or partial loss of function Pax3 mutant backgrounds. Sequences that are up- or down-regulated in the presence of PAX3-FKHR are classified as somite only, somite and limb or limb only. The latter should not contain sequences from Pax3 positive neural crest cells which do not invade the limbs. Verification by whole mount in situ hybridisation distinguishes myogenic markers. Presentation of potential Pax3 target genes focuses on signalling pathways and on transcriptional regulation. 
Conclusions: Pax3 orchestrates many of the signalling pathways implicated in the activation or repression of myogenesis by regulating effectors and also, notably, inhibitors of these pathways. Important transcriptional regulators of myogenesis are candidate Pax3 targets. Myogenic determination genes, such as Myf5, are controlled positively, whereas the effect of Pax3 on genes encoding inhibitors of myogenesis provides a potential brake on differentiation. In the progenitor cell population, Pax7 and also Hdac5, which is a potential repressor of Foxc2, are subject to positive control by Pax3. PMID:21143873
Sequences show rapid motor transfer and spatial translation in the oculomotor system.
Stainer, Matthew J; Carpenter, R H S; Brotchie, Peter; Anderson, Andrew J
2016-07-01
Every day we perform learnt sequences of actions that seem to happen almost without awareness. It has been argued that for learning such sequences parallel learning networks exist - one using spatial coordinates and one using motor coordinates - with sequence acquisition involving a progressive shift from the former to the latter as a sequence is rehearsed. When sequences are interrupted by an out-of-sequence target, there is a delay in the response to the target, and so here we transiently interrupt oculomotor sequences to probe the influence of oculomotor rehearsal and spatial coordinates in sequence acquisition. For our main experiments, we used a repeating sequence eight targets in length that was first learnt either using saccadic eye movements (left/right), manual responses (left/right or up/down) or as a sequence of colour (blue/red) requiring no motor response. The sequence was immediately repeated for saccadic eye movements, during which the influence of an out-of-sequence target (an interruption) was assessed. When a sequence is learnt beforehand in an abstract way (for example, as a sequence of colours or of orthogonally mapped manual responses), interruptions are immediately disruptive to latency, suggesting neither motor rehearsal nor specific spatial coordinates are essential for encoding sequences of actions and that sequences - no matter how they are encoded - can be rapidly translated into oculomotor coordinates. The magnitude of a disruption does, however, correspond to how well a sequence is learnt: introducing an interruption to an extended sequence before it was reliably learnt reduces the magnitude of the latency disruption. Copyright © 2016 Elsevier Ltd. All rights reserved.
Pool, John E.; Corbett-Detig, Russell B.; Sugino, Ryuichi P.; Stevens, Kristian A.; Cardeno, Charis M.; Crepeau, Marc W.; Duchen, Pablo; Emerson, J. J.; Saelao, Perot; Begun, David J.; Langley, Charles H.
2012-01-01
Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. 
Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations. PMID:23284287
Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin
2016-01-01
The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and that considers both FST and difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. PMID:27172202
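The two per-SNP quantities compared in this abstract, FST and the allele-frequency difference between the two domains, can be illustrated with a minimal sketch. It assumes a single biallelic SNP, two equally weighted pools, and a heterozygosity-based (Nei-style) definition of FST; the abstract does not say which estimator the authors used, and the function names are hypothetical.

```python
def fst_two_pools(p1: float, p2: float) -> float:
    """Heterozygosity-based FST for one biallelic SNP.

    p1, p2 are the frequencies of the same allele in two pools that are
    weighted equally (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)           # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0                                   # site is monomorphic overall
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def allele_freq_difference(p1: float, p2: float) -> float:
    """Absolute difference in allele frequency between the two pools."""
    return abs(p1 - p2)


if __name__ == "__main__":
    # Illustrative frequencies only, not data from the study.
    print(fst_two_pools(0.2, 0.8), allele_freq_difference(0.2, 0.8))
```

A SNP with a large frequency difference and a high FST relative to the silent-site distribution would be a candidate for diversifying selection under this kind of screen.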
Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin
2016-07-07
The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and that considers both FST and difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
Molecular genotyping of Trypanosoma cruzi for lineage assignment and population genetics.
Messenger, Louisa A; Yeo, Matthew; Lewis, Michael D; Llewellyn, Martin S; Miles, Michael A
2015-01-01
Trypanosoma cruzi, the etiological agent of Chagas disease, remains a major public health problem in Latin America. Infection with T. cruzi is lifelong and can lead to a spectrum of pathological sequelae ranging from subclinical to lethal cardiac and/or gastrointestinal complications. Isolates of T. cruzi can be assigned to six genetic lineages or discrete typing units (DTUs), which are broadly associated with disparate ecologies, transmission cycles, and geographical distributions. This extensive genetic diversity is also believed to contribute to the clinical variation observed among chagasic patients. Unravelling the population structure of T. cruzi is fundamental to understanding Chagas disease epidemiology, developing control strategies, and resolving the relationship between parasite genotype and clinical prognosis. To date, no single, widely validated, genetic target allows unequivocal resolution to DTU-level. In this chapter we present standardized methods for strain DTU assignment using PCR-restriction fragment length polymorphism analysis (PCR-RFLP) and nuclear multilocus sequence typing (MLST). PCR-RFLPs have the advantages of simplicity and reproducibility, requiring limited expertise and few laboratory consumables. MLST data are more laborious to generate but more informative; DNA sequences are readily transferable between research groups and amenable to recombination detection and intra-lineage analyses. We also recommend a mitochondrial (maxicircle) MLST scheme and a panel of 28 microsatellite loci for higher resolution population genetics studies. Due to the scarcity of T. cruzi in blood and tissue, all of these genotyping techniques have limited sensitivity when applied directly to clinical or biological specimens, particularly when targets are single (MLST) or low copy number (PCR-RFLPs). We therefore describe essential protocols to isolate parasites, derive biological clones, and extract T. cruzi genomic DNA from field and clinical samples.
Leydet, Karine Posbic; Grupstra, Carsten G B; Coma, Rafel; Ribes, Marta; Hellberg, Michael E
2018-06-01
Many organisms are expanding their ranges in response to changing environmental conditions. Understanding the patterns of genetic diversity and adaptation along an expansion front is crucial to assessing a species' long-term success. While next-generation sequencing techniques can reveal these changes in fine detail, ascribing them to a particular species can be difficult for organisms that live in close association with symbionts. Using a novel modified restriction site-associated DNA sequencing (RAD-Seq) protocol to target coral DNA, we collected 595 coral-specific single nucleotide polymorphisms from 189 colonies of the invasive coral Oculina patagonica from the Spanish Mediterranean coast, including established core populations and two expansion fronts. Surprisingly, populations from the recent northern expansion are genetically distinct from the westward expansion and core populations and also harbour greater genetic diversity. We found that temperature may have driven adaptation along the northern expansion, as genome scans for selection found three candidate loci associated with temperature in the north but none in the west. We found no genomic signature of selection associated with artificial substrate, which has been proposed for explaining the rapid spread of O. patagonica. This suggests that this coral is simply an opportunistic colonizer of free space made available by coastal habitat modifications. Our results suggest that unique genetic variation, possibly due to limited dispersal across the Ibiza Channel, an influx of individuals from different depths and/or adaptation to cooler temperatures along the northern expansion front may have facilitated the northward range expansion of O. patagonica in the western Mediterranean. © 2018 John Wiley & Sons Ltd.
Johnson, Jennifer L.; Wittgenstein, Helena; Mitchell, Sharon E.; Hyma, Katie E.; Temnykh, Svetlana V.; Kharlamova, Anastasiya V.; Gulevich, Rimma G.; Vladimirova, Anastasiya V.; Fong, Hiu Wa Flora; Acland, Gregory M.; Trut, Lyudmila N.; Kukekova, Anna V.
2015-01-01
The silver fox (Vulpes vulpes) offers a novel model for studying the genetics of social behavior and animal domestication. Selection of foxes, separately, for tame and for aggressive behavior has yielded two strains with markedly different, genetically determined, behavioral phenotypes. Tame strain foxes are eager to establish human contact while foxes from the aggressive strain are aggressive and difficult to handle. These strains have been maintained as separate outbred lines for over 40 generations but their genetic structure has not been previously investigated. We applied a genotyping-by-sequencing (GBS) approach to provide insights into the genetic composition of these fox populations. Sequence analysis of EcoT22I genomic libraries of tame and aggressive foxes identified 48,294 high quality SNPs. Population structure analysis revealed genetic divergence between the two strains and more diversity in the aggressive strain than in the tame one. Significant differences in allele frequency between the strains were identified for 68 SNPs. Three of these SNPs were located on fox chromosome 14 within an interval of a previously identified behavioral QTL, further supporting the importance of this region for behavior. The GBS SNP data confirmed that significant genetic diversity has been preserved in both fox populations despite many years of selective breeding. Analysis of SNP allele frequencies in the two populations identified several regions of genetic divergence between the tame and aggressive foxes, some of which may represent targets of selection for behavior. The GBS protocol used in this study significantly expanded genomic resources for the fox, and can be adapted for SNP discovery and genotyping in other canid species. PMID:26061395
Johnson, Jennifer L; Wittgenstein, Helena; Mitchell, Sharon E; Hyma, Katie E; Temnykh, Svetlana V; Kharlamova, Anastasiya V; Gulevich, Rimma G; Vladimirova, Anastasiya V; Fong, Hiu Wa Flora; Acland, Gregory M; Trut, Lyudmila N; Kukekova, Anna V
2015-01-01
The silver fox (Vulpes vulpes) offers a novel model for studying the genetics of social behavior and animal domestication. Selection of foxes, separately, for tame and for aggressive behavior has yielded two strains with markedly different, genetically determined, behavioral phenotypes. Tame strain foxes are eager to establish human contact while foxes from the aggressive strain are aggressive and difficult to handle. These strains have been maintained as separate outbred lines for over 40 generations but their genetic structure has not been previously investigated. We applied a genotyping-by-sequencing (GBS) approach to provide insights into the genetic composition of these fox populations. Sequence analysis of EcoT22I genomic libraries of tame and aggressive foxes identified 48,294 high quality SNPs. Population structure analysis revealed genetic divergence between the two strains and more diversity in the aggressive strain than in the tame one. Significant differences in allele frequency between the strains were identified for 68 SNPs. Three of these SNPs were located on fox chromosome 14 within an interval of a previously identified behavioral QTL, further supporting the importance of this region for behavior. The GBS SNP data confirmed that significant genetic diversity has been preserved in both fox populations despite many years of selective breeding. Analysis of SNP allele frequencies in the two populations identified several regions of genetic divergence between the tame and aggressive foxes, some of which may represent targets of selection for behavior. The GBS protocol used in this study significantly expanded genomic resources for the fox, and can be adapted for SNP discovery and genotyping in other canid species.
NASA Astrophysics Data System (ADS)
Qiu, Tian; Guo, Huiqin; Zhao, Huan; Wang, Luhua; Zhang, Zhihui
2015-06-01
Identification of multi-gene variations has led to the development of new targeted therapies in lung adenocarcinoma patients, and identification of an appropriate patient population with a reliable screening method is the key to the overall success of tumor targeted therapies. In this study, we used the Ion Torrent next-generation sequencing (NGS) technique to screen for mutations in 89 cases of lung adenocarcinoma metastatic lymph node specimens obtained by fine-needle aspiration cytology (FNAC). Of the 89 specimens, 30 (34%) were found to harbor epidermal growth factor receptor (EGFR) kinase domain mutations. Seven (8%) samples harbored KRAS mutations, and three (3%) samples had BRAF mutations involving exon 11 (G469A) and exon 15 (V600E). Eight (9%) samples harbored PIK3CA mutations. One (1%) sample had a HRAS G12C mutation. Thirty-two (36%) samples harbored TP53 mutations. Mutations in other genes, including APC, ATM, MET, PTPN11, GNAS, HRAS, RB1, SMAD4 and STK11, were each found in one case. Our study has demonstrated that NGS using the Ion Torrent technology is a useful tool for gene mutation screening in lung adenocarcinoma metastatic lymph node specimens obtained by FNAC, and may promote the development of new targeted therapies in lung adenocarcinoma patients.
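The percentages quoted in this abstract follow from the reported counts out of 89 specimens. The short sketch below recomputes them; the counts are taken from the abstract, while the dictionary layout and the function name are only illustrative.

```python
TOTAL_SPECIMENS = 89

# Mutation-positive specimen counts as reported in the abstract.
counts = {"EGFR": 30, "KRAS": 7, "BRAF": 3, "PIK3CA": 8, "HRAS": 1, "TP53": 32}


def percent_of_total(count: int, total: int = TOTAL_SPECIMENS) -> int:
    """Share of specimens, rounded to the nearest whole percent."""
    return round(100.0 * count / total)


if __name__ == "__main__":
    for gene, n in counts.items():
        print(f"{gene}: {n}/{TOTAL_SPECIMENS} = {percent_of_total(n)}%")
```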
Sanderson, Mark I.; Simmons, James A.
2005-11-01
Echolocating big brown bats (Eptesicus fuscus) emit trains of frequency-modulated (FM) biosonar signals whose duration, repetition rate, and sweep structure change systematically during interception of prey. When stimulated with a 2.5-s sequence of 54 FM pulse-echo pairs that mimic sounds received during search, approach, and terminal stages of pursuit, single neurons (N=116) in the bat's inferior colliculus (IC) register the occurrence of a pulse or echo with an average of <1 spike/sound. Individual IC neurons typically respond to only a segment of the search or approach stage of pursuit, with fewer neurons persisting to respond in the terminal stage. Composite peristimulus-time-histogram plots of responses assembled across the whole recorded population of IC neurons depict the delay of echoes and, hence, the existence and distance of the simulated biosonar target, entirely as on-response latencies distributed across time. Correlated changes in pulse duration, repetition rate, and pulse or echo amplitude do modulate the strength of responses (probability of the single spike actually occurring for each sound), but registration of the target itself remains confined exclusively to the latencies of single spikes across cells. Modeling of echo processing in FM biosonar should emphasize spike-time algorithms to explain the content of biosonar images.
Ousterout, David G; Kabadi, Ami M; Thakore, Pratiksha I; Perez-Pinera, Pablo; Brown, Matthew T; Majoros, William H; Reddy, Timothy E; Gersbach, Charles A
2015-01-01
Duchenne muscular dystrophy (DMD) is caused by genetic mutations that result in the absence of dystrophin protein expression. Oligonucleotide-induced exon skipping can restore the dystrophin reading frame and protein production. However, this requires continuous drug administration and may not generate complete skipping of the targeted exon. In this study, we apply genome editing with zinc finger nucleases (ZFNs) to permanently remove essential splicing sequences in exon 51 of the dystrophin gene and thereby exclude exon 51 from the resulting dystrophin transcript. This approach can restore the dystrophin reading frame in ~13% of DMD patient mutations. Transfection of two ZFNs targeted to sites flanking the exon 51 splice acceptor into DMD patient myoblasts led to deletion of this genomic sequence. A clonal population was isolated with this deletion and following differentiation we confirmed loss of exon 51 from the dystrophin mRNA transcript and restoration of dystrophin protein expression. Furthermore, transplantation of corrected cells into immunodeficient mice resulted in human dystrophin expression localized to the sarcolemmal membrane. Finally, we quantified ZFN toxicity in human cells and mutagenesis at predicted off-target sites. This study demonstrates a powerful method to restore the dystrophin reading frame and protein expression by permanently deleting exons. PMID:25492562
RISC RNA sequencing for context-specific identification of in vivo microRNA targets.
Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W
2011-01-07
MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. We set out to develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1645 mRNAs consistently targeted to mouse cardiac RISCs. We used this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing "seed" sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context and is applicable to any tissue and any disease state.
Yamada, Takeshi; Abei, Masato; Danjoh, Inaho; Shirota, Ryoko; Yamashita, Taro; Hyodo, Ichinosuke; Nakamura, Yukio
2015-04-11
Cancer stem cell (CSC) research has highlighted the necessity of developing drugs targeting CSCs. We investigated a hepatocellular carcinoma (HCC) cell line that not only has CSC hierarchy but also shows phenotypic changes (population changes) upon differentiation of CSC during culture and can be used for screening drugs targeting CSC. Based on a hypothesis that the CSC proportion should decrease upon its differentiation into progenitors (population change), we tested HCC cell lines (HuH-7, Li-7, PLC/PRF/5, HLF, HLE) before and after 2 months culture for several markers (CD13, EpCAM, CD133, CD44, CD90, CD24, CD166). Tumorigenicity was tested using nude mice. To evaluate the CSC hierarchy, we investigated reconstructivity, proliferation, ALDH activity, spheroid formation, chemosensitivity and microarray analysis of the cell populations sorted by FACS. Only Li-7 cells showed a population change during culture: the proportion of CD13 positive cells decreased, while that of CD166 positive cells increased. The high tumorigenicity of the Li-7 was lost after the population change. CD13(+)/CD166(-) cells showed slow growth and reconstructed the bulk Li-7 populations composed of CD13(+)/CD166(-), CD13(-)/CD166(-) and CD13(-)/CD166(+) fractions, whereas CD13(-)/CD166(+) cells showed rapid growth but could not reproduce any other population. CD13(+)/CD166(-) cells showed high ALDH activity, spheroid forming ability and resistance to 5-fluorouracil. Microarray analysis demonstrated higher expression of stemness-related genes in CD166(-) than CD166(+) fraction. These results indicated a hierarchy in Li-7 cells, in which CD13(+)/CD166(-) and CD13(-)/CD166(+) cells serve as slow growing CSCs and rapid growing progenitors, respectively. Sorafenib selectively targeted the CD166(-) fraction, including CD13(+) CSCs, which exhibited higher mRNA expression for FGF3 and FGF4, candidate biomarkers for sorafenib. 
5-fluorouracil followed by sorafenib inhibited the growth of bulk Li-7 cells more effectively than the reverse sequence or either alone. We identified a unique HCC line, Li-7, which not only shows heterogeneity for a CD13(+) CSC hierarchy, but also undergoes a "population change" upon CSC differentiation. Sorafenib targeted the CSC in vitro, supporting the use of this model for screening drugs targeting the CSC. This type of "heterogeneous, unstable" cell line may prove more useful in the CSC era than conventional "homogeneous, stable" cell lines.
UniDrug-target: a computational tool to identify unique drug targets in pathogenic bacteria.
Chanumolu, Sree Krishna; Rout, Chittaranjan; Chauhan, Rajinder S
2012-01-01
Targeting conserved proteins of bacteria through antibacterial medications has resulted in both the development of resistant strains and changes to human health by destroying beneficial microbes which eventually become breeding grounds for the evolution of resistances. Despite the availability of more than 800 genome sequences, 430 pathways, 4743 enzymes, 9257 metabolic reactions and three-dimensional (3D) protein structures in bacteria, no pathogen-specific computational drug target identification tool has been developed. A web server, UniDrug-Target, which combines bacterial biological information and computational methods to stringently identify pathogen-specific proteins as drug targets, has been designed. Besides predicting pathogen-specific proteins' essentiality, chokepoint property, etc., three new algorithms were developed and implemented by using protein sequences, domains, structures, and metabolic reactions for construction of partial metabolic networks (PMNs), determination of conservation in critical residues, and variation analysis of residues forming similar cavities in protein sequences. First, PMNs are constructed to determine the extent of disturbances in metabolite production caused by targeting a protein as a drug target. Second, conservation of a pathogen-specific protein's critical residues involved in cavity formation and biological function is determined at domain level with low-matching sequences. Last, variation analysis of residues forming similar cavities in protein sequences from pathogenic versus non-pathogenic bacteria and humans is performed. The server is capable of predicting drug targets for any sequenced pathogenic bacteria having fasta sequences and annotated information. The utility of the UniDrug-Target server was demonstrated for Mycobacterium tuberculosis (H37Rv). UniDrug-Target identified 265 mycobacteria pathogen-specific proteins, including 17 essential proteins which can be potential drug targets. 
UniDrug-Target is expected to accelerate pathogen-specific drug target identification, which should increase the success and durability of such targets, as drugs developed against them have less chance of eliciting resistance or an adverse impact on the environment. The server is freely available at http://117.211.115.67/UDT/main.html. The standalone application (source codes) is available at http://www.bioinformatics.org/ftp/pub/bioinfojuit/UDT.rar.
Moyle, Richard L.; Carvalhais, Lilia C.; Pretorius, Lara-Simone; Nowak, Ekaterina; Subramaniam, Gayathery; Dalton-Morgan, Jessica; Schenk, Peer M.
2017-01-01
Studies investigating the action of small RNAs on computationally predicted target genes require some form of experimental validation. Classical molecular methods of validating microRNA action on target genes are laborious, while approaches that tag predicted target sequences to qualitative reporter genes encounter technical limitations. The aim of this study was to address the challenge of experimentally validating large numbers of computationally predicted microRNA-target transcript interactions using an optimized, quantitative, cost-effective, and scalable approach. The presented method combines transient expression via agroinfiltration of Nicotiana benthamiana leaves with a quantitative dual luciferase reporter system, where firefly luciferase is used to report the microRNA-target sequence interaction and Renilla luciferase is used as an internal standard to normalize expression between replicates. We report the appropriate concentration of N. benthamiana leaf extracts and dilution factor to apply in order to avoid inhibition of firefly LUC activity. Furthermore, the optimal ratio of microRNA precursor expression construct to reporter construct and duration of the incubation period post-agroinfiltration were determined. The optimized dual luciferase assay provides an efficient, repeatable and scalable method to validate and quantify microRNA action on predicted target sequences. The optimized assay was used to validate five predicted targets of rice microRNA miR529b, with as few as six technical replicates. The assay can be extended to assess other small RNA-target sequence interactions, including assessing the functionality of an artificial miRNA or an RNAi construct on a targeted sequence. PMID:28979287
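The normalization step described in this abstract, with firefly luciferase as the reporter and Renilla luciferase as the internal standard, can be sketched as below. This is a minimal illustration, not the authors' analysis pipeline: the function names are hypothetical, and expressing the result as the ratio of mean normalized signals (microRNA-expressing samples versus controls) is an assumption of the sketch.

```python
from statistics import mean
from typing import List, Sequence


def normalized_ratios(firefly: Sequence[float], renilla: Sequence[float]) -> List[float]:
    """Per-replicate firefly/Renilla ratio (Renilla serves as internal standard)."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla must have one reading per replicate")
    return [f / r for f, r in zip(firefly, renilla)]


def relative_activity(treated: Sequence[float], control: Sequence[float]) -> float:
    """Mean normalized signal with the microRNA relative to the control.

    Values below 1 would be consistent with repression of the reporter.
    """
    return mean(treated) / mean(control)


if __name__ == "__main__":
    # Made-up luminometer readings for six technical replicates per condition.
    control = normalized_ratios([900, 950, 1010, 980, 940, 1000],
                                [100, 102, 105, 101, 99, 103])
    treated = normalized_ratios([410, 450, 430, 400, 440, 420],
                                [98, 104, 101, 97, 103, 100])
    print(relative_activity(treated, control))
```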
Detecting Signatures of Positive Selection along Defined Branches of a Population Tree Using LSD.
Librado, Pablo; Orlando, Ludovic
2018-06-01
Identifying the genomic basis underlying local adaptation is paramount to evolutionary biology, and bears many applications in the fields of conservation biology, crop, and animal breeding, as well as personalized medicine. Although many approaches have been developed to detect signatures of positive selection within single populations and population pairs, the increasing wealth of high-throughput sequencing data requires improved methods capable of handling multiple, and ideally large number of, populations in a single analysis. In this study, we introduce LSD (levels of exclusively shared differences), a fast and flexible framework to perform genome-wide selection scans, along the internal and external branches of a given population tree. We use forward simulations to demonstrate that LSD can identify branches targeted by positive selection with remarkable sensitivity and specificity. We illustrate a range of potential applications by analyzing data from the 1000 Genomes Project and uncover a list of adaptive candidates accompanying the expansion of anatomically modern humans out of Africa and their spread to Europe.
Research progress of plant population genomics based on high-throughput sequencing.
Wang, Yun-sheng
2016-08-01
Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific effects and genome-wide effects using genome-wide genotyping of polymorphic sites. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly, and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effects, demographic history and molecular mechanisms of complex traits of relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as existing problems of plant population genomics in order to provide references for related studies.
Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi
2011-09-01
Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.
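The three methylation contexts named in this abstract (CG, CHG and CHH, where H = A, T or C) are defined by the two bases that follow a cytosine. The sketch below classifies a cytosine on the given strand; it is an illustration of the definition only, the function name is hypothetical, and it ignores the reverse strand.

```python
from typing import Optional

H_BASES = "ATC"  # H = A, T or C, as defined in the abstract


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i of seq.

    Returns None when the context cannot be determined, for example near
    the end of the sequence or when an ambiguous base such as N follows.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position i is not a cytosine")
    following = seq[i + 1:i + 3]
    if len(following) >= 1 and following[0] == "G":
        return "CG"
    if len(following) < 2 or following[0] not in H_BASES:
        return None
    if following[1] == "G":
        return "CHG"
    if following[1] in H_BASES:
        return "CHH"
    return None


if __name__ == "__main__":
    for s in ("CGA", "CAG", "CTT"):
        print(s, cytosine_context(s, 0))
```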
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.
Barnes, W M; Bevan, M
1983-01-01
A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723
Kugelman, Jeffrey R; Sanchez-Lockhart, Mariano; Andersen, Kristian G; Gire, Stephen; Park, Daniel J; Sealfon, Rachel; Lin, Aaron E; Wohl, Shirlee; Sabeti, Pardis C; Kuhn, Jens H; Palacios, Gustavo F
2015-01-20
Until recently, Ebola virus (EBOV) was a rarely encountered human pathogen that caused disease among small populations with extraordinarily high lethality. At the end of 2013, EBOV initiated an unprecedented disease outbreak in West Africa that is still ongoing and has already caused thousands of deaths. Recent studies revealed the genomic changes this particular EBOV variant undergoes over time during human-to-human transmission. Here we highlight the genomic changes that might negatively impact the efficacy of currently available EBOV sequence-based candidate therapeutics, such as small interfering RNAs (siRNAs), phosphorodiamidate morpholino oligomers (PMOs), and antibodies. Ten of the observed mutations modify the sequence of the binding sites of monoclonal antibody (MAb) 13F6, MAb 1H3, MAb 6D8, MAb 13C6, and siRNA EK-1, VP24, and VP35 targets and might influence the binding efficacy of the sequence-based therapeutics, suggesting that their efficacy should be reevaluated against the currently circulating strain. Copyright © 2015 Kugelman, et al.
Application of single-cell sequencing in human cancer.
Rantalainen, Mattias
2017-11-02
Precision medicine is emerging as a cornerstone of future cancer care with the objective of providing targeted therapies based on the molecular phenotype of each individual patient. Traditional bulk-level molecular phenotyping of tumours leads to significant information loss, as the molecular profile represents an average phenotype over large numbers of cells, while cancer is a disease with inherent intra-tumour heterogeneity at the cellular level caused by several factors, including clonal evolution, tissue hierarchies, rare cells and dynamic cell states. Single-cell sequencing provides means to characterize heterogeneity in a large population of cells and opens up opportunity to determine key molecular properties that influence clinical outcomes, including prognosis and probability of treatment response. Single-cell sequencing methods are now reliable enough to be used in many research laboratories, and we are starting to see applications of these technologies for characterization of human primary cancer cells. In this review, we provide an overview of studies that have applied single-cell sequencing to characterize human cancers at the single-cell level, and we discuss some of the current challenges in the field. © The Author 2017. Published by Oxford University Press.
Identification and genetic analysis of cancer cells with PCR-activated cell sorting
Eastburn, Dennis J.; Sciambi, Adam; Abate, Adam R.
2014-01-01
Cell sorting is a central tool in life science research for analyzing cellular heterogeneity or enriching rare cells out of large populations. Although methods like FACS and FISH-FC can characterize and isolate cells from heterogeneous populations, they are limited by their reliance on antibodies, or the requirement to chemically fix cells. We introduce a new cell sorting technology that robustly sorts based on sequence-specific analysis of cellular nucleic acids. Our approach, PCR-activated cell sorting (PACS), uses TaqMan PCR to detect nucleic acids within single cells and trigger their sorting. With this method, we identified and sorted prostate cancer cells from a heterogeneous population by performing >132 000 simultaneous single-cell TaqMan RT-PCR reactions targeting vimentin mRNA. Following vimentin-positive droplet sorting and downstream analysis of recovered nucleic acids, we found that cancer-specific genomes and transcripts were significantly enriched. Additionally, we demonstrate that PACS can be used to sort and enrich cells via TaqMan PCR reactions targeting single-copy genomic DNA. PACS provides a general new technical capability that expands the application space of cell sorting by enabling sorting based on cellular information not amenable to existing approaches. PMID:25030902
Chen, Liang; Huang, Linzhou; Min, Donghong; Phillips, Andy; Wang, Shiqiang; Madgwick, Pippa J; Parry, Martin A J; Hu, Yin-Gang
2012-01-01
Mutagenesis is an important tool in crop improvement. However, the hexaploid genome of wheat (Triticum aestivum L.) presents problems in identifying desirable genetic changes based on phenotypic screening due to gene redundancy. TILLING (Targeting Induced Local Lesions IN Genomes), a powerful reverse genetic strategy that allows the detection of induced point mutations in individuals of the mutagenized populations, can address the major challenge of linking sequence information to the biological function of genes and can also identify novel variation for crop breeding. Wheat is especially well-suited for TILLING due to the high mutation densities tolerated by polyploids. However, only a few wheat TILLING populations are currently available in the world, which is far from satisfying the requirement of researchers and breeders in different growing environments. In addition, current TILLING screening protocols require costly fluorescence detection systems, limiting their use, especially in developing countries. We developed a new TILLING resource comprising 2610 M(2) mutants in a common wheat cultivar 'Jinmai 47'. Numerous phenotypes with altered morphological and agronomic traits were observed from the M(2) and M(3) lines in the field. To simplify the procedure and decrease costs, we use unlabeled primers and either non-denaturing polyacrylamide gels or agarose gels for mutation detection. The value of this new resource was tested using PCR with RAPD and Intron-spliced junction (ISJ) primers, and also TILLING in three selected candidate genes, in 300 and 512 mutant lines, revealing high mutation densities of 1/34 kb by RAPD/ISJ analysis and 1/47 kb by TILLING. In total, 31 novel alleles were identified in the 3 targeted genes and confirmed by sequencing. The results indicate that this mutant population represents a useful resource for the wheat research community. 
We hope that the use of this reverse genetics resource will provide novel allelic diversity for wheat improvement and functional genomics.
Temperature influences the level of glyphosate resistance in barnyardgrass (Echinochloa colona).
Nguyen, Thai Hoan; Malone, Jenna M; Boutsalis, Peter; Shirley, Neil; Preston, Christopher
2016-05-01
Echinochloa colona is an important summer-growing weed species in cropping regions of northern Australia that has evolved resistance to glyphosate owing to intensive use of this herbicide in summer fallow. Pot trials conducted at 20 and 30 °C on six E. colona populations showed a significant increase in the level of glyphosate resistance in resistant populations at 30 °C compared with 20 °C. However, there was no influence of growth temperature on glyphosate susceptibility of the sensitive population. Sequencing of the target-site gene (EPSPS) of the six populations identified a mutation at position 106 leading to a change from proline to serine in the most resistant population A533.1 only. EPSPS gene amplification was not detected in any of the resistant populations examined. Examining (14) C-glyphosate uptake on two resistant and one susceptible population showed a twofold increase at 20 °C; however, few differences in glyphosate translocation occurred from the treated leaf to other plant parts between populations or temperatures. There is reduced efficacy of glyphosate at high temperatures on resistant E. colona populations, making these populations harder to control in summer. © 2015 Society of Chemical Industry.
Jin, Sheng Chih; Benitez, Bruno A; Deming, Yuetiva; Cruchaga, Carlos
2016-01-01
Analyses of genome-wide association studies (GWAS) for complex disorders usually identify common variants with a relatively small effect size that only explain a small proportion of phenotypic heritability. Several studies have suggested that a significant fraction of heritability may be explained by low-frequency (minor allele frequency (MAF) of 1-5%) and rare variants that are not contained in the commercial GWAS genotyping arrays (Schork et al., Curr Opin Genet Dev 19:212, 2009). Rare variants can also have relatively large effects on risk for developing human diseases or disease phenotype (Cruchaga et al., PLoS One 7:e31039, 2012). However, it is necessary to perform next-generation sequencing (NGS) studies in a large population (>4,000 samples) to detect a significant rare-variant association. Several NGS methods, such as custom capture sequencing and amplicon-based sequencing, are designed to screen a small proportion of the genome, but most of these methods are limited in the number of samples that can be multiplexed (i.e. most sequencing kits only provide 96 distinct indices). Additionally, the sequencing library preparation for 4,000 samples remains expensive, and thus conducting NGS studies with the aforementioned methods is not feasible for most research laboratories. The need for low-cost large-scale rare-variant detection makes pooled-DNA sequencing an ideally efficient and cost-effective technique to identify rare variants in target regions by sequencing hundreds to thousands of samples. Our recent work has demonstrated that pooled-DNA sequencing can accurately detect rare variants in targeted regions in multiple DNA samples with high sensitivity and specificity (Jin et al., Alzheimers Res Ther 4:34, 2012). 
In these studies we used a well-established pooled-DNA sequencing approach and a computational package, SPLINTER (short indel prediction by large deviation inference and nonlinear true frequency estimation by recursion) (Vallania et al., Genome Res 20:1711, 2010), for accurate identification of rare variants in large DNA pools. Given an average sequencing coverage of 30× per haploid genome, SPLINTER can detect rare variants and short indels up to 4 base pairs (bp) with high sensitivity and specificity (up to 1 haploid allele in a pool as large as 500 individuals). Step-by-step instructions on how to conduct pooled-DNA sequencing experiments and data analyses are described in this chapter.
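The pool-size figures quoted above imply some simple arithmetic, sketched here in Python. This is only an illustration of the stated numbers (30× per haploid genome, one allele in a pool of 500 diploid individuals), not part of SPLINTER; the function name and the returned keys are hypothetical.

```python
def pool_summary(individuals, coverage_per_haploid, ploidy=2):
    """Back-of-the-envelope numbers for a pooled-DNA sequencing design.

    Hypothetical helper: it only restates the arithmetic implied by the
    abstract and makes no claim about how SPLINTER works internally.
    """
    haploid_genomes = individuals * ploidy
    # Total depth needed at each position of the target region.
    total_coverage = haploid_genomes * coverage_per_haploid
    # Frequency of a variant carried by exactly one haploid genome.
    singleton_frequency = 1 / haploid_genomes
    # Average number of reads expected to carry that singleton variant.
    expected_variant_reads = total_coverage / haploid_genomes
    return {
        "haploid_genomes": haploid_genomes,
        "total_coverage": total_coverage,
        "singleton_frequency": singleton_frequency,
        "expected_variant_reads": expected_variant_reads,
    }


summary = pool_summary(individuals=500, coverage_per_haploid=30)
print(summary)
```

Under these assumptions, a single rare allele in a 500-person pool sits at a frequency of 1 in 1,000 and is expected to appear in about as many reads as the per-haploid coverage. This is why the per-haploid depth, rather than the pool size alone, governs detectability.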
Application of industrial scale genomics to discovery of therapeutic targets in heart failure.
Mehraban, F; Tomlinson, J E
2001-12-01
In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.
Deep whole-genome sequencing of 100 southeast Asian Malays.
Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying
2013-01-10
Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Deep Whole-Genome Sequencing of 100 Southeast Asian Malays
Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying
2013-01-01
Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies. PMID:23290073
Wolfs, Jason M; Hamilton, Thomas A; Lant, Jeremy T; Laforet, Marcon; Zhang, Jenny; Salemi, Louisa M; Gloor, Gregory B; Schild-Poulter, Caroline; Edgell, David R
2016-12-27
The CRISPR/Cas9 nuclease is commonly used to make gene knockouts. The blunt DNA ends generated by cleavage can be efficiently ligated by the classical nonhomologous end-joining repair pathway (c-NHEJ), regenerating the target site. This repair creates a cycle of cleavage, ligation, and target site regeneration that persists until sufficient modification of the DNA break by alternative NHEJ prevents further Cas9 cutting, generating a heterogeneous population of insertions and deletions typical of gene knockouts. Here, we develop a strategy to escape this cycle and bias events toward defined length deletions by creating an RNA-guided dual active site nuclease that generates two noncompatible DNA breaks at a target site, effectively deleting the majority of the target site such that it cannot be regenerated. The TevCas9 nuclease, a fusion of the I-TevI nuclease domain to Cas9, functions robustly in HEK293 cells and generates 33- to 36-bp deletions at frequencies up to 40%. Deep sequencing revealed minimal processing of TevCas9 products, consistent with protection of the DNA ends from exonucleolytic degradation and repair by the c-NHEJ pathway. Directed evolution experiments identified I-TevI variants with broadened targeting range, making TevCas9 an easy-to-use reagent. Our results highlight how the sequence-tolerant cleavage properties of the I-TevI homing endonuclease can be harnessed to enhance Cas9 applications, circumventing the cleavage and ligation cycle and biasing genome-editing events toward defined length deletions.
Sather, Blythe D; Romano Ibarra, Guillermo S; Sommer, Karen; Curinga, Gabrielle; Hale, Malika; Khan, Iram F; Singh, Swati; Song, Yumei; Gwiazda, Kamila; Sahni, Jaya; Jarjour, Jordan; Astrakhan, Alexander; Wagner, Thor A; Scharenberg, Andrew M; Rawlings, David J
2015-09-30
Genetic mutations or engineered nucleases that disrupt the HIV co-receptor CCR5 block HIV infection of CD4(+) T cells. These findings have motivated the engineering of CCR5-specific nucleases for application as HIV therapies. The efficacy of this approach relies on efficient biallelic disruption of CCR5, and the ability to efficiently target sequences that confer HIV resistance to the CCR5 locus has the potential to further improve clinical outcomes. We used RNA-based nuclease expression paired with adeno-associated virus (AAV)-mediated delivery of a CCR5-targeting donor template to achieve highly efficient targeted recombination in primary human T cells. This method consistently achieved 8 to 60% rates of homology-directed recombination into the CCR5 locus in T cells, with over 80% of cells modified with an MND-GFP expression cassette exhibiting biallelic modification. MND-GFP-modified T cells maintained a diverse repertoire and engrafted in immune-deficient mice as efficiently as unmodified cells. Using this method, we integrated sequences coding chimeric antigen receptors (CARs) into the CCR5 locus, and the resulting targeted CAR T cells exhibited antitumor or anti-HIV activity. Alternatively, we introduced the C46 HIV fusion inhibitor, generating T cell populations with high rates of biallelic CCR5 disruption paired with potential protection from HIV with CXCR4 co-receptor tropism. Finally, this protocol was applied to adult human mobilized CD34(+) cells, resulting in 15 to 20% homologous gene targeting. Our results demonstrate that high-efficiency targeted integration is feasible in primary human hematopoietic cells and highlight the potential of gene editing to engineer T cell products with myriad functional properties. Copyright © 2015, American Association for the Advancement of Science.
Design of ligand-targeted nanoparticles for enhanced cancer targeting
NASA Astrophysics Data System (ADS)
Stefanick, Jared F.
Ligand-targeted nanoparticles are increasingly used as drug delivery vehicles for cancer therapy, yet have not consistently produced successful clinical outcomes. Although these inconsistencies may arise from differences in disease models and target receptors, nanoparticle design parameters can significantly influence therapeutic efficacy. By employing a multifaceted synthetic strategy to prepare peptide-targeted nanoparticles with high purity, reproducibility, and precisely controlled stoichiometry of functionalities, this work evaluates the roles of polyethylene glycol (PEG) coating, ethylene glycol (EG) peptide-linker length, peptide hydrophilicity, peptide density, and nanoparticle size on tumor targeting in a systematic manner. These parameters were analyzed in multiple disease models by targeting human epidermal growth factor receptor 2 (HER2) in breast cancer and very late antigen-4 (VLA-4) in multiple myeloma to demonstrate the widespread applicability of this approach. By increasing the hydrophilicity of the targeting peptide sequence and simultaneously optimizing the EG peptide-linker length, the in vitro cellular uptake of targeted liposomes was significantly enhanced. Specifically, including a short oligolysine chain adjacent to the targeting peptide sequence effectively increased cellular uptake ~80-fold using an EG6 peptide-linker compared to ~10-fold using an EG45 linker. In vivo, targeted liposomes prepared in a traditional manner lacking the oligolysine chain demonstrated similar biodistribution and tumor uptake to non-targeted liposomes. However, by including the oligolysine chain, targeted liposomes using an EG45 linker significantly improved tumor uptake ~8-fold over non-targeted liposomes, while the use of an EG6 linker decreased tumor accumulation and uptake, owing to differences in cellular uptake kinetics, clearance mechanisms, and binding site barrier effects. 
To further improve tumor targeting and enhance the selectivity of targeted nanoparticles, a dual-receptor targeted approach was evaluated by targeting multiple cell surface receptors simultaneously. Liposomes functionalized with two distinct peptide antagonists to target VLA-4 and Leukocyte Peyer's Patch Adhesion Molecule-1 (LPAM-1) demonstrated synergistically enhanced cellular uptake by cells overexpressing both target receptors and negligible uptake by cells that do not simultaneously express both receptors, providing a strategy to improve selectivity over conventional single receptor-targeted designs. Taken together, this process of systematic optimization of well-defined nanoparticle drug delivery systems has the potential to improve cancer therapy for a broader patient population.
Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform.
Schirmer, Melanie; Ijaz, Umer Z; D'Amore, Rosalinda; Hall, Neil; Sloan, William T; Quince, Christopher
2015-03-31
With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs, Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore, we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification of tissue-specific targeting peptide
NASA Astrophysics Data System (ADS)
Jung, Eunkyoung; Lee, Nam Kyung; Kang, Sang-Kee; Choi, Seung-Hoon; Kim, Daejin; Park, Kisoo; Choi, Kihang; Choi, Yun-Jaie; Jung, Dong Hyun
2012-11-01
Using phage display technique, we identified tissue-targeting peptide sets that recognize specific tissues (bone-marrow dendritic cell, kidney, liver, lung, spleen and visceral adipose tissue). In order to rapidly evaluate tissue-specific targeting peptides, we performed machine learning studies for predicting the tissue-specific targeting activity of peptides on the basis of peptide sequence information using four machine learning models and isolated the groups of peptides capable of mediating selective targeting to specific tissues. As a representative liver-specific targeting sequence, the peptide "DKNLQLH" was selected by the sequence similarity analysis. This peptide has a high degree of homology with protein ligands which can interact with corresponding membrane counterparts. We anticipate that our models will be applicable to the prediction of tissue-specific targeting peptides which can recognize the endothelial markers of target tissues.
A long-term target detection approach in infrared image sequence
NASA Astrophysics Data System (ADS)
Li, Hang; Zhang, Qi; Wang, Xin; Hu, Chao
2016-10-01
An automatic target detection method for long-term infrared (IR) image sequences from a moving platform is proposed. First, based on POME (the principle of maximum entropy), target candidates are iteratively segmented. Then the real target is captured via two different selection approaches. At the beginning of the image sequence, the genuine target with little texture is discriminated from other candidates by using a contrast-based confidence measure. On the other hand, when the target becomes larger, we apply an online EM method to estimate and update the distributions of the target's size and position based on the prior detection results, and then recognize the genuine one, which satisfies both the size and position constraints. Experimental results demonstrate that the presented method is accurate, robust and efficient.
NASA Astrophysics Data System (ADS)
Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.
2017-07-01
A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using a permutation of the four DNA base codes: adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now become highly advanced, high-throughput technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. Moreover, this scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In the implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. As a result, we found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
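The idea of scoring probability vectors with a dot product inside a Needleman-Wunsch alignment can be sketched as follows. This is a minimal illustration in Python rather than the authors' Octave implementation. The match, mismatch and gap values, the function names, and the way the dot product is turned into a score (as an expected match/mismatch score) are all assumptions made for the example.

```python
BASES = "ATGC"  # component order follows Q = (PA, PT, PG, PC)


def one_hot(base):
    """Unambiguous base as a probability vector, e.g. 'A' -> (1, 0, 0, 0)."""
    return tuple(1.0 if b == base else 0.0 for b in BASES)


def dot(p, q):
    """Dot product of two base-probability vectors.

    If the two positions are treated as independent, this is the
    probability that they carry the same base.
    """
    return sum(a * b for a, b in zip(p, q))


def pair_score(p, q, match=1.0, mismatch=-1.0):
    """Expected score of aligning two positions (assumed scoring rule)."""
    same = dot(p, q)
    return match * same + mismatch * (1.0 - same)


def needleman_wunsch(seq_a, seq_b, gap=-1.0):
    """Global alignment score of two sequences of probability vectors."""
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # leading gaps in seq_b
    for j in range(1, m + 1):
        score[0][j] = j * gap  # leading gaps in seq_a
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + pair_score(seq_a[i - 1], seq_b[j - 1])
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)
    return score[n][m]


# Subject "AGC" against a target whose middle base is ambiguous (G or C).
subject = [one_hot(b) for b in "AGC"]
target = [one_hot("A"), (0.0, 0.0, 0.5, 0.5), one_hot("C")]
print(needleman_wunsch(subject, target))
```

With hard base calls the ambiguous position would have to be scored as either a full match or a full mismatch. With probability vectors it contributes an intermediate score, which is the practical benefit the abstract attributes to the Quaternion representation.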
The rate and character of spontaneous mutation in an RNA virus.
Malpica, José M; Fraile, Aurora; Moreno, Ignacio; Obies, Clara I; Drake, John W; García-Arenal, Fernando
2002-01-01
Estimates of spontaneous mutation rates for RNA viruses are few and uncertain, most notably due to their dependence on tiny mutation reporter sequences that may not well represent the whole genome. We report here an estimate of the spontaneous mutation rate of tobacco mosaic virus using an 804-base cognate mutational target, the viral MP gene that encodes the movement protein (MP). Selection against newly arising mutants was countered by providing MP function from a transgene. The estimated genomic mutation rate was on the lower side of the range previously estimated for lytic animal riboviruses. We also present the first unbiased riboviral mutational spectrum. The proportion of base substitutions is the same as that in a retrovirus but is lower than that in most DNA-based organisms. Although the MP mutant frequency was 0.02-0.05, 35% of the sequenced mutants contained two or more mutations. Therefore, the mutation process in populations of TMV and perhaps of riboviruses generally differs profoundly from that in populations of DNA-based microbes and may be strongly influenced by a subpopulation of mutator polymerases. PMID:12524327
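The contrast between the reported mutant frequency (0.02-0.05) and the 35% of mutants carrying two or more mutations can be made concrete with a simple null calculation. The sketch below assumes that mutations hit the target independently at one uniform rate, so that the number of mutations per genome is Poisson distributed. That assumption and the function name are introduced here for illustration and are not taken from the paper's own analysis.

```python
import math


def multi_hit_fraction_poisson(mutant_frequency):
    """Fraction of mutants expected to carry >= 2 mutations under a Poisson null.

    Assumes mutant_frequency = P(at least one mutation) = 1 - exp(-lam).
    """
    lam = -math.log(1.0 - mutant_frequency)   # mean mutations per target
    p_exactly_one = lam * math.exp(-lam)      # P(exactly one mutation)
    p_two_or_more = mutant_frequency - p_exactly_one
    return p_two_or_more / mutant_frequency   # condition on being a mutant


for f in (0.02, 0.05):
    print(f, round(multi_hit_fraction_poisson(f), 4))
```

Under this null, only a few percent of mutants would be expected to carry multiple mutations across the reported frequency range. The observed 35% is far above that, which is consistent with the abstract's suggestion that a subpopulation of mutator polymerases contributes to the mutation process.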
TARGETED CAPTURE IN EVOLUTIONARY AND ECOLOGICAL GENOMICS
Jones, Matthew R.; Good, Jeffrey M.
2016-01-01
The rapid expansion of next-generation sequencing has yielded a powerful array of tools to address fundamental biological questions at a scale that was inconceivable just a few years ago. Various genome partitioning strategies to sequence select subsets of the genome have emerged as powerful alternatives to whole genome sequencing in ecological and evolutionary genomic studies. High throughput targeted capture is one such strategy that involves the parallel enrichment of pre-selected genomic regions of interest. The growing use of targeted capture demonstrates its potential power to address a range of research questions, yet these approaches have yet to expand broadly across labs focused on evolutionary and ecological genomics. In part, the use of targeted capture has been hindered by the logistics of capture design and implementation in species without established reference genomes. Here we aim to 1) increase the accessibility of targeted capture to researchers working in non-model taxa by discussing capture methods that circumvent the need of a reference genome, 2) highlight the evolutionary and ecological applications where this approach is emerging as a powerful sequencing strategy, and 3) discuss the future of targeted capture and other genome partitioning approaches in light of the increasing accessibility of whole genome sequencing. Given the practical advantages and increasing feasibility of high-throughput targeted capture, we anticipate an ongoing expansion of capture-based approaches in evolutionary and ecological research, synergistic with an expansion of whole genome sequencing. PMID:26137993
Profiles of Brain Metastases: Prioritization of Therapeutic Targets.
Ferguson, Sherise D; Zheng, Siyuan; Xiu, Joanne; Zhou, Shouhao; Khasraw, Mustafa; Brastianos, Priscilla K; Kesari, Santosh; Hu, Jethro; Rudnick, Jeremy; Salacz, Michael E; Piccioni, David; Huang, Suyun; Davies, Michael A; Glitza, Isabella C; Heymach, John V; Zhang, Jianjun; Ibrahim, Nuhad K; DeGroot, John F; McCarty, Joseph; O'Brien, Barbara J; Sawaya, Raymond; Verhaak, Roeland G W; Reddy, Sandeep K; Priebe, Waldemar; Gatalica, Zoran; Spetzler, David; Heimberger, Amy B
2018-06-19
We sought to compare the tumor profiles of brain metastases from common cancers with those of primary tumors and extracranial metastases in order to identify potential targets and prioritize rational treatment strategies. Tumor samples were collected from both the primary and metastatic sites of non-small cell lung cancer, breast cancer, and melanoma from patients in locations worldwide, and these were submitted to Caris Life Sciences for tumor multiplatform analysis, including gene sequencing (Sanger and next-generation sequencing with a targeted 47-gene panel), protein expression (assayed by immunohistochemistry), and gene amplification (assayed by in situ hybridization). The data analysis considered differential protein expression, gene amplification, and mutations among brain metastases, extracranial metastases, and primary tumors. The analyzed population included: 16,999 unmatched primary tumor and/or metastasis samples: 8178 non-small cell lung cancers (5098 primaries; 2787 systemic metastases; 293 brain metastases), 7064 breast cancers (3496 primaries; 3469 systemic metastases; 99 brain metastases), and 1757 melanomas (660 primaries; 996 systemic metastases; 101 brain metastases). TOP2A expression was increased in brain metastases from all 3 cancers, and brain metastases overexpressed multiple proteins clustering around functions critical to DNA synthesis and repair and implicated in chemotherapy resistance, including RRM1, TS, ERCC1, and TOPO1. cMET was overexpressed in melanoma brain metastases relative to primary skin specimens. Brain metastasis patients may particularly benefit from therapeutic targeting of enzymes associated with DNA synthesis, replication, and/or repair. This article is protected by copyright. All rights reserved. © 2018 UICC.
Mobegi, Victor A; Duffy, Craig W; Amambua-Ngwa, Alfred; Loua, Kovana M; Laman, Eugene; Nwakanma, Davis C; MacInnis, Bronwyn; Aspeling-Jones, Harvey; Murray, Lee; Clark, Taane G; Kwiatkowski, Dominic P; Conway, David J
2014-06-01
Locally varying selection on pathogens may be due to differences in drug pressure, host immunity, transmission opportunities between hosts, or the intensity of between-genotype competition within hosts. Highly recombining populations of the human malaria parasite Plasmodium falciparum throughout West Africa are closely related, as gene flow is relatively unrestricted in this endemic region, but markedly varying ecology and transmission intensity should cause distinct local selective pressures. Genome-wide analysis of sequence variation was undertaken on a sample of 100 P. falciparum clinical isolates from a highly endemic region of the Republic of Guinea where transmission occurs for most of each year and compared with data from 52 clinical isolates from a previously sampled population from The Gambia, where there is relatively limited seasonal malaria transmission. Paired-end short-read sequences were mapped against the 3D7 P. falciparum reference genome sequence, and data on 136,144 single nucleotide polymorphisms (SNPs) were obtained. Within-population analyses identifying loci showing evidence of recent positive directional selection and balancing selection confirm that antimalarial drugs and host immunity have been major selective agents. Many of the signatures of recent directional selection reflected by standardized integrated haplotype scores were population specific, including differences at drug resistance loci due to historically different antimalarial use between the countries. In contrast, both populations showed a similar set of loci likely to be under balancing selection as indicated by very high Tajima's D values, including a significant overrepresentation of genes expressed at the merozoite stage that invades erythrocytes and several previously validated targets of acquired immunity. 
Between-population FST analysis identified exceptional differentiation of allele frequencies at a small number of loci, most markedly for five SNPs covering a 15-kb region within and flanking the gdv1 gene that regulates the early stages of gametocyte development, which is likely related to the extreme differences in mosquito vector abundance and seasonality that determine the transmission opportunities for the sexual stage of the parasite. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
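The between-population FST scan described above can be illustrated with a minimal sketch. The abstract does not state which FST estimator was used, so the function below assumes the simple heterozygosity-based definition, FST = (H_T - H_S) / H_T, for one biallelic SNP in two equally weighted populations. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
def fst_biallelic(p1, p2):
    """F_ST for one biallelic SNP from the allele frequency in each of two
    populations (assumed equally weighted; hypothetical helper)."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0  # site is monomorphic overall, so no differentiation
    # mean expected heterozygosity within the two populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Illustrative frequencies only
    print(fst_biallelic(0.1, 0.9))
```

Under this definition, identical frequencies give 0 and fixed differences give 1, so SNPs with exceptional differentiation stand out as values near the top of the genome-wide distribution.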
An evolution based biosensor receptor DNA sequence generation algorithm.
Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng
2010-01-01
A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier
2016-01-04
The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Jakupciak, John P.; Wells, Jeffrey M.; Karalus, Richard J.; Pawlowski, David R.; Lin, Jeffrey S.; Feldman, Andrew B.
2013-01-01
Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders. Accurate characterization of metagenomic samples is dependent on accurate measurements of genetic variation between isolates with resolution down to strain level. Often single biomarker sensitivity is augmented by use of multiple or panels of biomarkers. In parallel with single biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate use of direct sequencing, as well as differences caused by sample preparation protocols including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness; and also to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations. PMID:24455204
Ashman, Tia-Lynn; Tennessen, Jacob A.; Dalton, Rebecca M.; Govindarajulu, Rajanikanth; Koski, Matthew H.; Liston, Aaron
2015-01-01
Gynodioecy, the coexistence of females and hermaphrodites, occurs in 20% of angiosperm families and often enables transitions between hermaphroditism and dioecy. Clarifying mechanisms of sex determination in gynodioecious species can thus illuminate sexual system evolution. Genetic determination of gynodioecy, however, can be complex and is not fully characterized in any wild species. We used targeted sequence capture to genetically map a novel nuclear contributor to male sterility in a self-pollinated hermaphrodite of Fragaria vesca subsp. bracteata from the southern portion of its range. To understand its interaction with another identified locus and possibly additional loci, we performed crosses within and between two populations separated by 2000 km, phenotyped the progeny and sequenced candidate markers at both sex-determining loci. The newly mapped locus contains a high density of pentatricopeptide repeat genes, a class commonly involved in restoration of fertility caused by cytoplasmic male sterility. Examination of all crosses revealed three unlinked epistatically interacting loci that determine sexual phenotype and vary in frequency between populations. Fragaria vesca subsp. bracteata represents the first wild gynodioecious species with genomic evidence of both cytoplasmic and nuclear genes in sex determination. We propose a model for the interactions between these loci and new hypotheses for the evolution of sex determining chromosomes in the subdioecious and dioecious Fragaria. PMID:26483011
Vierheilig, J.; Savio, D.; Ley, R. E.; Mach, R. L.; Farnleitner, A. H.
2016-01-01
The applicability of next generation DNA sequencing (NGS) methods for water quality assessment has so far not been broadly investigated. This study set out to evaluate the potential of an NGS-based approach in a complex catchment with importance for drinking water abstraction. In this multicompartment investigation, total bacterial communities in water, faeces, soil, and sediment samples were investigated by 454 pyrosequencing of bacterial 16S rRNA gene amplicons to assess the capabilities of this NGS method for (i) the development and evaluation of environmental molecular diagnostics, (ii) direct screening of the bulk bacterial communities, and (iii) the detection of faecal pollution in water. Results indicate that NGS methods can highlight potential target populations for diagnostics and will prove useful for the evaluation of existing and the development of novel DNA-based detection methods in the field of water microbiology. The used approach allowed unveiling of dominant bacterial populations but failed to detect populations with low abundances such as faecal indicators in surface waters. In combination with metadata, NGS data will also allow the identification of drivers of bacterial community composition during water treatment and distribution, highlighting the power of this approach for monitoring of bacterial regrowth and contamination in technical systems. PMID:26606090
Pembleton, Luke W; Inch, Courtney; Baillie, Rebecca C; Drayton, Michelle C; Thakur, Preeti; Ogaji, Yvonne O; Spangenberg, German C; Forster, John W; Daetwyler, Hans D; Cogan, Noel O I
2018-06-02
Exploitation of data from a ryegrass breeding program has enabled rapid development and implementation of genomic selection for sward-based biomass yield with a twofold-to-threefold increase in genetic gain. Genomic selection, which uses genome-wide sequence polymorphism data and quantitative genetics techniques to predict plant performance, has large potential for the improvement of pasture plants. Major factors influencing the accuracy of genomic selection include the size of reference populations, trait heritability values and the genetic diversity of breeding populations. Global diversity of the important forage species perennial ryegrass is high and so would require a large reference population in order to achieve moderate accuracies of genomic selection. However, diversity of germplasm within a breeding program is likely to be lower. In addition, de novo construction and characterisation of reference populations is a logistically complex process. Consequently, historical phenotypic records for seasonal biomass yield and heading date over an 18-year period within a commercial perennial ryegrass breeding program have been accessed, and target populations have been characterised with a high-density transcriptome-based genotyping-by-sequencing assay. Ability to predict observed phenotypic performance in each successive year was assessed by using all synthetic populations from previous years as a reference population. Moderate and high accuracies were achieved for the two traits, respectively, consistent with broad-sense heritability values. The present study represents the first demonstration and validation of genomic selection for seasonal biomass yield within a diverse commercial breeding program across multiple years. 
These results, supported by previous simulation studies, demonstrate the ability to predict sward-based phenotypic performance early in the process of individual plant selection, so shortening the breeding cycle, increasing the rate of genetic gain and allowing rapid adoption in ryegrass improvement programs.
A multiplex primer design algorithm for target amplification of continuous genomic regions.
Ozturk, Ahmet Rasit; Can, Tolga
2017-06-19
Targeted Next Generation Sequencing (NGS) assays are cost-efficient and reliable alternatives to Sanger sequencing. For sequencing of very large sets of genes, the target enrichment approach is suitable. However, for smaller genomic regions, the target amplification method is more efficient than both the target enrichment method and Sanger sequencing. The major difficulty of the target amplification method is the preparation of amplicons, in terms of the required time, equipment, and labor. Multiplex PCR (MPCR) is a good solution for these problems. We propose a novel method to design MPCR primers for a continuous genomic region, following the best practices of clinically reliable PCR design processes. On an experimental setup with 48 different combinations of factors, we have shown that multiple parameters might affect finding the first feasible solution. Increasing the length of the initial primer candidate selection sequence gives better results whereas waiting for a longer time to find the first feasible solution does not have a significant impact. We generated MPCR primer designs for the HBB whole gene, MEFV coding regions, and human exons between 2000 bp and 2100 bp long. Our benchmarking experiments show that the proposed MPCR approach is able to produce reliable NGS assay primers for a given sequence in a reasonable amount of time.
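Covering a continuous genomic region with amplicons can be pictured with a small sketch. This is not the authors' primer design algorithm, which the abstract does not describe in detail. It only shows the simpler, assumed step of tiling a region with fixed-length, overlapping amplicon windows, inside which primer candidates could then be searched. The function name, half-open coordinates, and parameter values are all hypothetical.

```python
def tile_amplicons(region_start, region_end, amplicon_len, overlap):
    """Return half-open (start, end) windows that cover
    [region_start, region_end) with the requested overlap between
    neighbouring windows. Hypothetical helper, not the published method."""
    if region_end <= region_start:
        raise ValueError("region must have positive length")
    if amplicon_len <= 0 or not 0 <= overlap < amplicon_len:
        raise ValueError("need amplicon_len > 0 and 0 <= overlap < amplicon_len")
    step = amplicon_len - overlap
    tiles = []
    start = region_start
    while True:
        end = min(start + amplicon_len, region_end)  # clip the last window
        tiles.append((start, end))
        if end >= region_end:
            break
        start += step
    return tiles


if __name__ == "__main__":
    # Illustrative values only: a 500 bp region, 200 bp amplicons, 50 bp overlap
    for tile in tile_amplicons(0, 500, 200, 50):
        print(tile)
```

The overlap guarantees that no base between neighbouring windows is left uncovered. A real design would additionally score primer pairs for melting temperature, specificity, and dimer formation.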
Kamatuka, Kenta; Hattori, Masahiro; Sugiyama, Tomoyasu
2016-12-01
RNA interference (RNAi) screening is extensively used in the field of reverse genetics. RNAi libraries constructed using random oligonucleotides have made this technology affordable. However, the new methodology requires exploration of the RNAi target gene information after screening because the RNAi library includes non-natural sequences that are not found in genes. Here, we developed a web-based tool to support RNAi screening. The system performs short hairpin RNA (shRNA) target prediction that is informed by comprehensive enquiry (SPICE). SPICE automates several tasks that are laborious but indispensable to evaluate the shRNAs obtained by RNAi screening. SPICE has four main functions: (i) sequence identification of shRNA in the input sequence (the sequence might be obtained by sequencing clones in the RNAi library), (ii) searching the target genes in the database, (iii) demonstrating biological information obtained from the database, and (iv) preparation of search result files that can be utilized in a local personal computer (PC). Using this system, we demonstrated that genes targeted by random oligonucleotide-derived shRNAs were not different from those targeted by organism-specific shRNA. The system facilitates RNAi screening, which requires sequence analysis after screening. The SPICE web application is available at http://www.spice.sugysun.org/.
RNase H-assisted RNA-primed rolling circle amplification for targeted RNA sequence detection.
Takahashi, Hirokazu; Ohkawachi, Masahiko; Horio, Kyohei; Kobori, Toshiro; Aki, Tsunehiro; Matsumura, Yukihiko; Nakashimada, Yutaka; Okamura, Yoshiko
2018-05-17
RNA-primed rolling circle amplification (RPRCA) is a useful laboratory method for RNA detection; however, the detection of RNA is limited by the lack of information on 3'-terminal sequences. We found that conventional RPRCA using pre-circularized probes could potentially detect the internal sequence of target RNA molecules in combination with RNase H. However, the specificity for mRNA detection was low, presumably due to non-specific hybridization of non-target RNA with the circular probe. To overcome this technical problem, we developed a method for detecting a sequence of interest in target RNA molecules via RNase H-assisted RPRCA using padlock probes. When padlock probes are hybridized to the target RNA molecule, they are converted to the circular form by SplintR ligase. Subsequently, RNase H creates nick sites only in the hybridized RNA sequence, and single-stranded DNA is finally synthesized from the nick site by phi29 DNA polymerase. This method could specifically detect at least 10 fmol of the target RNA molecule without reverse transcription. Moreover, this method detected GFP mRNA present in 10 ng of total RNA isolated from Escherichia coli without background DNA amplification. Therefore, this method can potentially detect almost all types of RNA molecules without reverse transcription and reveal full-length sequence information.
Composition for nucleic acid sequencing
Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY
2008-08-26
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules
Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu
2006-06-06
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules
Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu
2006-05-30
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Lu, Chaoxia; Wu, Wei; Xiao, Jifang; Meng, Yan; Zhang, Shuyang; Zhang, Xue
2013-06-01
To detect pathogenic mutations in Marfan syndrome (MFS) using an Ion Torrent Personal Genome Machine (PGM) and to validate the result of targeted next-generation semiconductor sequencing for the diagnosis of genetic disorders. Peripheral blood samples were collected from three MFS patients and a normal control with informed consent. Genomic DNA was isolated by a standard method and then subjected to targeted sequencing using an Ion Ampliseq(TM) Inherited Disease Panel. Three multiplex PCR reactions were carried out to amplify the coding exons of 328 genes including FBN1, TGFBR1 and TGFBR2. DNA fragments from different samples were ligated with barcoded sequencing adaptors. Template preparation, emulsion PCR, and Ion Sphere Particle enrichment were carried out using an Ion One Touch system. The Ion Sphere Particles were sequenced on a 318 chip using the PGM platform. Data from the PGM runs were processed using Ion Torrent Suite 3.2 software to generate sequence reads. After sequence alignment and extraction of SNPs and indels, all the variants were filtered against dbSNP137. DNA sequences were visualized with the Integrative Genomics Viewer. The most likely disease-causing variants were analyzed by Sanger sequencing. PGM sequencing yielded an output of 855.80 Mb, with a median sequencing depth of > 100 × and a coverage of > 98% for the targeted regions in all four samples. After data analysis and database filtering, one known missense mutation (p.E1811K) and two novel premature termination mutations (p.E2264X and p.L871FfsX23) in the FBN1 gene were identified in the three MFS patients. All mutations were verified by conventional Sanger sequencing. Pathogenic FBN1 mutations were identified in all three patients with MFS, indicating that targeted next-generation sequencing on the PGM sequencer can be applied for accurate and high-throughput testing of genetic disorders.
Halmillawewa, Anupama P; Restrepo-Córdoba, Marcela; Perry, Benjamin J; Yost, Christopher K; Hynes, Michael F
2016-02-01
Bacteriophages may play an important role in regulating population size and diversity of the root nodule symbiont Rhizobium leguminosarum, as well as participating in horizontal gene transfer. Although phages that infect this species have been isolated in the past, our knowledge of their molecular biology, and especially of genome composition, is extremely limited, and this lack of information impacts on the ability to assess phage population dynamics and limits potential agricultural applications of rhizobiophages. To help address this deficit in available sequence and biological information, the complete genome sequence of the Myoviridae temperate phage PPF1 that infects R. leguminosarum biovar viciae strain F1 was determined. The genome is 54,506 bp in length with an average G+C content of 61.9 %. The genome contains 94 putative open reading frames (ORFs) and 74.5 % of these predicted ORFs share homology at the protein level with previously reported sequences in the database. However, putative functions could only be assigned to 25.5 % (24 ORFs) of the predicted genes. PPF1 was capable of efficiently lysogenizing its rhizobial host R. leguminosarum F1. The site-specific recombination system of the phage targets an integration site that lies within a putative tRNA-Pro (CGG) gene in R. leguminosarum F1. Upon integration, the phage is capable of restoring the disrupted tRNA gene, owing to the 50 bp homologous sequence (att core region) it shares with its rhizobial host genome. Phage PPF1 is the first temperate phage infecting members of the genus Rhizobium for which a complete genome sequence, as well as other biological data such as the integration site, is available.
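The genome statistics quoted above are simple ratios, and a short sketch can make them concrete. The helper below computes G+C content as a percentage of unambiguous bases. It is a generic calculation, not the authors' pipeline, and the function name and example sequence are hypothetical. The last line re-derives the reported ORF percentage from the counts given in the abstract (24 of 94 ORFs).

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a
    DNA sequence. Ambiguity codes such as N are ignored. Hypothetical helper."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Illustrative sequence only, not from the PPF1 genome
    print(gc_content("ATGCGCGGTA"))
    # Share of ORFs with an assigned putative function: 24 of 94
    print(round(100.0 * 24 / 94, 1))
```

By the same arithmetic, the reported 74.5 % of 94 ORFs with database homologs corresponds to about 70 ORFs.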
Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G
2018-03-01
Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
Population genetic characterization of Cyclospora cayetanensis from discrete geographical regions.
Guo, Yaqiong; Li, Na; Ortega, Ynes R; Zhang, Longxian; Roellig, Dawn M; Feng, Yaoyu; Xiao, Lihua
2018-01-01
Cyclospora cayetanensis is an emerging pathogen that is endemic in developing countries and responsible for many large foodborne cyclosporiasis outbreaks in North America since the 1990s. Because of the lack of typing targets, the genetic diversity and population genetics of C. cayetanensis have not been investigated. In this study, we undertook a population genetic analysis of multilocus sequence typing data we recently collected from 64 C. cayetanensis specimens. Despite the extensive genetic heterogeneity in the overall C. cayetanensis population, there were significant intra- and inter-genic linkage disequilibria (LD). A disappearance of LD was observed when only multilocus genotypes were included in the population genetic analysis, indicative of an epidemic nature of C. cayetanensis. Geographical segregation-associated sub-structuring was observed between specimens from China and those from Peru and the United States. The two subpopulations had reduced LD, indicating the likely occurrence of genetic exchange among isolates in endemic areas. Further analyses of specimens from other geographical regions are necessary to fully understand the population genetics of C. cayetanensis. Copyright © 2017 Elsevier Inc. All rights reserved.
Genomic dissection of small RNAs in wild rice (Oryza rufipogon): lessons for rice domestication.
Wang, Yu; Bai, Xuefei; Yan, Chenghai; Gui, Yiejie; Wei, Xinghua; Zhu, Qian-Hao; Guo, Longbiao; Fan, Longjiang
2012-11-01
The lack of a MIRNA set and genome sequence of wild rice (Oryza rufipogon) has prevented us from determining the role of MIRNA genes in rice domestication. In this study, a genome, three small RNA populations and a degradome of O. rufipogon were sequenced on the Illumina platform and the expression levels of microRNAs (miRNAs) were investigated by miRNA chips. A de novo O. rufipogon genome was assembled using c. 55× coverage of raw sequencing data and a total of 387 MIRNAs were identified in the O. rufipogon genome based on c. 5.2 million unique small RNA reads from three different tissues of O. rufipogon. Of these O. rufipogon MIRNAs, 259 were not found in the cultivated rice, suggesting a loss of these MIRNAs in the cultivated rice. We also found that 48 MIRNAs were novel in the cultivated rice, suggesting that they were potential targets of domestication selection. Some miRNAs showed significant expression differences between wild and cultivated rice, suggesting that expression of miRNA could also be a target of domestication, as demonstrated for the miR164 family. Our results illustrated that MIRNA genes, like protein-coding genes, might have been significantly shaped during rice domestication and could be one of the driving forces that contributed to rice domestication. © 2012 The Authors. New Phytologist © 2012 New Phytologist Trust.
Fontanesi, Luca; Bertolini, Francesca; Scotti, Emilio; Schiavo, Giuseppina; Colombo, Michela; Trevisi, Paolo; Ribani, Anisa; Buttazzoni, Luca; Russo, Vincenzo; Dall'Olio, Stefania
2015-01-01
The GPR120 gene (also known as FFAR4 or O3FAR1) encodes a functional omega-3 fatty acid receptor/sensor that mediates potent insulin sensitizing effects by repressing macrophage-induced tissue inflammation. For its functional role, GPR120 could be considered a potential target gene in animal nutrigenetics. In this work we resequenced the porcine GPR120 gene by high throughput Ion Torrent semiconductor sequencing of amplified fragments obtained from 8 DNA pools derived, on the whole, from 153 pigs of different breeds/populations (two Italian Large White pools, Italian Duroc, Italian Landrace, Casertana, Pietrain, Meishan, and wild boars). Three single nucleotide polymorphisms (SNPs), two synonymous substitutions and one in the putative 3'-untranslated region (g.114765469C > T), were identified and their allele frequencies were estimated from sequencing read counts. The g.114765469C > T SNP was also genotyped by PCR-RFLP, confirming the estimated frequency in Italian Large White pools. Then, this SNP was analyzed in two Italian Large White cohorts using a selective genotyping approach based on extreme and divergent pigs for back fat thickness (BFT) estimated breeding value (EBV) and average daily gain (ADG) EBV. Significant differences in allele and genotype frequency distributions were observed between the extreme ADG-EBV groups (P < 0.001), whereas this marker was not associated with BFT-EBV.
McGhee, Gayle C.; Sundin, George W.
2012-01-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) comprise a family of short DNA repeat sequences that are separated by nonrepetitive spacer sequences and, in combination with a suite of Cas proteins, are thought to function as an adaptive immune system against invading DNA. The number of CRISPR arrays in a bacterial chromosome is variable, and the content of each array can differ in both repeat number and in the presence or absence of specific spacers. We utilized a comparative sequence analysis of CRISPR arrays of the plant pathogen Erwinia amylovora to uncover previously unknown genetic diversity in this species. A total of 85 E. amylovora strains varying in geographic isolation (North America, Europe, New Zealand, and the Middle East), host range, plasmid content, and streptomycin sensitivity/resistance were evaluated for CRISPR array number and spacer variability. From these strains, 588 unique spacers were identified in the three CRISPR arrays present in E. amylovora, and these arrays could be categorized into 20, 17, and 2 pattern types, respectively. Analysis of the relatedness of spacer content differentiated most apple and pear strains isolated in the eastern U.S. from western U.S. strains. In addition, we identified North American strains that shared CRISPR genotypes with strains isolated on other continents. E. amylovora strains from Rubus and Indian hawthorn contained mostly unique spacers compared to apple and pear strains, while strains from loquat shared 79% of spacers with apple and pear strains. Approximately 23% of the spacers matched known sequences, with 16% targeting plasmids and 5% targeting bacteriophage. The plasmid pEU30, isolated in E. amylovora strains from the western U.S., was targeted by 55 spacers. Lastly, we used spacer patterns and content to determine that streptomycin-resistant strains of E. amylovora from Michigan were low in diversity and matched corresponding streptomycin-sensitive strains from the background population. PMID:22860008
Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S
2014-11-01
Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B
1986-01-01
A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461
Arrays of probes for positional sequencing by hybridization
Cantor, Charles R [Boston, MA]; Prezetakiewiczr, Marek [East Boston, MA]; Smith, Cassandra L [Boston, MA]; Sano, Takeshi [Waltham, MA]
2008-01-15
This invention is directed to methods and reagents useful for sequencing nucleic acid targets utilizing sequencing by hybridization technology comprising probes, arrays of probes and methods whereby sequence information is obtained rapidly and efficiently in discrete packages. That information can be used for the detection, identification, purification and complete or partial sequencing of a particular target nucleic acid. When coupled with a ligation step, these methods can be performed under a single set of hybridization conditions. The invention also relates to the replication of probe arrays and methods for making and replicating arrays of probes which are useful for the large scale manufacture of diagnostic aids used to screen biological samples for specific target sequences. Arrays created using PCR technology may comprise probes with 5'- and/or 3'-overhangs.
Colacino, Justin A.; McDermott, Sean P.; Sartor, Maureen A.; Wicha, Max S.; Rozek, Laura S.
2017-01-01
Curcumin is a potential agent for both the prevention and treatment of cancers. Curcumin treatment alone, or in combination with piperine, limits breast stem cell self-renewal while remaining non-toxic to normal differentiated cells. We paired fluorescence activated cell sorting with RNA sequencing to characterize the genome-wide changes induced specifically in normal breast stem cells following treatment with these compounds. We generated genome-wide maps of the transcriptional changes that occur in epithelial-like (ALDH+) and mesenchymal-like (ALDH−/CD44+/CD24−) normal breast stem/progenitor cells following treatment with curcumin and piperine. We show that curcumin targets both stem cell populations by down-regulating expression of breast stem cell genes including ALDH1A3, CD49f, PROM1, and TP63. We also identified novel genes and pathways targeted by curcumin, including downregulation of SCD. Transient siRNA knockdown of SCD in MCF10A cells significantly inhibited mammosphere formation and the mean proportion of CD44+/CD24− cells, suggesting that SCD is a regulator of breast stemness and a target of curcumin in breast stem cells. These findings extend previous reports of curcumin targeting stem cells, here in two phenotypically distinct stem/progenitor populations isolated from normal human breast tissue. We identified novel mechanisms by which curcumin and piperine target breast stem cell self-renewal, such as by targeting lipid metabolism, providing a mechanistic link between curcumin treatment and stem cell self renewal. These results elucidate the mechanisms by which curcumin may act as a cancer preventive compound and provide novel targets for cancer prevention and treatment. PMID:27306423
Colacino, Justin A; McDermott, Sean P; Sartor, Maureen A; Wicha, Max S; Rozek, Laura S
2016-07-01
Curcumin is a potential agent for both the prevention and treatment of cancers. Curcumin treatment alone, or in combination with piperine, limits breast stem cell self-renewal, while remaining non-toxic to normal differentiated cells. We paired fluorescence-activated cell sorting with RNA sequencing to characterize the genome-wide changes induced specifically in normal breast stem cells following treatment with these compounds. We generated genome-wide maps of the transcriptional changes that occur in epithelial-like (ALDH+) and mesenchymal-like (ALDH-/CD44+/CD24-) normal breast stem/progenitor cells following treatment with curcumin and piperine. We show that curcumin targets both stem cell populations by down-regulating expression of breast stem cell genes including ALDH1A3, CD49f, PROM1, and TP63. We also identified novel genes and pathways targeted by curcumin, including downregulation of SCD. Transient siRNA knockdown of SCD in MCF10A cells significantly inhibited mammosphere formation and the mean proportion of CD44+/CD24- cells, suggesting that SCD is a regulator of breast stemness and a target of curcumin in breast stem cells. These findings extend previous reports of curcumin targeting stem cells, here in two phenotypically distinct stem/progenitor populations isolated from normal human breast tissue. We identified novel mechanisms by which curcumin and piperine target breast stem cell self-renewal, such as by targeting lipid metabolism, providing a mechanistic link between curcumin treatment and stem cell self-renewal. These results elucidate the mechanisms by which curcumin may act as a cancer-preventive compound and provide novel targets for cancer prevention and treatment.
Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun
2016-01-01
Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682
Phage display selection of peptides that target calcium-binding proteins.
Vetter, Stefan W
2013-01-01
Phage display allows rapid identification of peptide sequences with binding affinity towards target proteins, for example, calcium-binding proteins (CBPs). Phage technology allows screening of 10^9 or more independent peptide sequences and can identify CBP-binding peptides within 2 weeks. Adjusting the screening conditions allows selection of CBP-binding peptides that are either calcium-dependent or calcium-independent. The peptide sequences obtained can be used to identify CBP target proteins based on sequence homology or to quickly obtain peptide-based CBP inhibitors that modulate CBP-target interactions. The protocol described here uses a commercially available phage display library, in which random 12-mer peptides are displayed on filamentous M13 phages. The library was screened against the calcium-binding protein S100B.
Design of the hairpin ribozyme for targeting specific RNA sequences.
Hampel, A; DeYoung, M B; Galasinski, S; Siwkowski, A
1997-01-01
The following steps should be taken when designing the hairpin ribozyme to cleave a specific target sequence: 1. Select a target sequence containing BN*GUC where B is C, G, or U. 2. Select the target sequence in areas least likely to have extensive interfering structure. 3. Design the conventional hairpin ribozyme as shown in Fig. 1, such that it can form a 4 bp helix 2 and helix 1 lengths up to 10 bp. 4. Synthesize this ribozyme from single-stranded DNA templates with a double-stranded T7 promoter. 5. Prepare a series of short substrates capable of forming a range of helix 1 lengths of 5-10 bp. 6. Identify these by direct RNA sequencing. 7. Assay the extent of cleavage of each substrate to identify the optimal length of helix 1. 8. Prepare the hairpin tetraloop ribozyme to determine if catalytic efficiency can be improved.
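Step 1 of the design procedure above (finding BN*GUC sites) can be sketched in a few lines of Python. This is only an illustration: the function name `find_hairpin_sites` is hypothetical, and the sketch assumes that `*` marks the cleavage position between N and G. It does not assess the interfering secondary structure mentioned in step 2.

```python
import re

# B = C, G, or U; N = any base. The lookahead lets overlapping sites be reported.
# Assumption: "*" in BN*GUC marks the cleavage position between N and G.
SITE = re.compile(r"(?=([CGU][ACGU]GUC))")

def find_hairpin_sites(rna):
    """Return (start, motif, cut) for every BN*GUC site in the sequence.

    start is the 0-based index of B; cut is the 0-based index of the G
    that lies immediately 3' of the assumed cleavage position.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1), m.start() + 2) for m in SITE.finditer(rna)]

if __name__ == "__main__":
    print(find_hairpin_sites("AACUGUCAA"))
```

Candidate sites returned this way would still need to be screened for local structure and tested with substrates of different helix 1 lengths, as the remaining steps describe.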
You, Yanqin; Sun, Yan; Li, Xuchao; Li, Yali; Wei, Xiaoming; Chen, Fang; Ge, Huijuan; Lan, Zhangzhang; Zhu, Qian; Tang, Ying; Wang, Shujuan; Gao, Ya; Jiang, Fuman; Song, Jiaping; Shi, Quan; Zhu, Xuan; Mu, Feng; Dong, Wei; Gao, Vince; Jiang, Hui; Yi, Xin; Wang, Wei; Gao, Zhiying
2014-08-01
This article demonstrates a prominent noninvasive prenatal approach to assist the clinical diagnosis of a single-gene disorder, maple syrup urine disease, using targeted sequencing knowledge from the affected family. The method reported here combines novel mutant discovery in known genes by targeted massively parallel sequencing with noninvasive prenatal testing. By applying this new strategy, we successfully revealed novel mutations in the gene BCKDHA (Ex2_4dup and c.392A>G) in this Chinese family and developed a prenatal haplotype-assisted approach to noninvasively detect the genotype of the fetus (transmitted from both parents). This is the first report of integration of targeted sequencing and noninvasive prenatal testing into clinical practice. Our study has demonstrated that this massively parallel sequencing-based strategy can potentially be used for single-gene disorder diagnosis in the future.
Performance of a visuomotor walking task in an augmented reality training setting.
Haarman, Juliet A M; Choi, Julia T; Buurke, Jaap H; Rietman, Johan S; Reenalda, Jasper
2017-12-01
Visual cues can be used to train walking patterns. Here, we studied the performance and learning capacities of healthy subjects executing a high-precision visuomotor walking task, in an augmented reality training set-up. A beamer was used to project visual stepping targets on the walking surface of an instrumented treadmill. Two speeds were used to manipulate task difficulty. All participants (n = 20) had to change their step length to hit visual stepping targets with a specific part of their foot, while walking on a treadmill over seven consecutive training blocks, each block composed of 100 stepping targets. Distance between stepping targets was varied between short, medium and long steps. Training blocks could either be composed of random stepping targets (no fixed sequence was present in the distance between the stepping targets) or sequenced stepping targets (repeating fixed sequence was present). Random training blocks were used to measure non-specific learning and sequenced training blocks were used to measure sequence-specific learning. Primary outcome measures were performance (% of correct hits), and learning effects (increase in performance over the training blocks: both sequence-specific and non-specific). Secondary outcome measures were the performance and stepping-error in relation to the step length (distance between stepping target). Subjects were able to score 76% and 54% at first try for lower speed (2.3 km/h) and higher speed (3.3 km/h) trials, respectively. Performance scores did not increase over the course of the trials, nor did the subjects show the ability to learn a sequenced walking task. Subjects were better able to hit targets while increasing their step length, compared to shortening it. In conclusion, augmented reality training by use of the current set-up was intuitive for the user. Suboptimal feedback presentation might have limited the learning effects of the subjects. Copyright © 2017 Elsevier B.V. All rights reserved.
Sommer, J M; Nguyen, T T; Wang, C C
1994-08-15
Import of proteins into the glycosomes of T. brucei resembles the peroxisomal protein import in that C-terminal SKL-like tripeptide sequences can function as targeting signals. Many of the glycosomal proteins do not, however, possess such C-terminal tripeptide signals. Among these, phosphoenolpyruvate carboxykinase (PEPCK (ATP)) was thought to be targeted to the glycosomes by an N-terminal or an internal targeting signal. A limited similarity to the N-terminal targeting signal of rat peroxisomal thiolase exists at the N-terminus of T. brucei PEPCK. However, we found that this peroxisomal targeting signal does not function for glycosomal protein import in T. brucei. Further studies of the PEPCK gene revealed that the C-terminus of the predicted protein does not correspond to the previously deduced protein sequence of 472 amino acids due to a -1 frame shift error in the original DNA sequence. Readjusting the reading frame of the sequence results in a predicted protein of 525 amino acids in length ending in a tripeptide serine-arginine-leucine (SRL), which is a potential targeting signal for import into the glycosomes. A fusion protein of firefly luciferase, without its own C-terminal SKL targeting signal, and T. brucei PEPCK is efficiently imported into the glycosomes when expressed in procyclic trypanosomes. Deletion of the C-terminal SRL tripeptide or the last 29 amino acids of PEPCK reduced the import only by about 50%, while a deletion of the last 47 amino acids completely abolished the import. These results suggest that T. brucei PEPCK may contain a second, internal glycosomal targeting signal upstream of the C-terminal SRL sequence.
Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites
Prouse, Michael B.; Campbell, Malcolm M.
2013-01-01
Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators. PMID:23741471
Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A
2017-03-01
Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents in endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Blood samples collected from residents in filariasis-endemic areas in five governorates were subjected to semi-nested PCR targeting a repeated DNA sequence for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed the PCR results, identifying only W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. The study results demonstrated that the semi-nested PCR is an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable to samples collected in the daytime as well as at night. Sequencing of the PCR products and phylogenetic analysis revealed three different nucleotide sequence variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.
MPN estimation of qPCR target sequence recoveries from whole cell calibrator samples.
Sivaganesan, Mano; Siefring, Shawn; Varma, Manju; Haugland, Richard A
2011-12-01
DNA extracts from enumerated target organism cells (calibrator samples) have been used for estimating Enterococcus cell equivalent densities in surface waters by a comparative cycle threshold (Ct) qPCR analysis method. To compare surface water Enterococcus density estimates from different studies by this approach, either a consistent source of calibrator cells must be used or the estimates must account for any differences in target sequence recoveries from different sources of calibrator cells. In this report we describe two methods for estimating target sequence recoveries from whole cell calibrator samples based on qPCR analyses of their serially diluted DNA extracts and most probable number (MPN) calculation. The first method employed a traditional MPN calculation approach. The second method employed a Bayesian hierarchical statistical modeling approach and a Markov chain Monte Carlo (MCMC) simulation method to account for the uncertainty in these estimates associated with different individual samples of the cell preparations, different dilutions of the DNA extracts and different qPCR analytical runs. The two methods were applied to estimate mean target sequence recoveries per cell from two different lots of a commercially available source of enumerated Enterococcus cell preparations. The mean target sequence recovery estimates (and standard errors) per cell from Lot A and B cell preparations by the Bayesian method were 22.73 (3.4) and 11.76 (2.4), respectively, when the data were adjusted for potential false positive results. Means were similar for the traditional MPN approach, which cannot comparably assess uncertainty in the estimates. Cell numbers and estimates of recoverable target sequences in calibrator samples prepared from the two cell sources were also used to estimate cell equivalent and target sequence quantities recovered from surface water samples in a comparative Ct method. Our results illustrate the utility of the Bayesian method in accounting for uncertainty, the high degree of precision attainable by the MPN approach and the need to account for the differences in target sequence recoveries from different calibrator sample cell sources when they are used in the comparative Ct method. Published by Elsevier B.V.
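To make the idea of an MPN calculation from serially diluted extracts concrete, the following Python sketch computes a generic maximum-likelihood MPN under a Poisson model. It is not the authors' procedure, and it does not implement their Bayesian hierarchical model. The function name `mpn_estimate`, its arguments, and the search bounds are assumptions made for illustration.

```python
import math

def mpn_estimate(amounts, replicates, positives, lo=1e-9, hi=1e9, iters=200):
    """Maximum-likelihood MPN (targets per unit amount of extract).

    amounts[i]    : amount of undiluted extract in each reaction at dilution i
    replicates[i] : number of qPCR reactions run at dilution i
    positives[i]  : number of those reactions that were positive
    Solves sum_i p_i*v_i / (1 - exp(-lam*v_i)) = sum_i n_i*v_i for lam.
    The bounds lo and hi are assumed to bracket the solution.
    """
    if all(p == 0 for p in positives):
        return 0.0
    if all(p == n for p, n in zip(positives, replicates)):
        raise ValueError("all reactions positive: the MPN is unbounded")
    total = sum(n * v for n, v in zip(replicates, amounts))

    def score(lam):
        # Decreases monotonically in lam, so bisection is sufficient.
        return sum(p * v / -math.expm1(-lam * v)
                   for p, v in zip(positives, amounts) if p) - total

    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    # Hypothetical tenfold dilution series with 5 reactions per dilution.
    print(mpn_estimate([1.0, 0.1, 0.01], [5, 5, 5], [5, 3, 0]))
```

Dividing such an estimate by the enumerated cell count of the calibrator sample would give a recovery per cell, but a point estimate of this kind carries no measure of uncertainty, which is the limitation the Bayesian approach in the abstract addresses.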
Designing Peptide-Based HIV Vaccine for Chinese
Fan, Xiaojuan
2014-01-01
CD4+ T cells are central to the induction and maintenance of CD8+ T cell and antibody-producing B cell responses, and the latter are essential for the protection against disease in subjects with HIV infection. How to elicit HIV-specific CD4+ T cell responses in a given population using vaccines is one of the major areas of current HIV vaccine research. To design a vaccine that specifically targets the Chinese population, we assembled a database comprising sequences from 821 Chinese HIV isolates and 46 human leukocyte antigen (HLA) DR alleles identified in the Chinese population. We then predicted 20 potential HIV epitopes using bioinformatics approaches. The combination of these 20 epitopes has a theoretical coverage of 98.1% of the population for both the prevalent HIV genotypes and Chinese HLA-DR types. We suggest that testing this vaccine experimentally will facilitate the development of a CD4+ T cell vaccine tailored to the Chinese population. PMID:25136573
Designing peptide-based HIV vaccine for Chinese.
Shu, Jiayi; Fan, Xiaojuan; Ping, Jie; Jin, Xia; Hao, Pei
2014-01-01
CD4+ T cells are central to the induction and maintenance of CD8+ T cell and antibody-producing B cell responses, and the latter are essential for the protection against disease in subjects with HIV infection. How to elicit HIV-specific CD4+ T cell responses in a given population using vaccines is one of the major areas of current HIV vaccine research. To design a vaccine that specifically targets the Chinese population, we assembled a database comprising sequences from 821 Chinese HIV isolates and 46 human leukocyte antigen (HLA) DR alleles identified in the Chinese population. We then predicted 20 potential HIV epitopes using bioinformatics approaches. The combination of these 20 epitopes has a theoretical coverage of 98.1% of the population for both the prevalent HIV genotypes and Chinese HLA-DR types. We suggest that testing this vaccine experimentally will facilitate the development of a CD4+ T cell vaccine tailored to the Chinese population.
NASA Astrophysics Data System (ADS)
Saylor, Dicy; Lepine, Sebastien; Crossfield, Ian; Petigura, Erik A.
2018-01-01
The K2 mission is targeting large numbers of nearby (d < 100 pc) GKM dwarfs selected from the SUPERBLINK proper motion survey (μ > 40 mas yr⁻¹, V < 20). Additionally, the mission is targeting low-mass, high proper motion stars associated with the local (d < 500 pc) Galactic halo population also selected from SUPERBLINK. K2 campaigns 0 through 8 monitored a total of 26,518 of these cool main-sequence stars. We used the auto-correlation function to search for fast rotators by identifying short-period photometric modulations in the K2 light curves. We identified 481 candidate fast rotators with rotation periods <4 days that show light-curve modulations consistent with starspots. Their kinematics show low average transverse velocities, suggesting that they are part of the young disk population. A subset (13) of the fast rotators is found among those targets with colors and kinematics consistent with the local Galactic halo population and may represent stars spun up by tidal interactions in close binary systems. We further demonstrate that the M dwarf fast rotators selected from the K2 light curves are significantly more likely to have UV excess and discuss the potential of the K2 mission to identify new nearby young GKM dwarfs on the basis of their fast rotation rates. Finally, we discuss the possible use of local halo stars as fiducial, non-variable sources in the Kepler fields.
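The auto-correlation search described above can be illustrated with a small Python sketch. The abstract states only that the auto-correlation function was used; the peak-picking rule, the function names `autocorrelation` and `acf_period`, the evenly sampled input, and the synthetic light curve are all assumptions made here for illustration. The 4-day cap mirrors the "<4 days" criterion in the text.

```python
import math

def autocorrelation(flux, max_lag):
    """Normalized autocorrelation of an evenly sampled series for lags 0..max_lag."""
    n = len(flux)
    mean = sum(flux) / n
    x = [f - mean for f in flux]
    var = sum(v * v for v in x)
    if var == 0:
        raise ValueError("flux is constant")
    return [sum(x[t] * x[t + k] for t in range(n - k)) / var
            for k in range(max_lag + 1)]

def acf_period(flux, cadence_days, max_period_days=4.0):
    """Return the lag (in days) of the highest ACF local maximum, or None."""
    max_lag = min(len(flux) - 1, int(round(max_period_days / cadence_days)))
    acf = autocorrelation(flux, max_lag)
    peaks = [k for k in range(1, max_lag) if acf[k - 1] < acf[k] >= acf[k + 1]]
    if not peaks:
        return None
    return max(peaks, key=lambda k: acf[k]) * cadence_days

if __name__ == "__main__":
    cadence = 0.02  # days; an assumed sampling interval
    # Synthetic starspot-like modulation with a 2-day period (illustrative only).
    flux = [1.0 + 0.01 * math.sin(2 * math.pi * i * cadence / 2.0)
            for i in range(2000)]
    print(acf_period(flux, cadence))
```

Real light curves contain gaps, trends, and noise, so a practical search would need detrending and a significance threshold on the peak height, neither of which is attempted here.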
Bipartite recognition of target RNAs activates DNA cleavage by the Type III-B CRISPR–Cas system
Elmore, Joshua R.; Sheppard, Nolan F.; Ramia, Nancy; Deighan, Trace; Li, Hong; Terns, Rebecca M.; Terns, Michael P.
2016-01-01
CRISPR–Cas systems eliminate nucleic acid invaders in bacteria and archaea. The effector complex of the Type III-B Cmr system cleaves invader RNAs recognized by the CRISPR RNA (crRNA) of the complex. Here we show that invader RNAs also activate the Cmr complex to cleave DNA. As has been observed for other Type III systems, Cmr eliminates plasmid invaders in Pyrococcus furiosus by a mechanism that depends on transcription of the crRNA target sequence within the plasmid. Notably, we found that the target RNA per se induces DNA cleavage by the Cmr complex in vitro. DNA cleavage activity does not depend on cleavage of the target RNA but notably does require the presence of a short sequence adjacent to the target sequence within the activating target RNA (rPAM [RNA protospacer-adjacent motif]). The activated complex does not require a target sequence (or a PAM) in the DNA substrate. Plasmid elimination by the P. furiosus Cmr system also does not require the Csx1 (CRISPR-associated Rossmann fold [CARF] superfamily) protein. Plasmid silencing depends on the HD nuclease and Palm domains of the Cmr2 (Cas10 superfamily) protein. The results establish the Cmr complex as a novel DNA nuclease activated by invader RNAs containing a crRNA target sequence and a rPAM. PMID:26848045
Sequencing Needs for Viral Diagnostics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S N; Lam, M; Mulakken, N J
2004-01-26
We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
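The two selection criteria described above (conserved among target strains, unique relative to other species) can be sketched with exact k-mer matching. This is a simplified illustration under stated assumptions, not the pipeline the abstract refers to: the function names, the use of exact k-mers, and the toy sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def signature_candidates(targets, near_neighbors, k):
    """Candidate signatures: k-mers present in every target genome
    (conservation) and absent from every near-neighbor genome (uniqueness).
    Exact matching is an assumption of this sketch."""
    if not targets:
        raise ValueError("at least one target genome is required")
    conserved = set.intersection(*(kmers(s, k) for s in targets))
    background = set().union(*(kmers(s, k) for s in near_neighbors))
    return conserved - background

# Toy sequences, invented for illustration only.
targets = ["ACGTACGGA", "TTACGGACC"]
near_neighbors = ["ACGGT"]
print(sorted(signature_candidates(targets, near_neighbors, k=4)))
```

Adding target genomes can only shrink the conserved set, and adding near neighbors can only shrink the unique set, which mirrors why the abstract asks how many genomes of each kind are needed before the candidate list stabilises.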
Neuro-immune interactions at barrier surfaces
Veiga-Fernandes, Henrique; Mucida, Daniel
2016-01-01
Multidirectional interactions between the nervous and immune systems have been documented in homeostasis and pathologies ranging from multiple sclerosis to autism, and from leukemia to acute and chronic inflammation. Recent studies have addressed this crosstalk using cell-specific targeting, novel sequencing, imaging and analytical tools, shedding light on unappreciated mechanisms of neuro-immune regulation. This review focuses on neuro-immune interactions at barrier surfaces, mostly the gut, but also including the skin and the airways, areas densely populated by neurons and immune cells that constantly sense and adapt to tissue-specific environmental challenges. PMID:27153494
Personalized Cancer Medicine: Molecular Diagnostics, Predictive biomarkers, and Drug Resistance
Gonzalez de Castro, D; Clarke, P A; Al-Lazikani, B; Workman, P
2013-01-01
The progressive elucidation of the molecular pathogenesis of cancer has fueled the rational development of targeted drugs for patient populations stratified by genetic characteristics. Here we discuss general challenges relating to molecular diagnostics and describe predictive biomarkers for personalized cancer medicine. We also highlight resistance mechanisms for epidermal growth factor receptor (EGFR) kinase inhibitors in lung cancer. We envisage a future requiring the use of longitudinal genome sequencing and other omics technologies alongside combinatorial treatment to overcome cellular and molecular heterogeneity and prevent resistance caused by clonal evolution. PMID:23361103
DNA sequence similarity recognition by hybridization to short oligomers
Milosavljevic, Aleksandar
1999-01-01
Methods are disclosed for the comparison of nucleic acid sequences. Data is generated by hybridizing sets of oligomers with target nucleic acids. The data thus generated is manipulated simultaneously with respect to both (i) matching between oligomers and (ii) matching between oligomers and putative reference sequences available in databases. Using data compression methods to manipulate this mutual information, sequences for the target can be constructed.
Concerted evolution at the population level: pupfish HindIII satellite DNA sequences.
Elder, J F; Turner, B J
1994-01-01
The canonical monomers (approximately 170 bp) of an abundant (1.9 × 10^6 copies per diploid genome) satellite DNA sequence family in the genome of Cyprinodon variegatus, a "pupfish" that ranges along the Atlantic coast from Cape Cod to central Mexico, are divergent in base sequence in 10 of 12 samples collected from natural populations. The divergence involves substitutions, deletions, and insertions, is marked in scope (mean pairwise sequence similarity = 61.6%; range = 35-95.9%), is largely confined to the 3' half of the monomer, and is not correlated with the distance among collecting sites. Repetitive cloning and direct genomic sequencing experiments failed to detect intrapopulation and intraindividual variation, suggesting high levels of sequence homogeneity within populations. The satellite sequence has therefore undergone "concerted evolution," at the level of the local population. Concerted evolution has previously almost always been discussed in terms of the divergence of species or higher taxa; its intraspecific occurrence apparently has not been reported previously. The generality of the observation is difficult to evaluate, for although satellite DNAs from a large number of organisms have been studied in detail, there appear to be little or no other data on their sequence variation in natural populations. The relationship (if any) between concerted, population level, satellite DNA divergence and the extent of gene flow/genetic isolation among conspecific natural populations remains to be established. PMID:8302879
Demidov, German; Simakova, Tamara; Vnuchkova, Julia; Bragin, Anton
2016-10-22
Multiplex polymerase chain reaction (PCR) is a common enrichment technique for targeted massive parallel sequencing (MPS) protocols. MPS is widely used in biomedical research and clinical diagnostics as a fast and accurate tool for the detection of short genetic variations. However, identification of larger variations such as structural variants and copy number variations (CNVs) is still a challenge for targeted MPS. Some approaches and tools for structural variant detection have been proposed, but they have limitations and often require datasets of a certain type, size and expected number of amplicons affected by CNVs. In this paper, we describe a novel algorithm for high-resolution germline CNV detection in PCR-enriched targeted sequencing data and present an accompanying tool. We have developed a machine learning algorithm for the detection of large duplications and deletions in targeted sequencing data generated with a PCR-based enrichment step. We have performed verification studies and established the algorithm's sensitivity and specificity. We have compared the developed tool with other available methods applicable to the described data and found that it performs better. We showed that our method has high specificity and sensitivity for high-resolution copy number detection in targeted sequencing data using a large cohort of samples.
Uncovering the Repertoire of Endogenous Flaviviral Elements in Aedes Mosquito Genomes
Suzuki, Yasutsugu; Frangeul, Lionel; Dickson, Laura B.; Blanc, Hervé; Verdier, Yann; Vinh, Joelle
2017-01-01
ABSTRACT Endogenous viral elements derived from nonretroviral RNA viruses have been described in various animal genomes. Whether they have a biological function, such as host immune protection against related viruses, is a field of intense study. Here, we investigated the repertoire of endogenous flaviviral elements (EFVEs) in Aedes mosquitoes, the vectors of arboviruses such as dengue and chikungunya viruses. Previous studies identified three EFVEs from Aedes albopictus cell lines and one from Aedes aegypti cell lines. However, an in-depth characterization of EFVEs in wild-type mosquito populations and individual mosquitoes in vivo has not been performed. We detected the full-length DNA sequence of the previously described EFVEs and their respective transcripts in several A. albopictus and A. aegypti populations from geographically distinct areas. However, EFVE-derived proteins were not detected by mass spectrometry. Using deep sequencing, we detected the production of PIWI-interacting RNA-like small RNAs, in an antisense orientation, targeting the EFVEs and their flanking regions in vivo. The EFVEs were integrated in repetitive regions of the mosquito genomes, and their flanking sequences varied among mosquito populations. We bioinformatically predicted several new EFVEs from a Vietnamese A. albopictus population and observed variation in the occurrence of those elements among mosquitoes. Phylogenetic analysis of an A. aegypti EFVE suggested that it integrated prior to the global expansion of the species and subsequently diverged among and within populations. The findings of this study together reveal the substantial structural and nucleotide diversity of flaviviral integrations in Aedes genomes. Unraveling this diversity will help to elucidate the potential biological function of these EFVEs. IMPORTANCE Endogenous viral elements (EVEs) are whole or partial viral sequences integrated in host genomes. 
Interestingly, some EVEs have important functions for host fitness and antiviral defense. Because mosquitoes also have EVEs in their genomes, characterizing these EVEs is a prerequisite for their potential use to manipulate the mosquito antiviral response. In the study described here, we focused on EVEs related to the Flavivirus genus, to which dengue and Zika viruses belong, in individual Aedes mosquitoes from geographically distinct areas. We show the existence in vivo of flaviviral EVEs previously identified in mosquito cell lines, and we detected new ones. We show that EVEs have evolved differently in each mosquito population. They produce transcripts and small RNAs but not proteins, suggesting a function at the RNA level. Our study uncovers the diverse repertoire of flaviviral EVEs in Aedes mosquito populations and contributes to an understanding of their role in the host antiviral system. PMID:28539440
Uncovering the Repertoire of Endogenous Flaviviral Elements in Aedes Mosquito Genomes.
Suzuki, Yasutsugu; Frangeul, Lionel; Dickson, Laura B; Blanc, Hervé; Verdier, Yann; Vinh, Joelle; Lambrechts, Louis; Saleh, Maria-Carla
2017-08-01
Endogenous viral elements derived from nonretroviral RNA viruses have been described in various animal genomes. Whether they have a biological function, such as host immune protection against related viruses, is a field of intense study. Here, we investigated the repertoire of endogenous flaviviral elements (EFVEs) in Aedes mosquitoes, the vectors of arboviruses such as dengue and chikungunya viruses. Previous studies identified three EFVEs from Aedes albopictus cell lines and one from Aedes aegypti cell lines. However, an in-depth characterization of EFVEs in wild-type mosquito populations and individual mosquitoes in vivo has not been performed. We detected the full-length DNA sequence of the previously described EFVEs and their respective transcripts in several A. albopictus and A. aegypti populations from geographically distinct areas. However, EFVE-derived proteins were not detected by mass spectrometry. Using deep sequencing, we detected the production of PIWI-interacting RNA-like small RNAs, in an antisense orientation, targeting the EFVEs and their flanking regions in vivo. The EFVEs were integrated in repetitive regions of the mosquito genomes, and their flanking sequences varied among mosquito populations. We bioinformatically predicted several new EFVEs from a Vietnamese A. albopictus population and observed variation in the occurrence of those elements among mosquitoes. Phylogenetic analysis of an A. aegypti EFVE suggested that it integrated prior to the global expansion of the species and subsequently diverged among and within populations. The findings of this study together reveal the substantial structural and nucleotide diversity of flaviviral integrations in Aedes genomes. Unraveling this diversity will help to elucidate the potential biological function of these EFVEs. IMPORTANCE Endogenous viral elements (EVEs) are whole or partial viral sequences integrated in host genomes.
Interestingly, some EVEs have important functions for host fitness and antiviral defense. Because mosquitoes also have EVEs in their genomes, characterizing these EVEs is a prerequisite for their potential use to manipulate the mosquito antiviral response. In the study described here, we focused on EVEs related to the Flavivirus genus, to which dengue and Zika viruses belong, in individual Aedes mosquitoes from geographically distinct areas. We show the existence in vivo of flaviviral EVEs previously identified in mosquito cell lines, and we detected new ones. We show that EVEs have evolved differently in each mosquito population. They produce transcripts and small RNAs but not proteins, suggesting a function at the RNA level. Our study uncovers the diverse repertoire of flaviviral EVEs in Aedes mosquito populations and contributes to an understanding of their role in the host antiviral system. Copyright © 2017 Suzuki et al.
Brett, Maggie; McPherson, John; Zang, Zhi Jiang; Lai, Angeline; Tan, Ee-Shien; Ng, Ivy; Ong, Lai-Choo; Cham, Breana; Tan, Patrick; Rozen, Steve; Tan, Ene-Choo
2014-01-01
Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism. PMID:24690944
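The coverage figures quoted above (the share of targeted regions reaching at least 20× and the average read depth) are simple summaries of per-base depth. The sketch below shows how such numbers could be computed. It assumes per-base depths are already available as a list of integers; the function names and the example depths are hypothetical and are not taken from the study.

```python
def fraction_at_depth(depths, threshold=20):
    """Fraction of targeted bases covered by at least `threshold` reads."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= threshold) / len(depths)

def mean_depth(depths):
    """Average read depth across the targeted bases."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)

# Invented per-base depths for five targeted positions.
example = [0, 5, 20, 25, 400]
print(fraction_at_depth(example), mean_depth(example))
```

A high mean with a lower fraction at 20×, as in the abstract, indicates uneven coverage: a few very deep regions raise the average while some targets remain thinly covered.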
Artificial neural network study on organ-targeting peptides
NASA Astrophysics Data System (ADS)
Jung, Eunkyoung; Kim, Junhyoung; Choi, Seung-Hoon; Kim, Minkyoung; Rhee, Hokyoung; Shin, Jae-Min; Choi, Kihang; Kang, Sang-Kee; Lee, Nam Kyung; Choi, Yun-Jaie; Jung, Dong Hyun
2010-01-01
We report a new approach to studying organ targeting of peptides on the basis of peptide sequence information. The positive control data sets consist of organ-targeting peptide sequences identified by the peroral phage-display technique for four organs, and the negative control data are prepared from random sequences. The capacity of our models to make appropriate predictions is validated by statistical indicators including sensitivity, specificity, enrichment curve, and the area under the receiver operating characteristic (ROC) curve (the ROC score). VHSE descriptor produces statistically significant training models and the models with simple neural network architectures show slightly greater predictive power than those with complex ones. The training and test set statistics indicate that our models could discriminate between organ-targeting and random sequences. We anticipate that our models will be applicable to the selection of organ-targeting peptides for generating peptide drugs or peptidomimetics.
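The validation statistics named above (sensitivity, specificity and the area under the ROC curve) can be computed directly from classifier scores. The following is a minimal sketch, assuming binary labels (1 for organ-targeting, 0 for random) and real-valued model scores; the function names and the toy data are hypothetical. The ROC area is computed through its rank interpretation, the probability that a positive outscores a negative, with ties counted as one half.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive has the higher score."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy scores, invented for illustration.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(sensitivity_specificity(labels, scores, 0.5), roc_auc(labels, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas the ROC area summarises ranking quality across all thresholds, which is why abstracts such as this one report both.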
Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M
2016-05-01
To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Case series. A total of 562 patients diagnosed with IRD. We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Diagnostic yield of genomic testing. Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
Razmara, Ehsan; Bitarafan, Fatemeh; Esmaeilzadeh-Gharehdaghi, Elika; Almadani, Navid; Garshasbi, Masoud
2018-03-01
Targeted next-generation sequencing (NGS) provides a consequential opportunity to elucidate genetic factors in known diseases, particularly in profoundly heterogeneous disorders such as non-syndromic hearing loss (NSHL). Hearing impairments can be classified into syndromic and non-syndromic types. This study intended to assess the significance of mutations in these genes to the autosomal recessive/dominant non-syndromic genetic load among Iranian families. Two families were involved in this research and two patients were examined by targeted next-generation sequencing. Here we report two novel mutations in the MYO7A and EYA1 genes in two patients detected by targeted NGS. They were confirmed by Sanger sequencing and quantitative real-time PCR techniques. In this investigation, we identified a novel mutation in MYO7A, c.3751G>C, p.A1251P, along with another previously identified mutation (c.1708C>T) in one of the cases. This mutation is located in the MYTH4 protein domain, which is a pivotal domain for myosin function. Another finding in this research was a novel de novo deletion that removes the entire EYA1 coding region (EX1-18 DEL). Mutations in the EYA1 gene have been found in branchiootorenal (BOR) syndrome. Interestingly, the patient with the EYA1 deletion did not show any additional clinical implications apart from HL. This finding might argue for the sole involvement of EYA1 function in the mechanism of hearing. This investigation showed that the novel mutations in MYO7A, c.3751G>C, p.A1251P, and EYA1, EX1-18 DEL, were associated with NSHL. Our research increased the mutation spectrum of hearing loss in the Iranian population.
Nematode.net update 2011: addition of data sets and tools featuring next-generation sequencing data
Martin, John; Abubucker, Sahar; Heizer, Esley; Taylor, Christina M.; Mitreva, Makedonka
2012-01-01
Nematode.net (http://nematode.net) has been a publicly available resource for studying nematodes for over a decade. In the past 3 years, we reorganized Nematode.net to provide more user-friendly navigation through the site, a necessity due to the explosion of data from next-generation sequencing platforms. Organism-centric portals containing dynamically generated data are available for over 56 different nematode species. Next-generation data has been added to the various data-mining portals hosted, including NemaBLAST and NemaBrowse. The NemaPath metabolic pathway viewer builds associations using KOs, rather than ECs to provide more accurate and fine-grained descriptions of proteins. Two new features for data analysis and comparative genomics have been added to the site. NemaSNP enables the user to perform population genetics studies in various nematode populations using next-generation sequencing data. HelmCoP (Helminth Control and Prevention) as an independent component of Nematode.net provides an integrated resource for storage, annotation and comparative genomics of helminth genomes to aid in learning more about nematode genomes, as well as drug, pesticide, vaccine and drug target discovery. With this update, Nematode.net will continue to realize its original goal to disseminate diverse bioinformatic data sets and provide analysis tools to the broad scientific community in a useful and user-friendly manner. PMID:22139919
Aquatic environmental DNA detects seasonal fish abundance and habitat preference in an urban estuary
Soboleva, Lyubov; Charlop-Powers, Zachary
2017-01-01
The difficulty of censusing marine animal populations hampers effective ocean management. Analyzing water for DNA traces shed by organisms may aid assessment. Here we tested aquatic environmental DNA (eDNA) as an indicator of fish presence in the lower Hudson River estuary. A checklist of local marine fish and their relative abundance was prepared by compiling 12 traditional surveys conducted between 1988–2015. To improve eDNA identification success, 31 specimens representing 18 marine fish species were sequenced for two mitochondrial gene regions, boosting coverage of the 12S eDNA target sequence to 80% of local taxa. We collected 76 one-liter shoreline surface water samples at two contrasting estuary locations over six months beginning in January 2016. eDNA was amplified with vertebrate-specific 12S primers. Bioinformatic analysis of amplified DNA, using a reference library of GenBank and our newly generated 12S sequences, detected most (81%) locally abundant or common species and relatively few (23%) uncommon taxa, and corresponded to seasonal presence and habitat preference as determined by traditional surveys. Approximately 2% of fish reads were commonly consumed species that are rare or absent in local waters, consistent with wastewater input. Freshwater species were rarely detected despite Hudson River inflow. These results support further exploration and suggest eDNA will facilitate fine-scale geographic and temporal mapping of marine fish populations at relatively low cost. PMID:28403183
The eukaryotic signal sequence, YGRL, targets the chlamydial inclusion
Kabeiseman, Emily J.; Cichos, Kyle H.; Moore, Elizabeth R.
2014-01-01
Understanding how host proteins are targeted to pathogen-specified organelles, like the chlamydial inclusion, is fundamentally important to understanding the biogenesis of these unique subcellular compartments and how they maintain autonomy within the cell. Syntaxin 6, which localizes to the chlamydial inclusion, contains an YGRL signal sequence. The YGRL functions to return syntaxin 6 to the trans-Golgi from the plasma membrane, and deletion of the YGRL signal sequence from syntaxin 6 also prevents the protein from localizing to the chlamydial inclusion. YGRL is one of three YXXL (YGRL, YQRL, and YKGL) signal sequences which target proteins to the trans-Golgi. We designed various constructs of eukaryotic proteins to test the specificity and propensity of YXXL sequences to target the inclusion. The YGRL signal sequence redirects proteins (e.g., Tgn38, furin, syntaxin 4) that normally do not localize to the chlamydial inclusion. Further, the requirement of the YGRL signal sequence for syntaxin 6 localization to inclusions formed by different species of Chlamydia is conserved. These data indicate that there is an inherent property of the chlamydial inclusion, which allows it to recognize the YGRL signal sequence. To examine whether this “inherent property” was protein or lipid in nature, we asked if deletion of the YGRL signal sequence from syntaxin 6 altered the ability of the protein to interact with proteins or lipids. Deletion or alteration of the YGRL from syntaxin 6 does not appreciably impact syntaxin 6-protein interactions, but does decrease syntaxin 6-lipid interactions. Intriguingly, data also demonstrate that YKGL or YQRL can successfully substitute for YGRL in localization of syntaxin 6 to the chlamydial inclusion. Importantly and for the first time, we are establishing that a eukaryotic signal sequence targets the chlamydial inclusion. PMID:25309881
Brannon, A Rose; Vakiani, Efsevia; Sylvester, Brooke E; Scott, Sasinya N; McDermott, Gregory; Shah, Ronak H; Kania, Krishan; Viale, Agnes; Oschwald, Dayna M; Vacic, Vladimir; Emde, Anne-Katrin; Cercek, Andrea; Yaeger, Rona; Kemeny, Nancy E; Saltz, Leonard B; Shia, Jinru; D'Angelica, Michael I; Weiser, Martin R; Solit, David B; Berger, Michael F
2014-08-28
Colorectal cancer is the second leading cause of cancer death in the United States, with over 50,000 deaths estimated in 2014. Molecular profiling for somatic mutations that predict absence of response to anti-EGFR therapy has become standard practice in the treatment of metastatic colorectal cancer; however, the quantity and type of tissue available for testing is frequently limited. Further, the degree to which the primary tumor is a faithful representation of metastatic disease has been questioned. As next-generation sequencing technology becomes more widely available for clinical use and additional molecularly targeted agents are considered as treatment options in colorectal cancer, it is important to characterize the extent of tumor heterogeneity between primary and metastatic tumors. We performed deep coverage, targeted next-generation sequencing of 230 key cancer-associated genes for 69 matched primary and metastatic tumors and normal tissue. Mutation profiles were 100% concordant for KRAS, NRAS, and BRAF, and were highly concordant for recurrent alterations in colorectal cancer. Additionally, whole genome sequencing of four patient trios did not reveal any additional site-specific targetable alterations. Colorectal cancer primary tumors and metastases exhibit high genomic concordance. As current clinical practices in colorectal cancer revolve around KRAS, NRAS, and BRAF mutation status, diagnostic sequencing of either primary or metastatic tissue as available is acceptable for most patients. Additionally, consistency between targeted sequencing and whole genome sequencing results suggests that targeted sequencing may be a suitable strategy for clinical diagnostic applications.
Mechanism of resistance to mesotrione in an Amaranthus tuberculatus population from Nebraska, USA
Hutchings, Sarah-Jane; Dale, Richard P.; Howell, Anushka; Morris, James A.; Kramer, Vance C.; Shivrain, Vinod K.; Mcindoe, Eddie
2017-01-01
Amaranthus tuberculatus is a troublesome weed in corn and soybean production systems in Midwestern USA, due in part to its ability to evolve multiple resistance to key herbicides including 4-hydroxyphenylpyruvate dioxygenase (HPPD) inhibitors. Here we have investigated the mechanism of resistance to mesotrione, an important chemical for managing broadleaf weeds in corn, in a multiple herbicide resistant population (NEB) from Nebraska. NEB showed a 2.4-fold and 45-fold resistance increase to mesotrione compared to a standard sensitive population (SEN) in pre-emergence and post-emergence dose-response pot tests, respectively. Sequencing of the whole HPPD gene from 12 each of sensitive and resistant plants did not detect any target-site mutations that could be associated with post-emergence resistance to mesotrione in NEB. Resistance was not due to HPPD gene duplication or over-expression before or after herbicide treatment, as revealed by qPCR. Additionally, no difference in mesotrione uptake was detected between NEB and SEN. In contrast, higher levels of mesotrione metabolism via 4-hydroxylation of the dione ring were observed in NEB compared to the sensitive population. Overall, the NEB population was characterised by lower levels of parent mesotrione exported to other parts of the plant, either as a consequence of metabolism in the treated leaves and/or impaired translocation of the herbicide. This study demonstrates another case of non-target-site based resistance to an important class of herbicides in an A. tuberculatus population. The knowledge generated here will help design strategies for managing multiple herbicide resistance in this problematic weed species. PMID:28662111
Widespread genetic heterogeneity in multiple myeloma: implications for targeted therapy
Lohr, Jens G.; Stojanov, Petar; Carter, Scott L.; Cruz-Gordillo, Peter; Lawrence, Michael S.; Auclair, Daniel; Sougnez, Carrie; Knoechel, Birgit; Gould, Joshua; Saksena, Gordon; Cibulskis, Kristian; McKenna, Aaron; Chapman, Michael A.; Straussman, Ravid; Levy, Joan; Perkins, Louise M.; Keats, Jonathan J.; Schumacher, Steven E.; Rosenberg, Mara; Getz, Gad
2014-01-01
SUMMARY We performed massively parallel sequencing of paired tumor/normal samples from 203 multiple myeloma (MM) patients and identified significantly mutated genes and copy number alterations, and discovered putative tumor suppressor genes by determining homozygous deletions and loss-of-heterozygosity. We observed frequent mutations in KRAS (particularly in previously treated patients), NRAS, BRAF, FAM46C, TP53 and DIS3 (particularly in non-hyperdiploid MM). Mutations were often present in subclonal populations, and multiple mutations within the same pathway (e.g. KRAS, NRAS and BRAF) were observed in the same patient. In vitro modeling predicts only partial treatment efficacy of targeting subclonal mutations, and even growth promotion of non-mutated subclones in some cases. These results emphasize the importance of heterogeneity analysis for treatment decisions. PMID:24434212
Widespread genetic heterogeneity in multiple myeloma: implications for targeted therapy.
Lohr, Jens G; Stojanov, Petar; Carter, Scott L; Cruz-Gordillo, Peter; Lawrence, Michael S; Auclair, Daniel; Sougnez, Carrie; Knoechel, Birgit; Gould, Joshua; Saksena, Gordon; Cibulskis, Kristian; McKenna, Aaron; Chapman, Michael A; Straussman, Ravid; Levy, Joan; Perkins, Louise M; Keats, Jonathan J; Schumacher, Steven E; Rosenberg, Mara; Getz, Gad; Golub, Todd R
2014-01-13
We performed massively parallel sequencing of paired tumor/normal samples from 203 multiple myeloma (MM) patients and identified significantly mutated genes and copy number alterations and discovered putative tumor suppressor genes by determining homozygous deletions and loss of heterozygosity. We observed frequent mutations in KRAS (particularly in previously treated patients), NRAS, BRAF, FAM46C, TP53, and DIS3 (particularly in nonhyperdiploid MM). Mutations were often present in subclonal populations, and multiple mutations within the same pathway (e.g., KRAS, NRAS, and BRAF) were observed in the same patient. In vitro modeling predicts only partial treatment efficacy of targeting subclonal mutations, and even growth promotion of nonmutated subclones in some cases. These results emphasize the importance of heterogeneity analysis for treatment decisions. Copyright © 2014 Elsevier Inc. All rights reserved.
Project Team, Saudi Genome
2015-01-01
Oil wells, endless deserts, stifling heat, masses of pilgrims, and wealthy-looking urban areas still dominate the widespread mental image of Saudi Arabia. Currently, this image is being extended to include a recent endeavor that is reserving a global share in the limelight as one of the top ten genomics projects currently underway: the Saudi Human Genome Program (SHGP). With sound funding, dedicated resources, and national determination, the SHGP targets the sequencing of 100,000 human genomes over the next five years to conduct world-class genomics-based biomedical research in the Saudi population. Why this project was conceived and thought to be feasible, what is the ultimate target, and how it operates are the questions we answer in this article.
Metagenomics: A new horizon in cancer research
Banerjee, Joyita; Mishra, Neetu; Dhas, Yogita
2015-01-01
Metagenomics has broadened the scope of targeting microbes responsible for inducing various types of cancers. About 16.1% of cancers are associated with microbial infection. Metagenomics is an equitable way of identifying and studying micro-organisms within their habitat. In cancer research, this approach has revolutionized the way of identifying, analyzing and targeting the microbial diversity present in the tissue specimens of cancer patients. Genomic analysis of these micro-organisms through next-generation sequencing techniques facilitates recognition of the microbial populations in biopsies and of their evolutionary relationships with each other. In this review, an attempt has been made to present a current metagenomic view of the cancer microbiota. Different types of micro-organisms have been found to be linked to various types of cancers, thus contributing significantly to understanding the disease at the molecular level. PMID:26110115
Delavenne, E; Mounier, J; Déniel, F; Barbier, G; Le Blay, G
2012-04-16
Antifungal lactic acid bacteria (ALAB) biodiversity was evaluated in raw milk from ewe, cow and goat over one year period. Lactic acid bacteria were enumerated using 8 semi-selective media, and systematically screened for their antifungal activity against 4 spoilage fungi commonly encountered in dairy products. Depending on the selective medium, between 0.05% (Elliker agar) and 5.5% (LAMVAB agar) screened colonies showed an antifungal activity. The great majority of these active colonies originated from cow (49%) and goat (43%) milks, whereas only 8% were isolated from ewe milk. Penicillium expansum was the most frequently inhibited fungus with 48.5% of colonies active against P. expansum among the 1235 isolated, followed by Mucor plumbeus with 30.6% of active colonies, Kluyveromyces lactis with only 12.1% of active colonies and Pichia anomala with 8.7% of active colonies. In the tested conditions, 94% of the sequenced active colonies belonged to Lactobacillus. Among them, targeted fungal species differed according to the Lactobacillus group, whose presence largely depended on year period and milk origin. The Lb. casei and Lb. reuteri groups, predominantly recovered in summer/fall, were overrepresented in the population targeting M. plumbeus, whereas isolates from the Lb. plantarum group, predominantly recovered in spring, were overrepresented in the population targeting K. lactis, the ones belonging to the Lb. buchneri group, predominantly recovered in spring, were overrepresented in the population targeting P. anomala. Raw milk, especially cow and goat milks from the summer/fall period appeared to be a productive reservoir for antifungal lactobacilli. Copyright © 2012 Elsevier B.V. All rights reserved.
Kaundun, Shiv Shankhar; Bailly, Geraldine C; Dale, Richard P; Hutchings, Sarah-Jane; McIndoe, Eddie
2013-01-01
Acetyl-CoA carboxylase (ACCase) inhibiting herbicides are important products for the post-emergence control of grass weed species in small grain cereal crops. However, the appearance of resistance to ACCase herbicides over time has resulted in limited options for effective weed control of key species such as Lolium spp. In this study, we have used an integrated biological and molecular biology approach to investigate the mechanism of resistance to ACCase herbicides in a Lolium multiflorum Lam. population from the UK (UK21). The study revealed a novel tryptophan to serine mutation at ACCase codon position 1999 impacting on ACCase inhibiting herbicides to varying degrees. The W1999S mutation confers dominant resistance to pinoxaden and partially recessive resistance to cycloxydim and sethoxydim. On the other hand, plants containing the W1999S mutation were sensitive to clethodim and tepraloxydim. Additionally, population UK21 is characterised by other resistance mechanisms, very likely non-target site based, affecting several aryloxyphenoxypropionate (FOP) herbicides but not the practical field rate of pinoxaden. The wild type tryptophan and mutant serine alleles at ACCase position 1999 could be readily and positively identified with an original DNA based derived cleaved amplified polymorphic sequence (dCAPS) assay that uses the same PCR product but two different enzymes, one for each allele. This paper highlights intrinsic differences between ACCase inhibiting herbicides that could be exploited for controlling ryegrass populations such as UK21 characterised by compound-specific target site and non-target site resistance.
Kaundun, Shiv Shankhar; Bailly, Geraldine C.; Dale, Richard P.; Hutchings, Sarah-Jane; McIndoe, Eddie
2013-01-01
Background: Acetyl-CoA carboxylase (ACCase) inhibiting herbicides are important products for the post-emergence control of grass weed species in small grain cereal crops. However, the appearance of resistance to ACCase herbicides over time has resulted in limited options for effective weed control of key species such as Lolium spp. In this study, we have used an integrated biological and molecular biology approach to investigate the mechanism of resistance to ACCase herbicides in a Lolium multiflorum Lam. population from the UK (UK21). Methodology/Principal Findings: The study revealed a novel tryptophan to serine mutation at ACCase codon position 1999 impacting on ACCase inhibiting herbicides to varying degrees. The W1999S mutation confers dominant resistance to pinoxaden and partially recessive resistance to cycloxydim and sethoxydim. On the other hand, plants containing the W1999S mutation were sensitive to clethodim and tepraloxydim. Additionally, population UK21 is characterised by other resistance mechanisms, very likely non-target site based, affecting several aryloxyphenoxypropionate (FOP) herbicides but not the practical field rate of pinoxaden. The wild type tryptophan and mutant serine alleles at ACCase position 1999 could be readily and positively identified with an original DNA based derived cleaved amplified polymorphic sequence (dCAPS) assay that uses the same PCR product but two different enzymes, one for each allele. Conclusion/Significance: This paper highlights intrinsic differences between ACCase inhibiting herbicides that could be exploited for controlling ryegrass populations such as UK21 characterised by compound-specific target site and non-target site resistance. PMID:23469130
Tsetsarkin, Konstantin A; Liu, Guangping; Kenney, Heather; Hermance, Meghan; Thangamani, Saravanan; Pletnev, Alexander G
2016-09-13
Tick-borne viruses include medically important zoonotic pathogens that can cause life-threatening diseases. Unlike mosquito-borne viruses, whose impact can be restrained via mosquito population control programs, for tick-borne viruses vaccination remains the only reliable means of disease prevention. For live vaccine viruses, a concern exists that spillover from viremic vaccinees could introduce genetically modified viruses into a sustainable tick-vertebrate host transmission cycle in nature. To restrict tick-borne flavivirus (Langat virus, LGTV) vector tropism, we inserted target sequences for tick-specific microRNAs (mir-1, mir-275 and mir-279) individually or in combination into several distant regions of the LGTV genome. This caused selective attenuation of viral replication in tick-derived cells. LGTV expressing combinations of target sequences for tick- and vertebrate CNS-specific miRNAs were developed. The resulting viruses replicated efficiently and remained stable in simian Vero cells, which do not express these miRNAs, but were severely restricted in their replication in tick-derived cells. In addition, simultaneous dual miRNA targeting led to silencing of virus replication in live Ixodes ricinus ticks and abolished virus neurotropism in highly permissive newborn mice. The concurrent restriction of adverse replication events in vertebrate and invertebrate hosts will, therefore, ensure the environmental safety of live tick-borne virus vaccine candidates.
Leighton, Philip A; Schusser, Benjamin; Yi, Henry; Glanville, Jacob; Harriman, William
2015-01-01
Chicken immune responses to human proteins are often more robust than rodent responses because of the phylogenetic relationship between the different species. For discovery of a diverse panel of unique therapeutic antibody candidates, chickens therefore represent an attractive host for human-derived targets. Recent advances in monoclonal antibody technology, specifically new methods for the molecular cloning of antibody genes directly from primary B cells, have ushered in a new era of generating monoclonal antibodies from non-traditional host animals that were previously inaccessible through hybridoma technology. However, such monoclonals still require post-discovery humanization in order to be developed as therapeutics. To obviate the need for humanization, a modified strain of chickens could be engineered to express a human-sequence immunoglobulin variable region repertoire. Here, human variable genes introduced into the chicken immunoglobulin loci through gene targeting were evaluated for their ability to be recognized and diversified by the native chicken recombination machinery that is present in the B-lineage cell line DT40. After expansion in culture, the DT40 population accumulated genetic mutants that were detected via deep sequencing. Bioinformatic analysis revealed that the human targeted constructs perform as expected in the cell culture system, providing a measure of confidence that they will be functional in transgenic animals.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing
Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi
2016-01-01
Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
A long-term target detection approach in infrared image sequence
NASA Astrophysics Data System (ADS)
Li, Hang; Zhang, Qi; Li, Yuanyuan; Wang, Liqiang
2015-12-01
An automatic target detection method for long-term infrared (IR) image sequences acquired from a moving platform is proposed. First, based on non-linear histogram equalization, target candidates are segmented coarse-to-fine using two self-adaptive thresholds generated in the intensity space. The real target is then captured via two different selection approaches. At the beginning of the image sequence, the genuine target, which has little texture, is discriminated from other candidates using a contrast-based confidence measure. When the target becomes larger, we apply an online EM method to iteratively estimate and update the distributions of the target's size and position based on the prior detection results, and then recognize the genuine target as the one that satisfies both the size and position constraints. Experimental results demonstrate that the presented method is accurate, robust and efficient.
Mok, Calvin A; Au, Vinci; Thompson, Owen A; Edgley, Mark L; Gevirtzman, Louis; Yochem, John; Lowry, Joshua; Memar, Nadin; Wallenfang, Matthew R; Rasoloson, Dominique; Bowerman, Bruce; Schnabel, Ralf; Seydoux, Geraldine; Moerman, Donald G; Waterston, Robert H
2017-10-01
Mutants remain a powerful means for dissecting gene function in model organisms such as Caenorhabditis elegans. Massively parallel sequencing has simplified the detection of variants after mutagenesis but determining precisely which change is responsible for phenotypic perturbation remains a key step. Genetic mapping paradigms in C. elegans rely on bulk segregant populations produced by crosses with the problematic Hawaiian wild isolate and an excess of redundant information from whole-genome sequencing (WGS). To increase the repertoire of available mutants and to simplify identification of the causal change, we performed WGS on 173 temperature-sensitive (TS) lethal mutants and devised a novel mapping method. The mapping method uses molecular inversion probes (MIP-MAP) in a targeted sequencing approach to genetic mapping, and replaces the Hawaiian strain with a Million Mutation Project strain with high genomic and phenotypic similarity to the laboratory wild-type strain N2. We validated MIP-MAP on a subset of the TS mutants using a competitive selection approach to produce TS candidate mapping intervals with a mean size < 3 Mb. MIP-MAP successfully uses a non-Hawaiian mapping strain and multiplexed libraries are sequenced at a fraction of the cost of WGS mapping approaches. Our mapping results suggest that the collection of TS mutants contains a diverse library of TS alleles for genes essential to development and reproduction. MIP-MAP is a robust method to genetically map mutations in both viable and essential genes and should be adaptable to other organisms. It may also simplify tracking of individual genotypes within population mixtures. Copyright © 2017 by the Genetics Society of America.
Mok, Calvin A.; Au, Vinci; Thompson, Owen A.; Edgley, Mark L.; Gevirtzman, Louis; Yochem, John; Lowry, Joshua; Memar, Nadin; Wallenfang, Matthew R.; Rasoloson, Dominique; Bowerman, Bruce; Schnabel, Ralf; Seydoux, Geraldine; Moerman, Donald G.; Waterston, Robert H.
2017-01-01
Mutants remain a powerful means for dissecting gene function in model organisms such as Caenorhabditis elegans. Massively parallel sequencing has simplified the detection of variants after mutagenesis but determining precisely which change is responsible for phenotypic perturbation remains a key step. Genetic mapping paradigms in C. elegans rely on bulk segregant populations produced by crosses with the problematic Hawaiian wild isolate and an excess of redundant information from whole-genome sequencing (WGS). To increase the repertoire of available mutants and to simplify identification of the causal change, we performed WGS on 173 temperature-sensitive (TS) lethal mutants and devised a novel mapping method. The mapping method uses molecular inversion probes (MIP-MAP) in a targeted sequencing approach to genetic mapping, and replaces the Hawaiian strain with a Million Mutation Project strain with high genomic and phenotypic similarity to the laboratory wild-type strain N2. We validated MIP-MAP on a subset of the TS mutants using a competitive selection approach to produce TS candidate mapping intervals with a mean size < 3 Mb. MIP-MAP successfully uses a non-Hawaiian mapping strain and multiplexed libraries are sequenced at a fraction of the cost of WGS mapping approaches. Our mapping results suggest that the collection of TS mutants contains a diverse library of TS alleles for genes essential to development and reproduction. MIP-MAP is a robust method to genetically map mutations in both viable and essential genes and should be adaptable to other organisms. It may also simplify tracking of individual genotypes within population mixtures. PMID:28827289
Li, Ming; Wang, Rui; Xiang, Hua
2014-01-01
The prokaryotic immune system CRISPR/Cas (Clustered Regularly Interspaced Short Palindromic Repeats/CRISPR-associated genes) adapts to foreign invaders by acquiring their short deoxyribonucleic acid (DNA) fragments as spacers, which guide subsequent interference to foreign nucleic acids based on sequence matching. The adaptation mechanism avoiding acquiring ‘self’ DNA fragments is poorly understood. In Haloarcula hispanica, we previously showed that CRISPR adaptation requires being primed by a pre-existing spacer partially matching the invader DNA. Here, we further demonstrate that flanking a fully-matched target sequence, a functional PAM (protospacer adjacent motif) is still required to prime adaptation. Interestingly, interference utilizes only four PAM sequences, whereas adaptation-priming tolerates as many as 23 PAM sequences. This relaxed PAM selectivity explains how adaptation-priming maximizes its tolerance of PAM mutations (that escape interference) while avoiding mis-targeting the spacer DNA within CRISPR locus. We propose that the primed adaptation, which hitches and cooperates with the interference pathway, distinguishes target from non-target by CRISPR ribonucleic acid guidance and PAM recognition. PMID:24803673
Non-Adjacent Consonant Sequence Patterns in English Target Words during the First-Word Period
ERIC Educational Resources Information Center
Aoyama, Katsura; Davis, Barbara L.
2017-01-01
The goal of this study was to investigate non-adjacent consonant sequence patterns in target words during the first-word period in infants learning American English. In the spontaneous speech of eighteen participants, target words with a Consonant-Vowel-Consonant (C[subscript 1]VC[subscript 2]) shape were analyzed. Target words were grouped into…
Qiu, Ping; Stevens, Richard; Wei, Bo; Lahser, Fred; Howe, Anita Y. M.; Klappenbach, Joel A.; Marton, Matthew J.
2015-01-01
Genotyping of hepatitis C virus (HCV) plays an important role in the treatment of HCV. As new genotype-specific treatment options become available, it has become increasingly important to have accurate HCV genotype and subtype information to ensure that the most appropriate treatment regimen is selected. Most current genotyping methods are unable to detect mixed genotypes from two or more HCV infections. Next generation sequencing (NGS) allows for rapid and low cost mass sequencing of viral genomes and provides an opportunity to probe the viral population from a single host. In this paper, the possibility of using short NGS reads for direct HCV genotyping without genome assembly was evaluated. We surveyed the publicly-available genetic content of three HCV drug target regions (NS3, NS5A, NS5B) in terms of whether these genes contained genotype-specific regions that could predict genotype. Six genotypes and 38 subtypes were included in this study. An automated phylogenetic analysis based HCV genotyping method was implemented and used to assess different HCV target gene regions. Candidate regions of 250-bp each were found for all three genes that have enough genetic information to predict HCV genotypes/subtypes. Validation using public datasets shows 100% genotyping accuracy. To test whether these 250-bp regions were sufficient to identify mixed genotypes, we developed a random primer-based method to sequence HCV plasma samples containing mixtures of two HCV genotypes in different ratios. We were able to determine the genotypes without ambiguity and to quantify the ratio of the abundances of the mixed genotypes in the samples. These data provide a proof-of-concept that this random primed, NGS-based short-read genotyping approach does not need prior information about the viral population and is capable of detecting mixed viral infection. PMID:25830316
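The quantification step in the abstract above, estimating the ratio of two genotypes in a mixed sample, can be illustrated with a minimal sketch. It assumes each read has already been assigned a genotype label (or None when unassigned) by some upstream classifier; the function name, the labels and the read counts are hypothetical and are not taken from the study.

```python
from collections import Counter


def genotype_proportions(read_calls):
    """Return the fraction of assigned reads supporting each genotype.

    read_calls: iterable of genotype labels, with None for reads that
    could not be assigned. Unassigned reads are ignored (an assumption
    of this sketch, not a documented step of the published method).
    """
    counts = Counter(call for call in read_calls if call is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genotype: n / total for genotype, n in counts.items()}


# Hypothetical mixed sample: 75 reads called 1a, 25 called 3a, 10 unassigned.
calls = ["1a"] * 75 + ["3a"] * 25 + [None] * 10
proportions = genotype_proportions(calls)
print(proportions)
```

A real pipeline would also need to account for amplification bias and per-read classification error, which this sketch does not model.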
Delport, Tiffany C.; Asher, Amy J.; Beaumont, Linda J.; Webster, Koa N.; Harcourt, Robert G.; Power, Michelle L.
2014-01-01
Giardia and Cryptosporidium are amongst the most common protozoan parasites identified as causing enteric disease in pinnipeds. A number of Giardia assemblages and Cryptosporidium species and genotypes are common in humans and terrestrial mammals and have also been identified in marine mammals. To investigate the occurrence of these parasites in an endangered marine mammal, the Australian sea lion (Neophoca cinerea), genomic DNA was extracted from faecal samples collected from wild populations (n = 271) in Southern and Western Australia and three Australian captive populations (n = 19). These were screened using PCR targeting the 18S rRNA of Giardia and Cryptosporidium. Giardia duodenalis was detected in 28 wild sea lions and in seven captive individuals. Successful sequencing of the 18S rRNA gene assigned 27 Giardia isolates to assemblage B and one to assemblage A, both assemblages commonly found in humans. Subsequent screening at the gdh and β-giardin loci resulted in amplification of only one of the 35 18S rRNA positive samples at the β-giardin locus. Sequencing at the β-giardin locus assigned the assemblage B 18S rRNA confirmed isolate to assemblage AI. The geographic distribution of sea lion populations sampled in relation to human settlements indicated that Giardia presence in sea lions was highest in populations less than 25 km from humans. Cryptosporidium was not detected by PCR screening in either wild colonies or captive sea lion populations. These data suggest that the presence of G. duodenalis in the endangered Australian sea lion is likely the result of dispersal from human sources. Multilocus molecular analyses are essential for the determination of G. duodenalis assemblages and subsequent inferences on transmission routes to endangered marine mammal populations. PMID:25426423
Delport, Tiffany C; Asher, Amy J; Beaumont, Linda J; Webster, Koa N; Harcourt, Robert G; Power, Michelle L
2014-12-01
Giardia and Cryptosporidium are amongst the most common protozoan parasites identified as causing enteric disease in pinnipeds. A number of Giardia assemblages and Cryptosporidium species and genotypes are common in humans and terrestrial mammals and have also been identified in marine mammals. To investigate the occurrence of these parasites in an endangered marine mammal, the Australian sea lion (Neophoca cinerea), genomic DNA was extracted from faecal samples collected from wild populations (n = 271) in Southern and Western Australia and three Australian captive populations (n = 19). These were screened using PCR targeting the 18S rRNA of Giardia and Cryptosporidium. Giardia duodenalis was detected in 28 wild sea lions and in seven captive individuals. Successful sequencing of the 18S rRNA gene assigned 27 Giardia isolates to assemblage B and one to assemblage A, both assemblages commonly found in humans. Subsequent screening at the gdh and β-giardin loci resulted in amplification of only one of the 35 18S rRNA positive samples at the β-giardin locus. Sequencing at the β-giardin locus assigned the assemblage B 18S rRNA confirmed isolate to assemblage AI. The geographic distribution of sea lion populations sampled in relation to human settlements indicated that Giardia presence in sea lions was highest in populations less than 25 km from humans. Cryptosporidium was not detected by PCR screening in either wild colonies or captive sea lion populations. These data suggest that the presence of G. duodenalis in the endangered Australian sea lion is likely the result of dispersal from human sources. Multilocus molecular analyses are essential for the determination of G. duodenalis assemblages and subsequent inferences on transmission routes to endangered marine mammal populations.
Ma, Rong; Kaundun, Shiv S.; Tranel, Patrick J.; Riggins, Chance W.; McGinness, Daniel L.; Hager, Aaron G.; Hawkes, Tim; McIndoe, Eddie; Riechers, Dean E.
2013-01-01
Previous research reported the first case of resistance to mesotrione and other 4-hydroxyphenylpyruvate dioxygenase (HPPD) herbicides in a waterhemp (Amaranthus tuberculatus) population designated MCR (for McLean County mesotrione- and atrazine-resistant). Herein, experiments were conducted to determine if target site or nontarget site mechanisms confer mesotrione resistance in MCR. Additionally, the basis for atrazine resistance was investigated in MCR and an atrazine-resistant but mesotrione-sensitive population (ACR for Adams County mesotrione-sensitive but atrazine-resistant). A standard sensitive population (WCS for Wayne County herbicide-sensitive) was also used for comparison. Mesotrione resistance was not due to an alteration in HPPD sequence, HPPD expression, or reduced herbicide absorption. Metabolism studies using whole plants and excised leaves revealed that the time for 50% of absorbed mesotrione to degrade in MCR was significantly shorter than in ACR and WCS, which correlated with previous phenotypic responses to mesotrione and the quantity of the metabolite 4-hydroxy-mesotrione in excised leaves. The cytochrome P450 monooxygenase inhibitors malathion and tetcyclacis significantly reduced mesotrione metabolism in MCR and corn (Zea mays) excised leaves but not in ACR. Furthermore, malathion increased mesotrione activity in MCR seedlings in greenhouse studies. These results indicate that enhanced oxidative metabolism contributes significantly to mesotrione resistance in MCR. Sequence analysis of atrazine-resistant (MCR and ACR) and atrazine-sensitive (WCS) waterhemp populations detected no differences in the psbA gene. The times for 50% of absorbed atrazine to degrade in corn, MCR, and ACR leaves were shorter than in WCS, and a polar metabolite of atrazine was detected in corn, MCR, and ACR that cochromatographed with a synthetic atrazine-glutathione conjugate. 
Thus, elevated rates of metabolism via distinct detoxification mechanisms contribute to mesotrione and atrazine resistance within the MCR population. PMID:23872617
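The abstract above compares populations by the time for 50% of absorbed herbicide to degrade. As a rough sketch of how such a value can be derived from a single time point, the example below assumes simple first-order decay, which is an assumption of this illustration and not a kinetic model stated in the abstract. The function name and the numbers are hypothetical.

```python
import math


def dt50_first_order(fraction_remaining, hours):
    """Estimate the time for 50% degradation under first-order decay.

    fraction_remaining: parent compound remaining at `hours`, as a
    fraction strictly between 0 and 1.
    Assumes C(t) = C0 * exp(-k * t), so DT50 = ln(2) / k.
    """
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be between 0 and 1")
    if hours <= 0:
        raise ValueError("hours must be positive")
    rate_constant = -math.log(fraction_remaining) / hours
    return math.log(2) / rate_constant


# Hypothetical measurement: 25% of the absorbed parent compound left at 24 h,
# i.e. two half-lives have elapsed.
print(dt50_first_order(0.25, 24.0))
```

In practice a DT50 would be fitted across several time points with replicate measurements instead of being derived from one observation.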
Choi, Hong-Kyu; Kim, Dongjin; Uhm, Taesik; Limpens, Eric; Lim, Hyunju; Mun, Jeong-Hwan; Kalo, Peter; Penmetsa, R Varma; Seres, Andrea; Kulikova, Olga; Roe, Bruce A; Bisseling, Ton; Kiss, Gyorgy B; Cook, Douglas R
2004-01-01
A core genetic map of the legume Medicago truncatula has been established by analyzing the segregation of 288 sequence-characterized genetic markers in an F(2) population composed of 93 individuals. These molecular markers correspond to 141 ESTs, 80 BAC end sequence tags, and 67 resistance gene analogs, covering 513 cM. In the case of EST-based markers we used an intron-targeted marker strategy with primers designed to anneal in conserved exon regions and to amplify across intron regions. Polymorphisms were significantly more frequent in intron vs. exon regions, thus providing an efficient mechanism to map transcribed genes. Genetic and cytogenetic analysis produced eight well-resolved linkage groups, which have been previously correlated with eight chromosomes by means of FISH with mapped BAC clones. We anticipated that mapping of conserved coding regions would have utility for comparative mapping among legumes; thus 60 of the EST-based primer pairs were designed to amplify orthologous sequences across a range of legume species. As an initial test of this strategy, we used primers designed against M. truncatula exon sequences to rapidly map genes in M. sativa. The resulting comparative map, which includes 68 bridging markers, indicates that the two Medicago genomes are highly similar and establishes the basis for a Medicago composite map. PMID:15082563
Living laboratory: whole-genome sequencing as a learning healthcare enterprise.
Angrist, M; Jamal, L
2015-04-01
With the proliferation of affordable large-scale human genomic data come profound and vexing questions about management of such data and their clinical uncertainty. These issues challenge the view that genomic research on human beings can (or should) be fully segregated from clinical genomics, either conceptually or practically. Here, we argue that the sharp distinction between clinical care and research is especially problematic in the context of large-scale genomic sequencing of people with suspected genetic conditions. Core goals of both enterprises (e.g. understanding genotype-phenotype relationships; generating an evidence base for genomic medicine) are more likely to be realized at a population scale if both those ordering and those undergoing sequencing for diagnostic reasons are routinely and longitudinally studied. Rather than relying on expensive and lengthy randomized clinical trials and meta-analyses, we propose leveraging nascent clinical-research hybrid frameworks into a broader, more permanent instantiation of exploratory medical sequencing. Such an investment could enlighten stakeholders about the real-life challenges posed by whole-genome sequencing, such as establishing the clinical actionability of genetic variants, returning 'off-target' results to families, developing effective service delivery models and monitoring long-term outcomes. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Age effects on discrimination of timing in auditory sequences
NASA Astrophysics Data System (ADS)
Fitzgibbons, Peter J.; Gordon-Salant, Sandra
2004-08-01
The experiments examined age-related changes in temporal sensitivity to increments in the interonset intervals (IOI) of components in tonal sequences. Discrimination was examined using reference sequences consisting of five 50-ms tones separated by silent intervals; tone frequencies were either fixed at 4 kHz or varied within a 2-4-kHz range to produce spectrally complex patterns. The tonal IOIs within the reference sequences were either equal (200 or 600 ms) or varied individually with an average value of 200 or 600 ms to produce temporally complex patterns. The difference limen (DL) for increments of IOI was measured. Comparison sequences featured either equal increments in all tonal IOIs or increments in a single target IOI, with the sequential location of the target changing randomly across trials. Four groups of younger and older adults with and without sensorineural hearing loss participated. Results indicated that DLs for uniform changes of sequence rate were smaller than DLs for single target intervals, with the largest DLs observed for single targets embedded within temporally complex sequences. Older listeners performed more poorly than younger listeners in all conditions, but the largest age-related differences were observed for temporally complex stimulus conditions. No systematic effects of hearing loss were observed.
The influence of phonological priming on variability in articulation
NASA Astrophysics Data System (ADS)
Babel, Molly E.; Munson, Benjamin
2004-05-01
Previous research [Sevald and Dell, Cognition 53, 91-127 (1994)] has found that reiterant sequences of CVC words are produced more quickly when the prime word and target word share VC sequences (i.e., sequences like sit sick) than when they are identical (sequences like sick sick). Even slower production rates are found when primes and targets share a CV sequence (sequences like kick sick). These data have been used to support a model of speech production in which lexical items and their constituent phonemes are activated sequentially. The current experiment investigated whether phonological priming also influences variability in the acoustic characteristics of words. Specifically, we examined whether greater variability in the acoustic characteristics of target words was noted in the CV-related prime context than in the identical-prime context, and whether less variability was noted in the VC-related context. Thirty adult subjects with typical speech, language, and hearing ability produced reiterant two-word sequences that varied in their phonological similarity. The duration, first, and second formant frequencies of the target-words' vowels were measured. Preliminary analyses indicate that phonological priming does not have a systematic effect on variability in these acoustic parameters.
Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis; Pestel-Caron, Martine
2014-10-01
Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
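The discriminatory power quoted above (Simpson's index, D) can be illustrated with a short calculation. This is a minimal sketch that assumes the commonly used Hunter and Gaston formulation, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)); the abstract does not state which formulation was used, and the function name and the type labels below are hypothetical, not data from the study.

```python
from collections import Counter


def simpsons_index(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form, assumed here).

    type_labels: one typing result (e.g. an ST or VT label) per isolate.
    Returns the probability that two isolates drawn without replacement
    belong to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical typing results for six isolates (illustrative only).
example = ["VT1", "VT1", "VT2", "VT3", "VT3", "VT3"]
print(round(simpsons_index(example), 3))
```

A scheme that assigns every isolate its own type gives D = 1, and a scheme that cannot separate any isolates gives D = 0, which is why a higher D is read as more discriminating.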
Shortt, Jonathan A.; Card, Daren C.; Schield, Drew R.; Liu, Yang; Zhong, Bo; Castoe, Todd A.
2017-01-01
Background: In areas where schistosomiasis control programs have been implemented, morbidity and prevalence have been greatly reduced. However, to sustain these reductions and move towards interruption of transmission, new tools for disease surveillance are needed. Genomic methods have the potential to help trace the sources of new infections, and allow us to monitor drug resistance. Large-scale genotyping efforts for schistosome species have been hindered by cost, limited numbers of established target loci, and the small amount of DNA obtained from miracidia, the life stage most readily acquired from humans. Here, we present a method using next generation sequencing to provide high-resolution genomic data from S. japonicum for population-based studies. Methodology/Principal Findings: We applied whole genome amplification followed by double digest restriction site associated DNA sequencing (ddRADseq) to individual S. japonicum miracidia preserved on Whatman FTA cards. We found that we could effectively and consistently survey hundreds of thousands of variants from 10,000 to 30,000 loci from archived miracidia as old as six years. An analysis of variation from eight miracidia obtained from three hosts in two villages in Sichuan showed clear population structuring by village and host even within this limited sample. Conclusions/Significance: This high-resolution sequencing approach yields three orders of magnitude more information than microsatellite genotyping methods that have been employed over the last decade, creating the potential to answer detailed questions about the sources of human infections and to monitor drug resistance. Costs per sample range from $50-$200, depending on the amount of sequence information desired, and we expect these costs can be reduced further given continued reductions in sequencing costs, improvement of protocols, and parallelization. This approach provides new promise for applying modern genome-scale sampling to S. japonicum surveillance, and could be extended to other schistosome species and other parasitic helminthes. PMID:28107347
Zhu, Yuan O; Aw, Pauline P K; de Sessions, Paola Florez; Hong, Shuzhen; See, Lee Xian; Hong, Lewis Z; Wilm, Andreas; Li, Chen Hao; Hue, Stephane; Lim, Seng Gee; Nagarajan, Niranjan; Burkholder, William F; Hibberd, Martin
2017-10-27
Viral populations are complex, dynamic, and fast evolving. The evolution of groups of closely related viruses in a competitive environment is termed quasispecies. To fully understand the role that quasispecies play in viral evolution, characterizing the trajectories of viral genotypes in an evolving population is the key. In particular, long-range haplotype information for thousands of individual viruses is critical; yet generating this information is non-trivial. Popular deep sequencing methods generate relatively short reads that do not preserve linkage information, while third generation sequencing methods have higher error rates that make detection of low frequency mutations a bioinformatics challenge. Here we applied BAsE-Seq, an Illumina-based single-virion sequencing technology, to eight samples from four chronic hepatitis B (CHB) patients - once before antiviral treatment and once after viral rebound due to resistance. With single-virion sequencing, we obtained 248-8796 single-virion sequences per sample, which allowed us to find evidence for both hard and soft selective sweeps. We were able to reconstruct population demographic history that was independently verified by clinically collected data. We further verified four of the samples independently through PacBio SMRT and Illumina Pooled deep sequencing. Overall, we showed that single-virion sequencing yields insight into viral evolution and population dynamics in an efficient and high throughput manner. We believe that single-virion sequencing is widely applicable to the study of viral evolution in the context of drug resistance and host adaptation, allows differentiation between soft or hard selective sweeps, and may be useful in the reconstruction of intra-host viral population demographic history.
Ketter-Ratzon, Dafna; Tirosh-Levy, Sharon; Nachum-Biala, Yaarit; Saar, Tal; Qura'n, Lara; Zivotofsky, Doni; Abdeen, Ziad; Baneth, Gad; Steinman, Amir
2017-06-01
Equine theileriosis caused by Theileria equi is endemic in the Middle East, where it causes a severe disease as well as widespread subclinical infection. The aim of this study was to evaluate the diversity of T. equi genotypes in Israel and the neighboring Palestinian Authority and Jordan. Blood samples from 355 horses from Israel, the Palestinian Authority and Jordan were tested for the prevalence of T. equi DNA. Two hundred and fourteen (60%) were found positive for T. equi infection by PCR. Of those, the 18S rRNA (1458bp) and the EMA-1 (745bp) genes of T. equi were sequenced from 15 horse samples that represent Israel's geographical distribution together with four samples from the Palestinian Authority and two from Jordan. The results were used for genotype characterization and phylogenetic analysis of T. equi in the equine population in Israel and its surroundings. Three 18S rRNA genotype clades were found in Israel (A, C and D) with clade D being the most prevalent and included all four isolates from the PA. In contrast, the EMA-1 gene showed little diversity with all sequences clustering in the same clade apart from one Jordanian sequence. Results suggest that although the Israeli horse population is small and relatively confined geographically, it is probable that the genetic variability, which was found among Israeli horses, is a result of introduction of horses from other countries. It also suggests that the EMA-1 gene is probably not a good target for the evaluation of variance in T. equi populations. Characterization of the different genotypes prevalent in a certain region is important in order to map out the intra-species sequence heterogeneity of the parasite, which is needed in order to develop new diagnostic tools and vaccines. Copyright © 2017 Elsevier GmbH. All rights reserved.
Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H
1999-01-01
We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits. PMID:10388831
Yates, Kathleen B.; Bi, Kevin; Darko, Samuel; Godec, Jernej; Gerdemann, Ulrike; Swadling, Leo; Douek, Daniel C.; Klenerman, Paul; Barnes, Eleanor J.; Sharpe, Arlene H.
2017-01-01
The T cell compartment must contain diversity in both T cell receptor (TCR) repertoire and cell state to provide effective immunity against pathogens. However, it remains unclear how differences in the TCR contribute to heterogeneity in T cell state. Single cell RNA-sequencing (scRNA-seq) can allow simultaneous measurement of TCR sequence and global transcriptional profile from single cells. However, current methods for TCR inference from scRNA-seq are limited in their sensitivity and require long sequencing reads, thus increasing the cost and decreasing the number of cells that can be feasibly analyzed. Here we present TRAPeS, a publicly available tool that can efficiently extract TCR sequence information from short-read scRNA-seq libraries. We apply it to investigate heterogeneity in the CD8+ T cell response in humans and mice, and show that it is accurate and more sensitive than existing approaches. Coupling TRAPeS with transcriptome analysis of CD8+ T cells specific for a single epitope from Yellow Fever Virus (YFV), we show that the recently described ‘naive-like’ memory population have significantly longer CDR3 regions and greater divergence from germline sequence than do effector-memory phenotype cells. This suggests that TCR usage is associated with the differentiation state of the CD8+ T cell response to YFV. PMID:28934479
Drummond, Alexei J; Nicholls, Geoff K; Rodrigo, Allen G; Solomon, Wiremu
2002-01-01
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences. PMID:12136032
Drummond, Alexei J; Nicholls, Geoff K; Rodrigo, Allen G; Solomon, Wiremu
2002-07-01
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
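The Kingman coalescent mentioned above has a simple time structure that can be sketched directly. The example below is only an illustration under simplifying assumptions that differ from the abstract's setting: all sequences are sampled at the same time, the population size is constant, and time is measured in coalescent units, so that k lineages coalesce at rate k(k - 1)/2. It is not the serial-sample MCMC method of the abstract, and the function names are hypothetical.

```python
import random


def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n lineages.

    With k lineages the waiting time to the next coalescence is
    exponential with rate k(k - 1)/2, so the expectation is the sum
    of 2 / (k(k - 1)) for k = 2..n, which equals 2(1 - 1/n).
    """
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


def simulate_tmrca(n, rng):
    """Draw one TMRCA by summing the exponential waiting times."""
    t = 0.0
    for k in range(n, 1, -1):
        t += rng.expovariate(k * (k - 1) / 2)
    return t


rng = random.Random(1)
draws = [simulate_tmrca(10, rng) for _ in range(20000)]
print(expected_tmrca(10), sum(draws) / len(draws))
```

The simulated mean should sit close to the analytical value; in a full analysis the genealogy is not observed, which is why the abstract integrates over it with MCMC.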
Polanski, A; Kimmel, M; Chakraborty, R
1998-05-12
Distribution of pairwise differences of nucleotides from data on a sample of DNA sequences from a given segment of the genome has been used in the past to draw inferences about the past history of population size changes. However, all earlier methods assume a given model of population size changes (such as sudden expansion), parameters of which (e.g., time and amplitude of expansion) are fitted to the observed distributions of nucleotide differences among pairwise comparisons of all DNA sequences in the sample. Our theory indicates that for any time-dependent population size, N(tau) (in which time tau is counted backward from present), a time-dependent coalescence process yields the distribution, p(tau), of the time of coalescence between two DNA sequences randomly drawn from the population. Prediction of p(tau) and N(tau) requires the use of a reverse Laplace transform known to be unstable. Nevertheless, simulated data obtained from three models of monotone population change (stepwise, exponential, and logistic) indicate that the pattern of a past population size change leaves its signature on the pattern of DNA polymorphism. Application of the theory to the published mtDNA sequences indicates that the current mtDNA sequence variation is not inconsistent with a logistic growth of the human population.
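The observed distribution of pairwise nucleotide differences that this abstract starts from can be computed as follows. This is a minimal sketch that assumes the sequences are already aligned and of equal length; the function name and the toy sequences are hypothetical, and the sketch covers only the summary statistic, not the inference of N(tau).

```python
from collections import Counter
from itertools import combinations


def pairwise_difference_distribution(seqs):
    """Relative frequency of each number of differences over all pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}


# Toy alignment of three sequences (illustrative only).
toy = ["ACGT", "ACGA", "TCGA"]
print(pairwise_difference_distribution(toy))
```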
Genome-wide selection components analysis in a fish with male pregnancy.
Flanagan, Sarah P; Jones, Adam G
2017-04-01
A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An FST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the FST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
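The allele-frequency comparison between groups described above can be illustrated with a single-locus calculation. The sketch assumes Wright's definition, FST = (H_T - H_S) / H_T, for one biallelic locus and two equally weighted groups; the abstract does not say which estimator the authors used, and the function name and frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's FST at one biallelic locus for two equally weighted groups.

    p1, p2: frequency of the same allele in each group (0..1).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele in both groups
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, e.g. adult females versus pregnant males.
print(fst_biallelic(0.2, 0.8))
```

Identical frequencies give 0 and alternately fixed alleles give 1; loci with unusually high values between life-history groups are the kind of outliers such a scan would flag.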
Immunotherapy in managing metastatic melanoma: which treatment when?
Amaral, Teresa; Meraz-Torres, Francisco; Garbe, Claus
2017-12-01
Ten to fifteen percent of melanoma patients develop distant or unresectable metastasis requiring systemic treatment. Around 45% of the patients diagnosed with metastatic cutaneous melanoma harbor a BRAFV600 mutation and derive benefit from combined targeted therapy with MAPK pathway inhibitors. These offer a rapid response that translates into improvement of symptoms and increased quality of life. However, resistance often develops with subsequent progressive disease. Immunotherapy with checkpoint inhibitors may be offered to BRAF-mutated and wild-type patients and is associated with longer and durable responses that can continue over years. Areas covered: In this review, the authors discuss the latest evidence for targeted and immunotherapy in melanoma patients, as well as therapy sequencing. Immunotherapy in special populations is also addressed. Expert opinion: Effective treatments are currently available. However, there are still unanswered questions of the best therapy sequence, the clear superiority of combined immunotherapy versus monotherapy in all patients, and therapy duration. Since different promising treatments will become available, clinical trials comparing the diverse options in terms of safety, efficacy and cost-effectiveness are required to make the right decisions. Consequently, patients should be encouraged to participate in clinical trials, whenever possible.
Oldach, Klaus H; Peck, David M; Nair, Ramakrishnan M; Sokolova, Maria; Harris, John; Bogacki, Paul; Ballard, Ross
2014-04-17
The nematode Pratylenchus neglectus has a wide host range and is able to feed on the root systems of cereals, oilseeds, grain and pasture legumes. Under the Mediterranean low rainfall environments of Australia, annual Medicago pasture legumes are used in rotation with cereals to fix atmospheric nitrogen and improve soil parameters. Considerable efforts are being made in breeding programs to improve resistance and tolerance to Pratylenchus neglectus in the major crops wheat and barley, which makes it vital to develop appropriate selection tools in medics. A strong source of tolerance to root damage by the root lesion nematode (RLN) Pratylenchus neglectus had previously been identified in line RH-1 (strand medic, M. littoralis). Using RH-1, we have developed a single seed descent (SSD) population of 138 lines by crossing it to the intolerant cultivar Herald. After inoculation, RLN-associated root damage clearly segregated in the population. Genetic analysis was performed by constructing a genetic map using simple sequence repeat (SSR) and gene-based SNP markers. A highly significant quantitative trait locus (QTL), QPnTolMl.1, was identified explaining 49% of the phenotypic variation in the SSD population. All SSRs and gene-based markers in the QTL region were derived from chromosome 1 of the sequenced genome of the closely related species M. truncatula. Gene-based markers were validated in advanced breeding lines derived from the RH-1 parent and also a second RLN tolerance source, RH-2 (M. truncatula ssp. tricycla). Comparative analysis to sequenced legume genomes showed that the physical QTL interval exists as a synteny block in Lotus japonicus, common bean, soybean and chickpea. Furthermore, using the sequenced genome information of M. truncatula, the QTL interval contains 55 genes out of which five are discussed as potential candidate genes responsible for the mapped tolerance. 
The closely linked set of SNP-based PCR markers is directly applicable to select for two different sources of RLN tolerance in breeding programs. Moreover, genome sequence information has allowed proposing candidate genes for further functional analysis and nominates QPnTolMl.1 as a target locus for RLN tolerance in economically important grain legumes, e.g. chickpea.
Using populations of human and microbial genomes for organism detection in metagenomes
Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel; ...
2015-04-29
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.
Using populations of human and microbial genomes for organism detection in metagenomes.
Ames, Sasha K; Gardner, Shea N; Marti, Jose Manuel; Slezak, Tom R; Gokhale, Maya B; Allen, Jonathan E
2015-07-01
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected. © 2015 Ames et al.; Published by Cold Spring Harbor Laboratory Press.
Using populations of human and microbial genomes for organism detection in metagenomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.
Yu, Qin; Abdallah, Ibrahim; Han, Heping; Owen, Mechelle; Powles, Stephen
2009-09-01
This study investigates mechanisms of multiple resistance to glyphosate, acetyl-coenzyme A carboxylase (ACCase) and acetolactate synthase (ALS)-inhibiting herbicides in two Lolium rigidum populations from Australia. When treated with glyphosate, susceptible (S) plants accumulated 4- to 6-fold more shikimic acid than resistant (R) plants. The resistant plants did not have the known glyphosate resistance endowing mutation of 5-enolpyruvylshikimate-3 phosphate synthase (EPSPS) at Pro-106, nor was there over-expression of EPSPS in either of the R populations. However, [(14)C]-glyphosate translocation experiments showed that the R plants in both populations have altered glyphosate translocation patterns compared to the S plants. The R plants showed much less glyphosate translocation to untreated young leaves, but more to the treated leaf tip, than did the S plants. Sequencing of the carboxyl transferase domain of the plastidic ACCase gene revealed no resistance endowing amino acid substitutions in the two R populations, and the ALS in vitro inhibition assay demonstrated herbicide-sensitive ALS in the ALS R population (WALR70). By using the cytochrome P450 inhibitor malathion and amitrole with ALS and ACCase herbicides, respectively, we showed that malathion reverses chlorsulfuron resistance and amitrole reverses diclofop resistance in the R population examined. Therefore, we conclude that multiple glyphosate, ACCase and ALS herbicide resistance in the two R populations is due to the presence of distinct non-target site based resistance mechanisms for each herbicide. Glyphosate resistance is due to reduced rates of glyphosate translocation, and resistance to ACCase and ALS herbicides is likely due to enhanced herbicide metabolism involving different cytochrome P450 enzymes.
Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding.
Lan, Freeman; Demaree, Benjamin; Ahmed, Noorsher; Abate, Adam R
2017-07-01
The application of single-cell genome sequencing to large cell populations has been hindered by technical challenges in isolating single cells during genome preparation. Here we present single-cell genomic sequencing (SiC-seq), which uses droplet microfluidics to isolate, fragment, and barcode the genomes of single cells, followed by Illumina sequencing of pooled DNA. We demonstrate ultra-high-throughput sequencing of >50,000 cells per run in a synthetic community of Gram-negative and Gram-positive bacteria and fungi. The sequenced genomes can be sorted in silico based on characteristic sequences. We use this approach to analyze the distributions of antibiotic-resistance genes, virulence factors, and phage sequences in microbial communities from an environmental sample. The ability to routinely sequence large populations of single cells will enable the de-convolution of genetic heterogeneity in diverse cell populations.
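Because the barcoded fragments are pooled before sequencing, reads have to be regrouped by barcode afterwards. The sketch below shows that regrouping step in its simplest form. It assumes, purely for illustration, that each read carries its barcode as a fixed-length prefix; the abstract does not describe the actual read layout or the SiC-seq software, and the function name and reads are hypothetical.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, barcode_len):
    """Map each barcode to the list of read inserts that carry it.

    Assumption: the first barcode_len bases of every read are the barcode.
    """
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(insert)
    return dict(groups)


# Hypothetical pooled reads with 4-base barcodes.
pooled = ["AAAAGATTACA", "CCCCTTGACCA", "AAAACATTAGG"]
print(group_reads_by_barcode(pooled, 4))
```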
Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E
2015-02-18
Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
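The general idea behind a k-mer strategy can be sketched as a set difference: k-mers that occur in a read but nowhere in the reference mark sequence that the reference does not explain. This is an illustration of that idea only and is not BreaKmer's algorithm, whose details the abstract does not give; the function names, the sequences, and the choice of k are all hypothetical.

```python
def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read, reference, k):
    """k-mers present in the read but absent from the reference."""
    return kmers(read, k) - kmers(reference, k)


# Toy example: the read differs from the reference in its middle bases.
reference = "ACGTACGTTT"
read = "ACGTGGGTTT"
print(sorted(novel_kmers(read, reference, 4)))
```

A real pipeline would then have to assemble reads sharing such k-mers into a consensus and realign it, as the abstract describes; that step is not attempted here.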
Altmüller, Janine; Budde, Birgit S; Nürnberg, Peter
2014-02-01
Targeted re-sequencing such as gene panel sequencing (GPS) has become very popular in medical genetics, both for research projects and in diagnostic settings. The technical principles of the different enrichment methods have been reviewed several times before; however, new enrichment products are constantly entering the market, and researchers are often puzzled about the requirement to take decisions about long-term commitments, both for the enrichment product and the sequencing technology. This review summarizes important considerations for the experimental design and provides helpful recommendations in choosing the best sequencing strategy for various research projects and diagnostic applications.
Labeled nucleotide phosphate (NP) probes
Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]
2009-02-03
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Kaundun, Shiv Shankhar; Hutchings, Sarah-Jane; Dale, Richard P.; McIndoe, Eddie
2013-01-01
Background Knowledge of the mechanisms of herbicide resistance is important for designing long term sustainable weed management strategies. Here, we have used an integrated biology and molecular approach to investigate the mechanisms of resistance to acetyl-CoA carboxylase inhibiting herbicides in a UK black-grass population (BG2). Methodology/Principal Findings Comparison between BG2 phenotypes using single discriminant rates of herbicides and genotypes based on ACCase gene sequencing showed that the I1781L, a novel I1781T, but not the W2027C mutations, were associated with resistance to cycloxydim. All plants were killed with clethodim and a few individuals containing the I1781L mutation were partially resistant to tepraloxydim. Whole plant dose response assays demonstrated that a single copy of the mutant T1781 allele conferred fourfold resistance levels to cycloxydim and clodinafop-propargyl. In contrast, the impact of the I1781T mutation was low (Rf = 1.6) and non-significant on pinoxaden. BG2 was also characterised by high levels of resistance, very likely non-target site based, to the two cereal selective herbicides clodinafop-propargyl and pinoxaden and not to the poorly metabolisable cyclohexanedione herbicides. Analysis of 480 plants from 40 cycloxydim resistant black grass populations from the UK using two very effective and high throughput dCAPS assays established for detecting any amino acid changes at the 1781 ACCase codon and for positively identifying the threonine residue, showed that the occurrence of the T1781 is extremely rare compared to the L1781 allele. Conclusion/Significance This study revealed a novel mutation at ACCase codon position 1781 and adequately assessed target site and non-target site mechanisms in conferring resistance to several ACCase herbicides in a black-grass population. 
It highlights that over time the level of suspected non-target site resistance to some cereal selective ACCase herbicides has in some instances surpassed that of target site resistance, including the one endowed by the most commonly encountered I1781L mutation. PMID:23936046
Hoshino, Tatsuhiko; Inagaki, Fumio
2017-01-01
Next-generation sequencing (NGS) is a powerful tool for analyzing environmental DNA and provides a comprehensive molecular view of microbial communities. To obtain the copy number of particular sequences in the NGS library, however, additional quantitative analysis such as quantitative PCR (qPCR) or digital PCR (dPCR) is required. Furthermore, the number of sequences in a sequence library does not always reflect the original copy number of a target gene because of biases caused by PCR amplification, making it difficult to convert the proportion of particular sequences in the NGS library to the copy number using the mass of input DNA. To address this issue, we applied a stochastic labeling approach with random-tag sequences and developed an NGS-based quantification protocol, which enables simultaneous sequencing and quantification of the targeted DNA. This quantitative sequencing (qSeq) is initiated from single-primer extension (SPE) using a primer with a random tag adjacent to the 5' end of the target-specific sequence. During SPE, each DNA molecule is stochastically labeled with the random tag. Subsequently, first-round PCR is conducted, specifically targeting the SPE product, followed by second-round PCR to index for NGS. The number of random tags is determined only during the SPE step and is therefore not affected by the two rounds of PCR that may introduce amplification biases. In the case of 16S rRNA genes, after NGS sequencing and taxonomic classification, the absolute number of 16S rRNA genes of target phylotypes can be estimated by Poisson statistics by counting the random tags incorporated at the end of each sequence. To test the feasibility of this approach, the 16S rRNA gene of Sulfolobus tokodaii was subjected to qSeq, which resulted in accurate quantification of 5.0 × 10^3 to 5.0 × 10^4 copies of the 16S rRNA gene.
Furthermore, qSeq was applied to mock microbial communities and environmental samples, and the results were comparable to those obtained using digital PCR and relative abundance based on a standard sequence library. We demonstrated that the qSeq protocol proposed here is advantageous for providing less-biased absolute copy numbers of each target DNA with NGS sequencing at one time. With this new experimental scheme, microbial community compositions can be explored in a more quantitative manner, thus expanding our knowledge of microbial ecosystems in natural environments.
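The Poisson step mentioned in this abstract can be sketched as follows. This is a generic stochastic-labeling correction, not necessarily the exact estimator used in the qSeq protocol. It assumes that every molecule picks one tag uniformly at random from a pool of 4^L possible tags. The expected number of distinct tags observed is then M(1 - exp(-N/M)) for N molecules and M possible tags, which can be inverted to estimate N. The tag length of 8 nt and the function name are hypothetical.

```python
import math


def estimate_molecules(unique_tags, tag_length):
    """Estimate the number of labeled molecules from the count of distinct tags.

    Assumes uniform random tags of `tag_length` nucleotides (an illustrative
    assumption), so the tag pool has M = 4 ** tag_length members and
    E[unique] = M * (1 - exp(-N / M)).
    """
    pool = 4 ** tag_length
    if not 0 <= unique_tags < pool:
        raise ValueError("unique_tags must be in [0, 4 ** tag_length)")
    return -pool * math.log(1 - unique_tags / pool)


# Hypothetical example: 5000 distinct tags observed with an 8-nt random tag.
estimate = estimate_molecules(5000, 8)
```

The estimate is always at least as large as the raw tag count, because some molecules are expected to share a tag by chance. The correction grows as the number of observed tags approaches the size of the tag pool.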
Biswas, Ambarish; Gagnon, Joshua N.; Brouns, Stan J.J.; Fineran, Peter C.; Brown, Chris M.
2013-01-01
The bacterial and archaeal CRISPR/Cas adaptive immune system targets specific protospacer nucleotide sequences in invading organisms. This requires base pairing between processed CRISPR RNA and the target protospacer. For type I and II CRISPR/Cas systems, protospacer adjacent motifs (PAM) are essential for target recognition, and for type III, mismatches in the flanking sequences are important in the antiviral response. In this study, we examine the properties of each class of CRISPR. We use this information to provide a tool (CRISPRTarget) that predicts the most likely targets of CRISPR RNAs (http://bioanalysis.otago.ac.nz/CRISPRTarget). This can be used to discover targets in newly sequenced genomic or metagenomic data. To test its utility, we discover features and targets of well-characterized Streptococcus thermophilus and Sulfolobus solfataricus type II and III CRISPR/Cas systems. Finally, in Pectobacterium species, we identify new CRISPR targets and propose a model of temperate phage exposure and subsequent inhibition by the type I CRISPR/Cas systems. PMID:23492433
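The combination of base pairing and a flanking motif described in this abstract can be sketched in a few lines. This is not the CRISPRTarget algorithm. It assumes exact matching on a single strand, a PAM located immediately 3' of the protospacer, and the motif NGG, all chosen only for illustration. Real systems differ in PAM position and sequence and tolerate mismatches, and all names and sequences below are hypothetical.

```python
def pam_matches(site, pam):
    """True if `site` fits the PAM pattern, where N matches any base."""
    return len(site) == len(pam) and all(
        p == "N" or p == s for p, s in zip(pam, site)
    )


def find_targets(target, protospacer, pam="NGG"):
    """List (start, flank, has_pam) for each exact protospacer match."""
    hits = []
    start = target.find(protospacer)
    while start != -1:
        end = start + len(protospacer)
        flank = target[end:end + len(pam)]
        hits.append((start, flank, pam_matches(flank, pam)))
        start = target.find(protospacer, start + 1)
    return hits


# Toy target (assumed): two copies of the protospacer, only one followed by NGG.
spacer = "ACGTACGTTA"
target = "TT" + spacer + "AGG" + "CC" + spacer + "CTA" + "A"
hits = find_targets(target, spacer)
```

Ranking candidate targets by both match quality and flanking sequence, rather than by sequence identity alone, is the kind of scoring the abstract attributes to the tool.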
Dubinett - Targeted Sequencing 2012 — EDRN Public Portal
We propose to use targeted massively parallel DNA sequencing to identify somatic alterations within mutational hotspots in matched sets of primary lung tumors, premalignant lesions, and adjacent, histologically normal lung tissue.
A high-throughput Sanger strategy for human mitochondrial genome sequencing
2013-01-01
Background A population reference database of complete human mitochondrial genome (mtGenome) sequences is needed to enable the use of mitochondrial DNA (mtDNA) coding region data in forensic casework applications. However, the development of entire mtGenome haplotypes to forensic data quality standards is difficult and laborious. A Sanger-based amplification and sequencing strategy that is designed for automated processing, yet routinely produces high quality sequences, is needed to facilitate high-volume production of these mtGenome data sets. Results We developed a robust 8-amplicon Sanger sequencing strategy that regularly produces complete, forensic-quality mtGenome haplotypes in the first pass of data generation. The protocol works equally well on samples representing diverse mtDNA haplogroups and DNA input quantities ranging from 50 pg to 1 ng, and can be applied to specimens of varying DNA quality. The complete workflow was specifically designed for implementation on robotic instrumentation, which increases throughput and reduces both the opportunities for error inherent to manual processing and the cost of generating full mtGenome sequences. Conclusions The described strategy will assist efforts to generate complete mtGenome haplotypes which meet the highest data quality expectations for forensic genetic and other applications. Additionally, high-quality data produced using this protocol can be used to assess mtDNA data developed using newer technologies and chemistries. Further, the amplification strategy can be used to enrich for mtDNA as a first step in sample preparation for targeted next-generation sequencing. PMID:24341507
Bidirectional Retroviral Integration Site PCR Methodology and Quantitative Data Analysis Workflow.
Suryawanshi, Gajendra W; Xu, Song; Xie, Yiming; Chou, Tom; Kim, Namshin; Chen, Irvin S Y; Kim, Sanggu
2017-06-14
Integration Site (IS) assays are a critical component of the study of retroviral integration sites and their biological significance. In recent retroviral gene therapy studies, IS assays, in combination with next-generation sequencing, have been used as a cell-tracking tool to characterize clonal stem cell populations sharing the same IS. For the accurate comparison of repopulating stem cell clones within and across different samples, the detection sensitivity, data reproducibility, and high-throughput capacity of the assay are among the most important assay qualities. This work provides a detailed protocol and data analysis workflow for bidirectional IS analysis. The bidirectional assay can simultaneously sequence both upstream and downstream vector-host junctions. Compared to conventional unidirectional IS sequencing approaches, the bidirectional approach significantly improves IS detection rates and the characterization of integration events at both ends of the target DNA. The data analysis pipeline described here accurately identifies and enumerates identical IS sequences through multiple steps of comparison that map IS sequences onto the reference genome and determine sequencing errors. Using an optimized assay procedure, we have recently published the detailed repopulation patterns of thousands of Hematopoietic Stem Cell (HSC) clones following transplant in rhesus macaques, demonstrating for the first time the precise time point of HSC repopulation and the functional heterogeneity of HSCs in the primate system. The following protocol describes the step-by-step experimental procedure and data analysis workflow that accurately identifies and quantifies identical IS sequences.
Chromosomal targeting by CRISPR-Cas systems can contribute to genome plasticity in bacteria
Dy, Ron L; Pitman, Andrew R; Fineran, Peter C
2013-01-01
The clustered regularly interspaced short palindromic repeats (CRISPR) and their associated (Cas) proteins form adaptive immune systems in bacteria to combat phage and other foreign genetic elements. Typically, short spacer sequences are acquired from the invader DNA and incorporated into CRISPR arrays in the bacterial genome. Small RNAs are generated that contain these spacer sequences and enable sequence-specific destruction of the foreign nucleic acids. Occasionally, spacers are acquired from the chromosome, which instead leads to targeting of the host genome. Chromosomal targeting is highly toxic to the bacterium, providing a strong selective pressure for a variety of evolutionary routes that enable host cell survival. Mutations that inactivate the CRISPR-Cas functionality, such as within the cas genes, CRISPR repeat, protospacer adjacent motifs (PAM), and target sequence, mediate escape from toxicity. This self-targeting might provide some explanation for the incomplete distribution of CRISPR-Cas systems in less than half of sequenced bacterial genomes. More importantly, self-genome targeting can cause large-scale genomic alterations, including remodeling or deletion of pathogenicity islands and other non-mobile chromosomal regions. While control of horizontal gene transfer is perceived as their main function, our recent work illuminates an alternative role of CRISPR-Cas systems in causing host genomic changes and influencing bacterial evolution. PMID:24251073
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Huber, David J.; Martin, Kevin
2017-05-01
This paper describes a technique in which we improve upon the prior performance of the Rapid Serial Visual Presentation (RSVP) EEG paradigm for image classification through the insertion of visual attention distracters and overall sequence reordering based upon the expected ratio of rare to common "events" in the environment and operational context. Inserting distracter images maintains the ratio of common events to rare events at an ideal level, maximizing the rare event detection via P300 EEG response to the RSVP stimuli. The method has two steps: first, we compute the optimal number of distracters needed for an RSVP stimulus sequence based on the desired sequence length and expected number of targets and insert the distracters into the RSVP sequence, and then we reorder the RSVP sequence to maximize P300 detection. We show that by reducing the ratio of target events to nontarget events using this method, we can allow RSVP sequences with more targets without sacrificing area under the ROC curve (Az).
Methods for decoding Cas9 protospacer adjacent motif (PAM) sequences: A brief overview.
Karvelis, Tautvydas; Gasiunas, Giedrius; Siksnys, Virginijus
2017-05-15
Recently, Cas9, an RNA-guided DNA endonuclease, emerged as a powerful tool for targeted genome manipulations. The Cas9 protein can be reprogrammed to cleave, bind or nick any DNA target by simply changing the crRNA sequence; however, a short nucleotide sequence, termed the PAM, is required to initiate crRNA hybridization to the DNA target. The PAM sequence is recognized by the Cas9 protein and must be determined experimentally for each Cas9 variant. Exploration of Cas9 orthologs could offer a diversity of PAM sequences and novel biochemical properties that may be beneficial for genome editing applications. Here we briefly review and compare Cas9 PAM identification assays that can be adopted for other PAM-dependent CRISPR-Cas systems. Copyright © 2017 Elsevier Inc. All rights reserved.
Josephs, Eric A.; Kocak, D. Dewran; Fitzgibbon, Christopher J.; McMenemy, Joshua; Gersbach, Charles A.; Marszalek, Piotr E.
2015-01-01
CRISPR-associated endonuclease Cas9 cuts DNA at variable target sites designated by a Cas9-bound RNA molecule. Cas9's ability to be directed by single ‘guide RNA’ molecules to target nearly any sequence has been recently exploited for a number of emerging biological and medical applications. Therefore, understanding the nature of Cas9's off-target activity is of paramount importance for its practical use. Using atomic force microscopy (AFM), we directly resolve individual Cas9 and nuclease-inactive dCas9 proteins as they bind along engineered DNA substrates. High-resolution imaging allows us to determine their relative propensities to bind with different guide RNA variants to targeted or off-target sequences. Mapping the structural properties of Cas9 and dCas9 to their respective binding sites reveals a progressive conformational transformation at DNA sites with increasing sequence similarity to its target. With kinetic Monte Carlo (KMC) simulations, these results provide evidence of a ‘conformational gating’ mechanism driven by the interactions between the guide RNA and the 14th–17th nucleotide region of the targeted DNA, the stabilities of which we find correlate significantly with reported off-target cleavage rates. KMC simulations also reveal potential methodologies to engineer guide RNA sequences with improved specificity by considering the invasion of guide RNAs into targeted DNA duplex. PMID:26384421
Droege, Marcus; Hill, Brendon
2008-08-31
The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven application categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research: it has recently been used to re-sequence the genome of an individual human and to detect low-frequency somatic mutations linked to cancer, and it is currently being used to re-sequence the complete human exome and targeted genomic regions using the NimbleGen sequence capture process.
'2A-Like' Signal Sequences Mediating Translational Recoding: A Novel Form of Dual Protein Targeting.
Roulston, Claire; Luke, Garry A; de Felipe, Pablo; Ruan, Lin; Cope, Jonathan; Nicholson, John; Sukhodub, Andriy; Tilsner, Jens; Ryan, Martin D
2016-08-01
We report the initial characterization of an N-terminal oligopeptide '2A-like' sequence that is able to function both as a signal sequence and as a translational recoding element. Owing to this translational recoding activity, two forms of nascent polypeptide are synthesized: (i) when 2A-mediated translational recoding has not occurred: the nascent polypeptide is fused to the 2A-like N-terminal signal sequence and the fusion translation product is targeted to the exocytic pathway, and, (ii) a translation product where 2A-mediated translational recoding has occurred: the 2A-like signal sequence is synthesized as a separate translation product and, therefore, the nascent (downstream) polypeptide lacks the 2A-like signal sequence and is localized to the cytoplasm. This type of dual-functional signal sequence results, therefore, in the partitioning of the translation products between the two sub-cellular sites and represents a newly described form of dual protein targeting. © 2016 The Authors. Traffic published by John Wiley & Sons Ltd.
Grossmann, Sebastian; Nowak, Piotr; Neogi, Ujjwal
2015-01-01
HIV-1 near full-length genome (HIV-NFLG) sequencing from plasma is an attractive multidimensional tool to apply in large-scale population-based molecular epidemiological studies. It also enables genotypic resistance testing (GRT) for all drug target sites allowing effective intervention strategies for control and prevention in high-risk population groups. Thus, the main objective of this study was to develop a simplified subtype-independent, cost- and labour-efficient HIV-NFLG protocol that can be used in clinical management as well as in molecular epidemiological studies. Plasma samples (n=30) were obtained from HIV-1B (n=10), HIV-1C (n=10), CRF01_AE (n=5) and CRF01_AG (n=5) infected individuals with minimum viral load >1120 copies/ml. The amplification was performed with two large amplicons of 5.5 kb and 3.7 kb, sequenced with 17 primers to obtain HIV-NFLG. GRT was validated against ViroSeq™ HIV-1 Genotyping System. After excluding four plasma samples with low-quality RNA, a total of 26 samples were attempted. Among them, NFLG was obtained from 24 (92%) samples with the lowest viral load being 3000 copies/ml. High (>99%) concordance was observed between HIV-NFLG and ViroSeq™ when determining the drug resistance mutations (DRMs). The N384I connection mutation was additionally detected by NFLG in two samples. Our high efficiency subtype-independent HIV-NFLG is a simple and promising approach to be used in large-scale molecular epidemiological studies. It will facilitate the understanding of the HIV-1 pandemic population dynamics and outline effective intervention strategies. Furthermore, it can potentially be applicable in clinical management of drug resistance by evaluating DRMs against all available antiretrovirals in a single assay.
Pan, Lang; Gao, Haitao; Xia, Wenwen; Zhang, Teng; Dong, Liyao
2016-03-01
Non-target site resistance (NTSR) to herbicides is an increasing concern for weed control. Metabolic herbicide resistance is an important mechanism for NTSR. However, little is known about metabolic resistance at the genetic level. In this study, we have identified three fenoxaprop-P-ethyl-resistant American sloughgrass (Beckmannia syzigachne Steud.) populations, in which the molecular basis for NTSR remains unclear. To reveal the mechanisms of metabolic resistance, the genes likely to be involved in herbicide metabolism (e.g. for cytochrome P450s, esterases, hydrolases, oxidases, peroxidases, glutathione S-transferases, glycosyltransferases, and transporter proteins) were isolated using transcriptome sequencing, in combination with RT-PCR (reverse transcription-PCR) and RACE (rapid amplification of cDNA ends). Consequently, we established a herbicide-metabolizing enzyme library containing at least 332 genes, and each of these genes was cloned and the sequence and the expression level compared between the fenoxaprop-P-ethyl-resistant and susceptible populations. Fifteen metabolic enzyme genes were found to be possibly involved in fenoxaprop-P-ethyl resistance. In addition, we found five metabolizing enzyme genes that have a different gene sequence in plants of susceptible versus resistant B. syzigachne populations. These genes may be major candidates for herbicide metabolic resistance. This established metabolic enzyme library represents an important step forward towards a better understanding of herbicide metabolism and metabolic resistance in this and possibly other closely related weed species. This new information may help to understand weed metabolic resistance and to develop novel strategies of weed management. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved.
2009-01-01
Background Polymerase chain reaction (PCR) is very useful in many areas of molecular biology research. It is commonly observed that PCR success is critically dependent on design of an effective primer pair. Current tools for primer design do not adequately address the problem of PCR failure due to mis-priming on target-related sequences and structural variations in the genome. Methods We have developed an integrated graphical web-based application for primer design, called RExPrimer, which was written in Python language. The software uses Primer3 as the primer designing core algorithm. Locally stored sequence information and genomic variant information were hosted on MySQLv5.0 and were incorporated into RExPrimer. Results RExPrimer provides many functionalities for improved PCR primer design. Several databases, namely annotated human SNP databases, insertion/deletion (indel) polymorphisms database, pseudogene database, and structural genomic variation databases were integrated into RExPrimer, enabling an effective without-leaving-the-website validation of the resulting primers. By incorporating these databases, the primers reported by RExPrimer avoid mis-priming to related sequences (e.g. pseudogene, segmental duplication) as well as possible PCR failure because of structural polymorphisms (SNP, indel, and copy number variation (CNV)). To prevent mismatching caused by unexpected SNPs in the designed primers, in particular the 3' end (SNP-in-Primer), several SNP databases covering the broad range of population-specific SNP information are utilized to report SNPs present in the primer sequences. Population-specific SNP information also helps customize primer design for a specific population. Furthermore, RExPrimer offers a graphical user-friendly interface through the use of scalable vector graphic image that intuitively presents resulting primers along with the corresponding gene structure. 
In this study, we demonstrated the program's effectiveness in successfully generating primers for strongly homologous sequences. Conclusion The improvements for primer design incorporated into RExPrimer were demonstrated to be effective in designing primers for challenging PCR experiments. Integration of SNP and structural variation databases allows for robust primer design for a variety of PCR applications, irrespective of the sequence complexity in the region of interest. This software is freely available at http://www4a.biotec.or.th/rexprimer. PMID:19958502
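The SNP-in-Primer check described in this abstract can be illustrated with a small sketch. It is not RExPrimer's code and does not use Primer3. The function name, the 0-based coordinate convention and the 5-base 3' window are assumptions made for illustration. The sketch reports which known SNP positions fall inside a primer's binding site and whether each lies near the 3' end, where a mismatch is most likely to impair extension.

```python
def snps_in_primer(primer_start, primer_len, snp_positions,
                   strand="+", three_prime_window=5):
    """Return (position, near_3prime_end) for each SNP inside the primer site.

    Coordinates are 0-based on the forward strand (assumed convention).
    For a '+' primer the 3' end is the highest coordinate; for a '-' primer
    it is the lowest coordinate of the binding site.
    """
    primer_end = primer_start + primer_len  # exclusive end
    report = []
    for pos in sorted(snp_positions):
        if primer_start <= pos < primer_end:
            if strand == "+":
                distance = primer_end - 1 - pos
            else:
                distance = pos - primer_start
            report.append((pos, distance < three_prime_window))
    return report


# Hypothetical 20-nt primer at 100..119 and four known SNP positions.
forward = snps_in_primer(100, 20, [95, 103, 118, 125])
reverse = snps_in_primer(100, 20, [95, 103, 118, 125], strand="-")
```

A design tool could reject or penalize primers with any SNP flagged as near the 3' end, and merely report the others.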
St. John, Elizabeth P.; Simen, Birgitte B.; Turenchalk, Gregory S.; Braverman, Michael S.; Abbate, Isabella; Aerssens, Jeroen; Bouchez, Olivier; Gabriel, Christian; Izopet, Jacques; Meixenberger, Karolin; Di Giallonardo, Francesca; Schlapbach, Ralph; Paredes, Roger; Sakwa, James; Schmitz-Agheguian, Gudrun G.; Thielen, Alexander; Victor, Martin
2016-01-01
Background Ultra deep sequencing is of increasing use not only in research but also in diagnostics. For implementation of ultra deep sequencing assays in clinical laboratories for routine diagnostics, intra- and inter-laboratory testing are of the utmost importance. Methods A multicenter study was conducted to validate an updated assay design for 454 Life Sciences’ GS FLX Titanium system targeting protease/reverse transcriptase (RTP) and env (V3) regions to identify HIV-1 drug-resistance mutations and determine co-receptor use with high sensitivity. The study included 30 HIV-1 subtype B and 6 subtype non-B samples with viral titers (VT) of 3,940–447,400 copies/mL, two dilution series (52,129–1,340 and 25,130–734 copies/mL), and triplicate samples. Amplicons spanning PR codons 10–99, RT codons 1–251 and the entire V3 region were generated using barcoded primers. Analysis was performed using the GS Amplicon Variant Analyzer and geno2pheno for tropism. For comparison, population sequencing was performed using the ViroSeq HIV-1 genotyping system. Results The median sequencing depth across the 11 sites was 1,829 reads per position for RTP (IQR 592–3,488) and 2,410 for V3 (IQR 786–3,695). 10 preselected drug resistant variants were measured across sites and showed high inter-laboratory correlation across all sites with data (P<0.001). The triplicate samples of a plasmid mixture confirmed the high inter-laboratory consistency (mean% ± stdev: 4.6 ±0.5, 4.8 ±0.4, 4.9 ±0.3) and revealed good intra-laboratory consistency (mean% range ± stdev range: 4.2–5.2 ± 0.04–0.65). In the two dilutions series, no variants >20% were missed, variants 2–10% were detected at most sites (even at low VT), and variants 1–2% were detected by some sites. All mutations detected by population sequencing were also detected by UDS. 
Conclusions This assay design results in an accurate and reproducible approach to analyze HIV-1 mutant spectra, even at variant frequencies well below those routinely detectable by population sequencing. PMID:26756901
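The relationship between sequencing depth and the detectability of low-frequency variants can be made concrete with a simple binomial calculation. This is an idealized sketch, not part of the study's analysis. It assumes independent reads, no sequencing error and no amplification bias, and the minimum read count of 5 is a hypothetical calling threshold. The depth of 1,829 is the median RTP depth reported in the abstract.

```python
import math


def detection_probability(depth, frequency, min_reads):
    """P(at least `min_reads` variant reads) under a Binomial(depth, frequency) model."""
    miss = sum(
        math.comb(depth, i) * frequency ** i * (1 - frequency) ** (depth - i)
        for i in range(min_reads)
    )
    return 1 - miss


# A 1% variant at the reported median depth versus at a much shallower depth.
p_deep = detection_probability(1829, 0.01, 5)
p_shallow = detection_probability(100, 0.01, 5)
```

Under these assumptions a 1% variant is almost always supported by five or more reads at a depth of 1,829 and only rarely at a depth of 100. This is consistent with the abstract's observation that variants at 1–2% were detected by only some sites.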
Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu
2014-01-01
Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system, when combined with bone marrow transplantation, could quickly knock down the c-Kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways. PMID:25475013
Gao, M L; Zhong, X M; Ma, X; Ning, H J; Zhu, D; Zou, J Z
2016-06-02
To make a genetic diagnosis of Alagille syndrome (ALGS) in patients using target gene sequence capture and next-generation sequencing technology. Target gene sequence capture and next-generation sequencing were used to analyze ALGS genes in 4 patients who were hospitalized at the Affiliated Hospital, Capital Institute of Pediatrics between January 2014 and December 2015; 2 had a clinical diagnosis of typical ALGS and 2 of atypical ALGS. Blood samples were collected from the patients and their parents, and genomic DNA was extracted from lymphocytes. Target gene sequence capture and next-generation sequencing were performed. Sanger sequencing was used to confirm the results in the patients and their parents. Cholestasis, heart defects, inverted triangular face and butterfly vertebrae were the main clinical features in the 4 male patients. Age at first hospital visit ranged from 3 months and 14 days to 3 years and 1 month. Age at onset ranged from 3 days to 42 days (median 23 days). According to the clinical diagnostic criteria of ALGS, patient 1 and patient 2 were considered to have typical ALGS. The other 2 patients were considered to have atypical ALGS. Four Jagged 1 (JAG1) pathogenic mutations were detected. Three different missense mutations were detected in patients 1 to 3 (c.839C>T (p.W280X), c.703G>A (p.R235X), c.1720C>T (p.V574M)). The JAG1 mutation of patient 3 was reported for the first time. Patient 4 had one novel insertion mutation (c.1779_1780insA (p.Ile594AsnfsTer23)). Parental analysis verified that the JAG1 missense mutations of 3 patients were de novo. The results of Sanger sequencing were consistent with those of next-generation sequencing. Target gene sequence capture combined with next-generation sequencing can detect the two pathogenic genes of ALGS and simultaneously test genes of other related infantile cholestatic diseases, offering high throughput, high efficiency and low cost.
It may support molecular diagnosis and treatment decisions for clinicians and has good prospects for clinical application.
Lewandowska, Dagmara W; Zagordi, Osvaldo; Geissberger, Fabienne-Desirée; Kufner, Verena; Schmutz, Stefan; Böni, Jürg; Metzner, Karin J; Trkola, Alexandra; Huber, Michael
2017-08-08
Sequence-specific PCR is the most common approach for virus identification in diagnostic laboratories. However, as specific PCR only detects pre-defined targets, novel virus strains or viruses not included in routine test panels will be missed. Recently, advances in high-throughput sequencing allow for virus-sequence-independent identification of entire virus populations in clinical samples, yet standardized protocols are needed to allow broad application in clinical diagnostics. Here, we describe a comprehensive sample preparation protocol for high-throughput metagenomic virus sequencing using random amplification of total nucleic acids from clinical samples. In order to optimize metagenomic sequencing for application in virus diagnostics, we tested different enrichment and amplification procedures on plasma samples spiked with RNA and DNA viruses. A protocol including filtration, nuclease digestion, and random amplification of RNA and DNA in separate reactions provided the best results, allowing reliable recovery of viral genomes and a good correlation of the relative number of sequencing reads with the virus input. We further validated our method by sequencing a multiplexed viral pathogen reagent containing a range of human viruses from different virus families. Our method proved successful in detecting the majority of the included viruses with high read numbers and compared well to other protocols in the field validated against the same reference reagent. Our sequencing protocol works not only with plasma but also with other clinical samples such as urine and throat swabs. The workflow for virus metagenomic sequencing that we established proved successful in detecting a variety of viruses in different clinical samples. Our protocol supplements existing virus-specific detection strategies, providing opportunities to identify atypical and novel viruses commonly not accounted for in routine diagnostic panels.
Norman, Paul J.; Norberg, Steven J.; Guethlein, Lisbeth A.; Nemat-Gorgani, Neda; Royce, Thomas; Wroblewski, Emily E.; Dunn, Tamsen; Mann, Tobias; Alicata, Claudia; Hollenbach, Jill A.; Chang, Weihua; Shults Won, Melissa; Gunderson, Kevin L.; Abi-Rached, Laurent; Ronaghi, Mostafa; Parham, Peter
2017-01-01
The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. PMID:28360230
Inferential comprehension of 3-6 year olds within the context of story grammar: a scoping review.
Filiatrault-Veilleux, Paméla; Bouchard, Caroline; Trudeau, Natacha; Desmarais, Chantal
2015-01-01
The ability to make inferences plays a crucial role in reading comprehension and the educational success of school-aged children. However, it starts to unfold much earlier than school entry and literacy. Given that it is likely to be targeted in speech language therapy, it would be useful for clinicians to have access to information about a developmental sequence of inferential comprehension. Yet, at this time, there is no clear proposition of the way in which this ability develops in young children prior to school entry. To reduce the knowledge gap with regards to inferential comprehension in young children by conducting a scoping review of the literature. The two objectives of this research are: (1) to describe typically developing children's comprehension of causal inferences targeting elements of story grammar, with the goal of proposing milestones in the development of this ability; and (2) to highlight key elements of the methodology used to gather this information in a paediatric population. A total of 16 studies from six databases that met the inclusion criteria were qualitatively analysed in the context of a scoping review. This methodological approach was used to identify common themes and gaps in the knowledge base to achieve the intended objectives. Results permit the description of key elements in the development of six types of causal inference targeting elements of story grammar in children between 3 and 6 years old. Results also demonstrate the various methods used to assess this ability in young children and highlight particularly interesting procedures for use with this younger population. These findings point to the need for additional studies to understand this ability better and to develop strategies to stimulate an evidence-based developmental sequence in children from an early age. © 2015 The Authors. 
International Journal of Language & Communication Disorders published by John Wiley & Sons Ltd on behalf of Royal College of Speech and Language Therapists.
Shin, Jeong Hong; Jung, Soobin; Ramakrishna, Suresh; Kim, Hyongbum Henry; Lee, Junwon
2018-07-07
Genome editing technology using programmable nucleases has rapidly evolved in recent years. The primary mechanism to achieve precise integration of a transgene is mainly based on homology-directed repair (HDR). However, an HDR-based genome-editing approach is less efficient than non-homologous end-joining (NHEJ). Recently, a microhomology-mediated end-joining (MMEJ)-based transgene integration approach was developed, showing feasibility both in vitro and in vivo. We expanded this method to achieve targeted sequence substitution (TSS) of mutated sequences with normal sequences using double-guide RNAs (gRNAs), and a donor template flanking the microhomologies and target sequence of the gRNAs in vitro and in vivo. Our method could realize more efficient sequence substitution than the HDR-based method in vitro using a reporter cell line, and led to the survival of a hereditary tyrosinemia mouse model in vivo. The proposed MMEJ-based TSS approach could provide a novel therapeutic strategy, in addition to HDR, to achieve gene correction from a mutated sequence to a normal sequence. Copyright © 2018 Elsevier Inc. All rights reserved.
Identification of distant drug off-targets by direct superposition of binding pocket surfaces.
Schumann, Marcel; Armen, Roger S
2013-01-01
Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target ("distant off-targets"). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target ("distant off-target").
Ramos, Enrique; Levinson, Benjamin T; Chasnoff, Sara; Hughes, Andrew; Young, Andrew L; Thornton, Katherine; Li, Allie; Vallania, Francesco L M; Province, Michael; Druley, Todd E
2012-12-06
Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign. We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA eliminating the need for PCR to incorporate indexes, and was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. 
These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22-48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome. This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
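The index design constraint mentioned above (7 base pair indexes with a pairwise Hamming distance of ≥2) can be checked with a few lines of Python. This is a minimal sketch: the index strings are hypothetical placeholders, not the authors' adapter sequences, and the function names are illustrative.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hamming(indexes):
    """Smallest Hamming distance over all pairs in an index set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 7-bp indexes, for illustration only.
EXAMPLE_INDEXES = ["ACGTACG", "ACGTTGG", "TTGTACC", "GGATCCA"]

if __name__ == "__main__":
    d = min_pairwise_hamming(EXAMPLE_INDEXES)
    print("minimum pairwise Hamming distance:", d)
    print("meets the >=2 design constraint:", d >= 2)
```

With a minimum distance of 2, a single substitution error in an index read cannot turn one valid index into another, so such reads can be detected and discarded, although not corrected.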
Target Site Recognition by a Diversity-Generating Retroelement
Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.
2011-01-01
Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
Spatial serial order processing in schizophrenia.
Fraser, David; Park, Sohee; Clark, Gina; Yohanna, Daniel; Houk, James C
2004-10-01
The aim of this study was to examine serial order processing deficits in 21 schizophrenia patients and 16 age- and education-matched healthy controls. In a spatial serial order working memory task, one to four spatial targets were presented in a randomized sequence. Subjects were required to remember the locations and the order in which the targets were presented. Patients showed a marked deficit in ability to remember the sequences compared with controls. Increasing the number of targets within a sequence resulted in poorer memory performance for both control and schizophrenia subjects, but the effect was much more pronounced in the patients. Targets presented at the end of a long sequence were more vulnerable to memory error in schizophrenia patients. Performance deficits were not attributable to motor errors, but to errors in target choice. The results support the idea that the memory errors seen in schizophrenia patients may be due to saturating the working memory network at relatively low levels of memory load.
Streeter, Jason E.; Gessner, Ryan; Miles, Iman; Dayton, Paul A.
2010-01-01
Molecular imaging with ultrasound relies on microbubble contrast agents (MCAs) selectively adhering to a ligand-specific target. Prior studies have shown that only small quantities of microbubbles are retained at their target sites; therefore, enhancing contrast sensitivity to low concentrations of microbubbles is essential to improve molecular imaging techniques. In order to assess the effect of MCA diameter on imaging sensitivity, perfusion and molecular imaging studies were performed with microbubbles of varying size distributions. To assess signal improvement and MCA circulation time as a function of size and concentration, blood perfusion was imaged in rat kidneys using nontargeted size-sorted MCAs with a Siemens Sequoia ultrasound system (Siemens, Mountain View, CA) in cadence pulse sequencing (CPS) mode. Molecular imaging sensitivity improvements were studied with size-sorted αvβ3-targeted bubbles in both fibrosarcoma and R3230 rat tumor models. In perfusion imaging studies, video intensity and contrast persistence were ≈8 times and ≈3 times greater, respectively, for “sorted 3-micron” MCAs (diameter, 3.3 ± 1.95 μm) when compared to “unsorted” MCAs (diameter, 0.9 ± 0.45 μm) at low concentrations. In targeted experiments, application of sorted 3-micron MCAs resulted in a ≈20 times video intensity increase over unsorted populations. Tailoring size distributions results in substantial imaging sensitivity improvement over unsorted populations, which is essential in maximizing sensitivity to small numbers of MCAs for molecular imaging. PMID:20236606
Zhang, Jimmy F; James, Francis; Shukla, Anju; Girisha, Katta M; Paciorkowski, Alex R
2017-06-27
We built India Allele Finder, an online searchable database and command line tool, that gives researchers access to variant frequencies of Indian Telugu individuals, using publicly available fastq data from the 1000 Genomes Project. Access to appropriate population-based genomic variant annotation can accelerate the interpretation of genomic sequencing data. In particular, exome analysis of individuals of Indian descent will identify population variants not reflected in European exomes, complicating genomic analysis for such individuals. India Allele Finder offers improved ease-of-use to investigators seeking to identify and annotate sequencing data from Indian populations. We describe the use of India Allele Finder to identify common population variants in a disease quartet whole exome dataset, reducing the number of candidate single nucleotide variants from 84 to 7. India Allele Finder is freely available to investigators to annotate genomic sequencing data from Indian populations. Use of India Allele Finder allows efficient identification of population variants in genomic sequencing data, and is an example of a population-specific annotation tool that simplifies analysis and encourages international collaboration in genomics research.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.
Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A
1993-01-01
The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and it also shows that this protein can further activate transcription in cells in culture. PMID:7909943
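The consensus A, A/T, T, A/T, A, T, A/G reported above can be written as a regular expression and used to scan a DNA sequence for candidate sites. This is a minimal sketch that assumes forward-strand matching only and treats the consensus as a strict pattern; the function name and the test sequence are illustrative and not taken from the study.

```python
import re

# Consensus A, A/T, T, A/T, A, T, A/G as a regular expression.
# The lookahead makes the scan report overlapping matches as well.
CDXA_CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")


def find_cdxa_sites(seq: str):
    """Return (0-based start, matched 7-mer) for each consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CDXA_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Made-up sequence for illustration only.
    print(find_cdxa_sites("GGATTTATGCC"))
```

A strict pattern match says nothing about binding affinity, which the abstract reports as varying between target sequences; a position weight matrix would be the usual next step when graded scores are needed.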
Ashman, Tia-Lynn; Tennessen, Jacob A; Dalton, Rebecca M; Govindarajulu, Rajanikanth; Koski, Matthew H; Liston, Aaron
2015-10-19
Gynodioecy, the coexistence of females and hermaphrodites, occurs in 20% of angiosperm families and often enables transitions between hermaphroditism and dioecy. Clarifying mechanisms of sex determination in gynodioecious species can thus illuminate sexual system evolution. Genetic determination of gynodioecy, however, can be complex and is not fully characterized in any wild species. We used targeted sequence capture to genetically map a novel nuclear contributor to male sterility in a self-pollinated hermaphrodite of Fragaria vesca subsp. bracteata from the southern portion of its range. To understand its interaction with another identified locus and possibly additional loci, we performed crosses within and between two populations separated by 2000 km, phenotyped the progeny and sequenced candidate markers at both sex-determining loci. The newly mapped locus contains a high density of pentatricopeptide repeat genes, a class commonly involved in restoration of fertility caused by cytoplasmic male sterility. Examination of all crosses revealed three unlinked epistatically interacting loci that determine sexual phenotype and vary in frequency between populations. Fragaria vesca subsp. bracteata represents the first wild gynodioecious species with genomic evidence of both cytoplasmic and nuclear genes in sex determination. We propose a model for the interactions between these loci and new hypotheses for the evolution of sex determining chromosomes in the subdioecious and dioecious Fragaria. Copyright © 2015 Ashman et al.
Individual microRNAs (miRNAs) display distinct mRNA targeting "rules".
Wang, Wang-Xia; Wilfred, Bernard R; Xie, Kevin; Jennings, Mary H; Hu, Yanling Hu; Stromberg, Arnold J; Nelson, Peter T
2010-01-01
MicroRNAs (miRNAs) guide Argonaute (AGO)-containing microribonucleoprotein (miRNP) complexes to target mRNAs. It has been assumed that miRNAs behave similarly to each other with regard to mRNA target recognition. The usual assumptions, which are based on prior studies, are that miRNAs preferentially target sequences in the 3'UTR of mRNAs, guided by the 5' "seed" portion of the miRNAs. Here we isolated AGO- and miRNA-containing miRNPs from human H4 tumor cells by co-immunoprecipitation (co-IP) with anti-AGO antibody. Cells were transfected with miR-107, miR-124, miR-128, miR-320, or a negative control miRNA. Co-IPed RNAs were subjected to downstream high-density Affymetrix Human Gene 1.0 ST microarray analyses using an assay we validated previously, a "RIP-Chip" experimental design. RIP-Chip data provided a list of mRNAs recruited into the AGO-miRNP in correlation to each miRNA. These experimentally identified miRNA targets were analyzed for complementary six nucleotide "seed" sequences within the transfected miRNAs. We found that miR-124 targets tended to have sequences in the 3'UTR that would be recognized by the 5' seed of miR-124, as described in previous studies. By contrast, miR-107 targets tended to have "seed" sequences in the mRNA open reading frame, but not the 3'UTR. Further, mRNA targets of miR-128 and miR-320 are less enriched for 6-mer seed sequences in comparison to miR-107 and miR-124. In sum, our data support the importance of the 5' seed in determining binding characteristics for some miRNAs; however, the "binding rules" are complex, and individual miRNAs can have distinct sequence determinants that lead to mRNA targeting.
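The 6-mer seed analysis described above can be illustrated with a short Python sketch. It assumes the common convention that the seed is miRNA nucleotides 2 to 7 and that a target site is the reverse complement of that seed; the abstract does not state which positions the authors used. The miRNA string and the mRNA region below are example inputs only and should be replaced with sequences from a curated database.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 2, length: int = 6) -> str:
    """Reverse complement of the seed (1-based positions start..start+length-1)."""
    mirna = mirna.upper().replace("T", "U")
    seed = mirna[start - 1:start - 1 + length]
    return seed.translate(COMPLEMENT)[::-1]


def count_seed_sites(mirna: str, region: str) -> int:
    """Count (possibly overlapping) seed-complementary sites in an mRNA region."""
    region = region.upper().replace("T", "U")
    site = seed_site(mirna)
    count, pos = 0, region.find(site)
    while pos != -1:
        count += 1
        pos = region.find(site, pos + 1)
    return count


if __name__ == "__main__":
    example_mirna = "UAAGGCACGCGGUGAAUGCC"  # example input, verify before use
    example_region = "AAGUGCCUUAAUGCCUUG"   # made-up mRNA fragment
    print(seed_site(example_mirna))
    print(count_seed_sites(example_mirna, example_region))
```

Running the same count separately on the 3'UTR and the open reading frame of each recruited mRNA is one simple way to compare where seed-complementary sites are enriched.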
Yasmin, T; Nabi, A H M Nurun
2016-05-01
Ebola virus (EBV) has become a serious threat to public health. Different approaches were applied to predict continuous and discontinuous B cell epitopes as well as T cell epitopes from the sequence-based and available three-dimensional structural analyses of each protein of EBV. Peptides '(79) VPSATKRWGFRSGVPP(94) ' from GP1 and '(515) LHYWTTQDEGAAIGLA(530) ' from GP2 of Ebola were found to be the consensus peptidic sequences predicted as linear B cell epitope of which the latter contains a region (519) TTQDEG(524) that fulfilled all the criteria of accessibility, hydrophilicity, flexibility and beta turn region for becoming an ideal B cell epitope. Different nonamers as T cell epitopes were obtained that interacted with different numbers of MHC class I and class II alleles with a binding affinity of <100 nm. Interestingly, these alleles also bound to the MHC class I alleles mostly prevalent in African and South Asian regions. Of these, 'LANETTQAL' and 'FLYDRLAST' nonamers were predicted to be the most potent T cell epitopes and they, respectively, interacted with eight and twelve class I alleles that covered 63.79% and 54.16% of world population, respectively. These nonamers were found to be the core sequences of 15mer peptides that interacted with the most common class II allele, HLA-DRB1*01:01. They were further validated for their binding to specific class I alleles using docking technique. Thus, these predicted epitopes may be used as vaccine targets against EBV and can be validated in model hosts to verify their efficacy as vaccine. © 2016 The Foundation for the Scandinavian Journal of Immunology.
Mutational landscape of gastric adenocarcinoma in Chinese: implications for prognosis and therapy.
Chen, Kexin; Yang, Da; Li, Xiangchun; Sun, Baocun; Song, Fengju; Cao, Wenfeng; Brat, Daniel J; Gao, Zhibo; Li, Haixin; Liang, Han; Zhao, Yanrui; Zheng, Hong; Li, Miao; Buckner, Jan; Patterson, Scott D; Ye, Xiang; Reinhard, Christoph; Bhathena, Anahita; Joshi, Deepa; Mischel, Paul S; Croce, Carlo M; Wang, Yi Michael; Raghavakaimal, Sreekumar; Li, Hui; Lu, Xin; Pan, Yang; Chang, Han; Ba, Sujuan; Luo, Longhai; Cavenee, Webster K; Zhang, Wei; Hao, Xishan
2015-01-27
Gastric cancer (GC) is a highly heterogeneous disease. To identify potential clinically actionable therapeutic targets that may inform individualized treatment strategies, we performed whole-exome sequencing on 78 GCs of differing histologies and anatomic locations, as well as whole-genome sequencing on two GC cases, each with three primary tumors and two matching lymph node metastases. The data showed two distinct GC subtypes with either high-clonality (HiC) or low-clonality (LoC). The HiC subtype of intratumoral heterogeneity was associated with older age, TP53 (tumor protein P53) mutation, enriched C > G transition, and significantly shorter survival, whereas the LoC subtype was associated with younger age, ARID1A (AT rich interactive domain 1A) mutation, and significantly longer survival. Phylogenetic tree analysis of whole-genome sequencing data from multiple samples of two patients supported the clonal evolution of GC metastasis and revealed the accumulation of genetic defects that necessitate combination therapeutics. The most recurrently mutated genes, which were validated in a separate cohort of 216 cases by targeted sequencing, were members of the homologous recombination DNA repair, Wnt, and PI3K-ERBB pathways. Notably, the druggable NRG1 (neuregulin-1) and ERBB4 (V-Erb-B2 avian erythroblastic leukemia viral oncogene homolog 4) ligand-receptor pair was mutated in 10% of GC cases. Mutations of the BRCA2 (breast cancer 2, early onset) gene, found in 8% of our cohort and validated in The Cancer Genome Atlas GC cohort, were associated with significantly longer survival. These data define distinct clinicogenetic forms of GC in the Chinese population that are characterized by specific mutation sets that can be investigated for efficacy of single and combination therapies.
Contribution of Peptide Backbone to Anti-Citrullinated Peptide Antibody Reactivity
Trier, Nicole Hartwig; Dam, Catharina Essendrup; Olsen, Dorthe Tange; Hansen, Paul Robert; Houen, Gunnar
2015-01-01
Rheumatoid arthritis (RA) is one of the most common autoimmune diseases, affecting approximately 1–2% of the world population. One of the characteristic features of RA is the presence of autoantibodies. Especially the highly specific anti-citrullinated peptide antibodies (ACPAs), which have been found in up to 70% of RA patients’ sera, have received much attention. Several citrullinated proteins are associated with RA, suggesting that ACPAs may react with different sequence patterns, separating them from traditional antibodies, whose reactivity usually is specific towards a single target. As ACPAs have been suggested to be involved in the development of RA, knowledge about these antibodies may be crucial. In this study, we examined the influence of peptide backbone for ACPA reactivity in immunoassays. The antibodies were found to be reactive with a central Cit-Gly motif being essential for ACPA reactivity and to be cross-reactive between the selected citrullinated peptides. The remaining amino acids within the citrullinated peptides were found to be of less importance for antibody reactivity. Moreover, these findings indicated that the Cit-Gly motif in combination with peptide backbone is essential for antibody reactivity. Based on these findings it was speculated that any amino acid sequence, which brings the peptide into a properly folded structure for antibody recognition is sufficient for antibody reactivity. These findings are in accordance with the current hypothesis that structural homology rather than sequence homology are favored between citrullinated epitopes. These findings are important in relation to clarifying the etiology of RA and to determine the nature of ACPAs, e.g. why some Cit-Gly-containing sequences are not targeted by ACPAs. PMID:26657009
Unlocking hidden genomic sequence
Keith, Jonathan M.; Cochran, Duncan A. E.; Lala, Gita H.; Adams, Peter; Bryant, Darryn; Mitchelson, Keith R.
2004-01-01
Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs. PMID:14973330
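The idea that the original sequence information is distributed among multiple mutated variants can be illustrated with a simple majority vote. This is a sketch under strong assumptions, not the reconstruction method of the paper, which the abstract does not describe: the variants are taken to be already aligned, of equal length, and to carry substitutions only. The toy variants below are made up.

```python
from collections import Counter


def majority_consensus(variants):
    """Per-column majority vote over aligned, equal-length variant reads."""
    if not variants:
        raise ValueError("at least one variant is required")
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("variants must be aligned to the same length")
    # zip(*variants) yields one column of bases per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*variants)
    )


if __name__ == "__main__":
    # Toy variants of a poly(A) stretch, each with one random substitution.
    toy_variants = ["AAAAGAAA", "AATAAAAA", "AAAAAAAC", "CAAAAAAA"]
    print(majority_consensus(toy_variants))
```

Because independent random mutations rarely hit the same position in most variants, the majority base at each column tends to be the original one, provided enough variants are available for the mutation level used.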
2009-01-01
Background: The oomycete Aphanomyces astaci is regarded as the causative agent of crayfish plague and represents an evident hazard for European crayfish species. Native crayfish populations infected with this pathogen suffer up to 100% mortality. The existence of multiple transmission paths necessitates the development of a reliable, robust and efficient test to detect the pathogen. Currently, A. astaci is diagnosed by a PCR-based assay that suffers from cross-reactivity to other species. We developed an alternative closed-tube assay for A. astaci, which achieves robustness through simultaneous amplification of multiple functionally constrained genes. Results: Two novel constitutively expressed members of the glycosyl hydrolase (GH18) gene family of chitinases were isolated from the A. astaci strain Gb04. The primary amino acid sequence of these chitinase genes, termed CHI2 and CHI3, is composed of an N-terminal signal peptide directing the post-translational transport of the protein into the extracellular space, the catalytic GH18 domain, a proline-, serine-, and threonine-rich domain and a C-terminal cysteine-rich putative chitin-binding site. The A. astaci mycelium grown in a peptone-glucose medium showed significant temporal changes in steady-state CHI2 and CHI3 mRNA amounts, indicating functional constraint. Their different temporal occurrence, with maxima at 48 and 24 hours of incubation for CHI2 and CHI3, respectively, is in accordance with the multifunctionality of GH18 family members. To identify A. astaci-specific primer target sites in these novel genes, we determined the partial sequence homologs in the related oomycetes A. frigidophilus, A. invadans, A. helicoides, A. laevis, A. repetans, Achlya racemosa, Leptolegnia caudata, and Saprolegnia parasitica, as well as in the relevant fungi Fusarium solani and Trichosporon cutaneum. An A. astaci-specific primer pair targeting the novel genes CHI2 and CHI3 as well as CHI1 - a third GH18 family member - was multiplexed with primers targeting the 5.8S rRNA used as an endogenous control. A species was typed unambiguously as A. astaci if two peaks were concomitantly detected by melting curve analysis (MCA). For sensitive detection of the pathogen, but also for quantification of agent levels in susceptible crayfish and carrier crayfish, a TaqMan-probe based real-time PCR (qPCR) assay was developed. It targets the same chitinase genes and allows quantification down to 25 target sequences. Conclusion: The simultaneous qualitative detection of multiple sequences by qPCR/MCA represents a promising approach to detect species with elevated levels of genetic variation and/or limited available sequence information. The homogeneous closed-tube format, reduced detection time, higher specificity, and the considerably reduced chance of false negative detection achieved by targeting multiple genes (CHI1, CHI2, CHI3, and the endogenous control), at least two of which are subject to high functional constraint, are the major advantages of this multiplex assay compared to other diagnostic methods. Sensitive quantification achieved with TaqMan qPCR facilitates monitoring of infection status and pathogen distribution in different tissues and can help prevent disease transmission. PMID:19719847
He, Yaodong; Ma, Tiantian; Zhang, Xiaobo
2017-01-01
MicroRNAs (miRNAs), important factors in animal innate immunity, suppress the expression of their target genes by binding to the 3′ untranslated regions (3′UTRs) of target mRNAs. However, the mechanism of synchronous regulation of multiple targets by a single miRNA remains unclear. In this study, the interaction between a white spot syndrome virus (WSSV) miRNA (WSSV-miR-N32) and its two viral targets (wsv459 and wsv322) was characterized in WSSV-infected shrimp. The results indicated that the WSSV-encoded miRNA (WSSV-miR-N32) significantly inhibited virus infection by simultaneously targeting wsv459 and wsv322. The silencing of wsv459 or wsv322 by siRNA led to a significant decrease in WSSV copies in shrimp, showing that the two viral genes were required for WSSV infection. WSSV-miR-N32 could mediate 5′–3′ exonucleolytic digestion of its target mRNAs, which stopped at the sites of target mRNA 3′UTRs close to the sequence complementary to the miRNA seed sequence. The complementary bases (to the target mRNA sequence) of the miRNA's 9th–18th non-seed sequence were essential for the miRNA targeting. Therefore, our findings present novel insights into the mechanism of miRNA-mediated suppression of target gene expression, which would be helpful for understanding the roles of miRNAs in the innate immunity of invertebrates. PMID:29230209
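Seed complementarity between a miRNA and a 3′UTR can be sketched as follows. This is only an illustration of the general idea: it assumes the common convention that the seed spans miRNA nucleotides 2 to 8 and requires perfect Watson-Crick pairing, and the sequences are toy strings, not WSSV-miR-N32 or its targets. It does not model the 9th–18th non-seed pairing that the abstract reports as essential.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Find 0-based UTR positions that perfectly pair with the miRNA seed.

    Assumption: the seed is nucleotides seed_start..seed_end (1-based,
    inclusive) of the miRNA, and only perfect complementarity counts.
    """
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


# Hypothetical miRNA and 3'UTR fragment, both written 5'->3'.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAAACUACCUCAGGG"
hits = seed_sites(toy_mirna, toy_utr)
```

Here the seed `GAGGUAG` pairs with the site `CUACCUC`, which starts at index 4 of the toy UTR.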
2011-01-01
Background DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes) from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus) were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region) were investigated for larger sample sets. Results In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski) using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP) reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73%) already existed before the beginning of domestication about 5,000 years ago. Conclusions Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. 
The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the time depth of the domestic horse mtDNA gene pool. PMID:22082251
Zimmer, C T; Maiwald, F; Schorn, C; Bass, C; Ott, M-C; Nauen, R
2014-08-01
The pollen beetle Meligethes aeneus is the most important coleopteran pest in European oilseed rape cultivation, annually infesting millions of hectares and responsible for substantial yield losses if not kept under economic damage thresholds. This species is primarily controlled with insecticides but has recently developed high levels of resistance to the pyrethroid class. The aim of the present study was to provide a transcriptomic resource to investigate mechanisms of resistance. cDNA was sequenced on both Roche (Indianapolis, IN, USA) and Illumina (LGC Genomics, Berlin, Germany) platforms, resulting in a total of ∼53 million reads, which assembled into 43 396 expressed sequence tags (ESTs). Manual annotation revealed good coverage of genes encoding insecticide target sites and detoxification enzymes. A total of 77 nonredundant cytochrome P450 genes were identified. Mapping of Illumina RNAseq sequences (from susceptible and pyrethroid-resistant strains) against the reference transcriptome identified a cytochrome P450 (CYP6BQ23) as highly overexpressed in pyrethroid-resistant strains. Single-nucleotide polymorphism analysis confirmed the presence of a target-site resistance mutation (L1014F) in the voltage-gated sodium channel of one resistant strain. Our results provide new insights into the important genes associated with pyrethroid resistance in M. aeneus. Furthermore, a comprehensive EST resource is provided for future studies on insecticide modes of action and resistance mechanisms in pollen beetle. © 2014 The Royal Entomological Society.
Papasavva, Thessalia; van IJcken, Wilfred F J; Kockx, Christel E M; van den Hout, Mirjam C G N; Kountouris, Petros; Kythreotis, Loukas; Kalogirou, Eleni; Grosveld, Frank G; Kleanthous, Marina
2013-01-01
β-Thalassaemia is one of the most common autosomal recessive single-gene disorders worldwide, with a carrier frequency of 12% in Cyprus. Prenatal tests for at-risk pregnancies use invasive methods, and development of a non-invasive prenatal diagnostic (NIPD) method is of paramount importance to prevent unnecessary risks inherent to invasive methods. Here, we describe such a method by assessing a modified version of next generation sequencing (NGS) using the Illumina platform, called ‘targeted sequencing', based on the detection of paternally inherited fetal alleles in maternal plasma. We selected four single-nucleotide polymorphisms (SNPs) located in the β-globin locus with a high degree of heterozygosity in the Cypriot population. Spiked genomic samples were used to determine the specificity of the platform. We could detect the minor alleles in the expected ratio, showing the specificity of the platform. We then developed a multiplexed format for the selected SNPs and analysed ten maternal plasma samples from pregnancies at risk. The presence or absence of the paternal mutant allele was correctly determined in 27 out of 34 samples analysed. With haplotype analysis, NIPD was possible on eight out of ten families. This is the first study carried out for the NIPD of β-thalassaemia using targeted NGS and haplotype analysis. Preliminary results show that NGS is effective in detecting paternally inherited alleles in the maternal plasma. PMID:23572027
Carpenter, Meredith L.; Buenrostro, Jason D.; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E.; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M. Thomas P.; Willerslev, Eske; Greenleaf, William J.; Bustamante, Carlos D.
2013-01-01
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples. PMID:24568772
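The fold-enrichment figures quoted above follow from a simple ratio of mapped-read fractions after and before capture. The sketch below assumes that definition; the abstract does not spell out the formula, and the per-library numbers used here are hypothetical rather than taken from the study.

```python
def fold_enrichment(pre_fraction, post_fraction):
    """Ratio of endogenous (mapped) read fraction after vs. before capture.

    Assumption: both arguments are fractions in [0, 1] computed the same
    way (for example, both including duplicate reads).
    """
    if not 0 < pre_fraction <= 1:
        raise ValueError("pre_fraction must be in (0, 1]")
    if not 0 <= post_fraction <= 1:
        raise ValueError("post_fraction must be in [0, 1]")
    return post_fraction / pre_fraction


# Hypothetical library: 0.4% of reads map before capture, 24% after.
example_fold = fold_enrichment(0.004, 0.24)
```

For this made-up library the enrichment is about 60-fold, which falls inside the 6- to 159-fold range reported for the real libraries.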
Xi, Yanwei; Arbabi, Aryan; McNaughton, Amy J M; Hamilton, Alison; Hull, Danna; Perras, Helene; Chiu, Tillie; Morrison, Shawna; Goldsmith, Claire; Creede, Emilie; Anger, Gregory J; Honeywell, Christina; Cloutier, Mireille; Macchio, Natasha; Kiss, Courtney; Liu, Xudong; Crocker, Susan; Davies, Gregory A; Brudno, Michael; Armour, Christine M
2017-01-01
To develop an alternate noninvasive prenatal testing method for the assessment of trisomy 21 (T21) using a targeted semiconductor sequencing approach. A customized AmpliSeq panel was designed with 1,067 primer pairs targeting specific regions on chromosomes 21, 18, 13, and others. A total of 235 samples, including 30 affected with T21, were sequenced with an Ion Torrent Proton sequencer, and a method was developed for assessing the probability of fetal aneuploidy via derivation of a risk score. Application of the derived risk score yields a bimodal distribution, with the affected samples clustering near 1.0 and the unaffected near 0. For a risk score cutoff of 0.345, above which all would be considered at "high risk," all 30 T21-positive pregnancies were correctly predicted to be affected, and 199 of the 205 non-T21 samples were correctly predicted. The average hands-on time spent on library preparation and sequencing was 19 h in total, and the average number of reads of sequence obtained was 3.75 million per sample. With the described targeted sequencing approach on the semiconductor platform using a custom-designed library and a probabilistic statistical approach, we have demonstrated the feasibility of an alternate method of assessment for fetal T21. © 2017 S. Karger AG, Basel.
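The reported counts translate directly into sensitivity and specificity at the 0.345 cutoff. The sketch below uses only the numbers given in the abstract (30 of 30 affected and 199 of 205 unaffected samples correctly called); the function names are illustrative, and the derivation of the risk score itself is not reproduced here.

```python
def classify(risk_scores, cutoff=0.345):
    """Label each score 'high risk' if it is strictly above the cutoff."""
    return ["high risk" if s > cutoff else "low risk" for s in risk_scores]


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Counts from the abstract: 30/30 T21 samples and 199/205 non-T21 samples
# were predicted correctly, so 0 false negatives and 6 false positives.
sens, spec = sensitivity_specificity(tp=30, fn=0, tn=199, fp=6)
```

This gives a sensitivity of 100% and a specificity of about 97.1% for the 235 samples described.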
USDA-ARS's Scientific Manuscript database
Fine-mapping of causal variants is becoming feasible for complex traits in livestock GWAS, as an increasing number of animals are sequenced. Imputation has been routinely applied to ascertain sequence variants in large genotyped populations based on small reference populations of sequenced animals. ...
USDA-ARS's Scientific Manuscript database
Imputation has been routinely applied to ascertain sequence variants in large genotyped populations based on reference populations of sequenced animals. With the implementation of the 1000 Bull Genomes Project and increasing numbers of animals sequenced, fine-mapping of causal variants is becoming f...
Mandelker, Diana; Zhang, Liying; Kemel, Yelena; Stadler, Zsofia K; Joseph, Vijai; Zehir, Ahmet; Pradhan, Nisha; Arnold, Angela; Walsh, Michael F; Li, Yirong; Balakrishnan, Anoop R; Syed, Aijazuddin; Prasad, Meera; Nafa, Khedoudja; Carlo, Maria I; Cadoo, Karen A; Sheehan, Meg; Fleischut, Megan H; Salo-Mullen, Erin; Trottier, Magan; Lipkin, Steven M; Lincoln, Anne; Mukherjee, Semanti; Ravichandran, Vignesh; Cambria, Roy; Galle, Jesse; Abida, Wassim; Arcila, Marcia E; Benayed, Ryma; Shah, Ronak; Yu, Kenneth; Bajorin, Dean F; Coleman, Jonathan A; Leach, Steven D; Lowery, Maeve A; Garcia-Aguilar, Julio; Kantoff, Philip W; Sawyers, Charles L; Dickler, Maura N; Saltz, Leonard; Motzer, Robert J; O'Reilly, Eileen M; Scher, Howard I; Baselga, Jose; Klimstra, David S; Solit, David B; Hyman, David M; Berger, Michael F; Ladanyi, Marc; Robson, Mark E; Offit, Kenneth
2017-09-05
Guidelines for cancer genetic testing based on family history may miss clinically actionable genetic changes with established implications for cancer screening or prevention. To determine the proportion and potential clinical implications of inherited variants detected using simultaneous sequencing of the tumor and normal tissue ("tumor-normal sequencing") compared with genetic test results based on current guidelines. From January 2014 until May 2016 at Memorial Sloan Kettering Cancer Center, 10 336 patients consented to tumor DNA sequencing. Since May 2015, 1040 of these patients with advanced cancer were referred by their oncologists for germline analysis of 76 cancer predisposition genes. Patients with clinically actionable inherited mutations whose genetic test results would not have been predicted by published decision rules were identified. Follow-up for potential clinical implications of mutation detection was through May 2017. Tumor and germline sequencing compared with the predicted yield of targeted germline sequencing based on clinical guidelines. Proportion of clinically actionable germline mutations detected by universal tumor-normal sequencing that would not have been detected by guideline-directed testing. Of 1040 patients, the median age was 58 years (interquartile range, 50.5-66 years), 65.3% were male, and 81.3% had stage IV disease at the time of genomic analysis, with prostate, renal, pancreatic, breast, and colon cancer as the most common diagnoses. Of the 1040 patients, 182 (17.5%; 95% CI, 15.3%-19.9%) had clinically actionable mutations conferring cancer susceptibility, including 149 with moderate- to high-penetrance mutations; 101 patients tested (9.7%; 95% CI, 8.1%-11.7%) would not have had these mutations detected using clinical guidelines, including 65 with moderate- to high-penetrance mutations. Frequency of inherited mutations was related to case mix, stage, and founder mutations. 
Germline findings led to discussion or initiation of change to targeted therapy in 38 patients tested (3.7%) and predictive testing in the families of 13 individuals (1.3%), including 6 for whom genetic evaluation would not have been initiated by guideline-based testing. In this referral population with selected advanced cancers, universal sequencing of a broad panel of cancer-related genes in paired germline and tumor DNA samples was associated with increased detection of individuals with potentially clinically significant heritable mutations over the predicted yield of targeted germline testing based on current clinical guidelines. Knowledge of these additional mutations can help guide therapeutic and preventive interventions, but whether all of these interventions would improve outcomes for patients with cancer or their family members requires further study. clinicaltrials.gov Identifier: NCT01775072.
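The confidence intervals quoted for the two main proportions can be checked with a short calculation. The abstract does not state which interval method was used; the sketch below assumes a Wilson score interval with z = 1.96, which happens to reproduce the reported bounds to one decimal place.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# 182 of 1040 patients had clinically actionable mutations.
actionable_lo, actionable_hi = wilson_interval(182, 1040)
# 101 of 1040 would not have been detected using clinical guidelines.
missed_lo, missed_hi = wilson_interval(101, 1040)
```

Under this assumption the intervals come out as roughly 15.3% to 19.9% and 8.1% to 11.7%, in line with the values in the abstract.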
Artificial mismatch hybridization
Guo, Zhen; Smith, Lloyd M.
1998-01-01
An improved nucleic acid hybridization process is provided which employs a modified oligonucleotide and improves the ability to discriminate a control nucleic acid target from a variant nucleic acid target containing a sequence variation. The modified probe contains at least one artificial mismatch relative to the control nucleic acid target in addition to any mismatch(es) arising from the sequence variation. The invention has direct and advantageous application to numerous existing hybridization methods, including, applications that employ, for example, the Polymerase Chain Reaction, allele-specific nucleic acid sequencing methods, and diagnostic hybridization methods.
Bacterial charity work leads to population-wide resistance.
Lee, Henry H; Molla, Michael N; Cantor, Charles R; Collins, James J
2010-09-02
Bacteria show remarkable adaptability in the face of antibiotic therapeutics. Resistance alleles in drug target-specific sites and general stress responses have been identified in individual end-point isolates. Less is known, however, about the population dynamics during the development of antibiotic-resistant strains. Here we follow a continuous culture of Escherichia coli facing increasing levels of antibiotic and show that the vast majority of isolates are less resistant than the population as a whole. We find that the few highly resistant mutants improve the survival of the population's less resistant constituents, in part by producing indole, a signalling molecule generated by actively growing, unstressed cells. We show, through transcriptional profiling, that indole serves to turn on drug efflux pumps and oxidative-stress protective mechanisms. The indole production comes at a fitness cost to the highly resistant isolates, and whole-genome sequencing reveals that this bacterial altruism is made possible by drug-resistance mutations unrelated to indole production. This work establishes a population-based resistance mechanism constituting a form of kin selection whereby a small number of resistant mutants can, at some cost to themselves, provide protection to other, more vulnerable, cells, enhancing the survival capacity of the overall population in stressful environments.
Calculating expected DNA remnants from ancient founding events in human population genetics
Stacey, Andrew; Sheffield, Nathan C; Crandall, Keith A
2008-01-01
Background Recent advancements in sequencing and computational technologies have led to rapid generation and analysis of high quality genetic data. Such genetic data have achieved wide acceptance in studies of historic human population origins and admixture. However, in studies relating to small, recent admixture events, genetic factors such as historic population sizes, genetic drift, and mutation can have pronounced effects on data reliability and utility. To address these issues we conducted genetic simulations targeting influential genetic parameters in admixed populations. Results We performed a series of simulations, adjusting variable values to assess the effect of these genetic parameters on current human population studies and what these studies infer about past population structure. Final mean allele frequencies varied from 0.0005 to over 0.50, depending on the parameters. Conclusion The results of the simulations illustrate that, while genetic data may be sensitive and powerful in large genetic studies, caution must be used when applying genetic information to small, recent admixture events. For some parameter sets, genetic data will not be adequate to detect historic admixture. In such cases, studies should consider anthropologic, archeological, and linguistic data where possible. PMID:18928554
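The effect of drift on an allele introduced by a small admixture event can be sketched with a basic Wright-Fisher simulation. This is a generic textbook model chosen for illustration, not the simulation framework used in the study; the parameter values are hypothetical, and mutation, selection, and changing population size are all ignored.

```python
import random


def simulate_admixed_allele(initial_freq, pop_size, generations, seed=None):
    """Track one allele's frequency under pure drift (Wright-Fisher model).

    Assumptions: diploid population of constant size, random mating,
    non-overlapping generations, no mutation or selection.
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    freq = initial_freq
    for _ in range(generations):
        # Binomial sampling of allele copies for the next generation.
        count = sum(rng.random() < freq for _ in range(copies))
        freq = count / copies
        if freq in (0.0, 1.0):
            break  # allele lost or fixed
    return freq


# Hypothetical scenario: a founder allele at 5% in a population of 100,
# followed for 20 generations over 50 independent replicates.
final_freqs = [simulate_admixed_allele(0.05, 100, 20, seed=s) for s in range(50)]
mean_final = sum(final_freqs) / len(final_freqs)
lost = sum(f == 0.0 for f in final_freqs)
```

Counting the replicates in which the allele is lost gives a rough sense of how often a small admixture signal would leave no trace under these assumed parameters.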
Nucleic acid analysis using terminal-phosphate-labeled nucleotides
Korlach, Jonas [Ithaca, NY; Webb, Watt W [Ithaca, NY; Levene, Michael [Ithaca, NY; Turner, Stephen [Ithaca, NY; Craighead, Harold G [Ithaca, NY; Foquet, Mathieu [Ithaca, NY
2008-04-22
The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
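The deduction step, reading the template from the temporal order of incorporated nucleotides, can be sketched in a few lines. The sketch assumes error-free identification of each incorporated base and standard Watson-Crick pairing; the function name and the event list are hypothetical and do not describe the detection chemistry of the invention.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_incorporations(incorporated):
    """Infer the template strand (5'->3') from bases added to the primer.

    `incorporated` lists the bases in the order they were added, which is
    the 5'->3' order of the growing strand. Each one pairs with its
    complement on the template, and the strands are antiparallel, so the
    template read 5'->3' is the reverse complement.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(incorporated))


# Hypothetical order of incorporation events observed in real time.
events = ["A", "C", "G", "G", "T"]
template = template_from_incorporations(events)
```

For the toy events above, the growing strand is 5'-ACGGT-3' and the inferred template segment is 5'-ACCGT-3'.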
Multi-targeted priming for genome-wide gene expression assays.
Adomas, Aleksandra B; Lopez-Giraldez, Francesc; Clark, Travis A; Wang, Zheng; Townsend, Jeffrey P
2010-08-17
Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and precise assay of the transcribed sequences within the genome.
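The motif search described above can be approximated by a k-mer screen: keep short motifs that occur in most protein-coding sequences and in none of the rRNA or tRNA sequences. The sketch below is a simplified stand-in for the authors' algorithm, whose details the abstract does not give. It uses exact (non-degenerate) k-mers, ignores strand orientation, and runs on toy sequences; all names and thresholds are assumptions.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_priming_motifs(coding, excluded, k=6, min_fraction=0.8):
    """Motifs found in >= min_fraction of coding sequences and in no excluded one."""
    gene_counts = Counter()
    for seq in coding:
        gene_counts.update(kmers(seq, k))  # count each motif once per gene
    forbidden = set().union(*(kmers(seq, k) for seq in excluded))
    threshold = min_fraction * len(coding)
    return sorted(
        motif
        for motif, count in gene_counts.items()
        if count >= threshold and motif not in forbidden
    )


# Toy stand-ins for protein-coding genes and for rRNA/tRNA genes.
toy_coding = ["ATGGCTAAGT", "CCGGCTAATG", "TTGGCTAACC"]
toy_excluded = ["GGCTAC"]
motifs = candidate_priming_motifs(toy_coding, toy_excluded, k=5, min_fraction=1.0)
```

In the toy data both `GGCTA` and `GCTAA` occur in every coding sequence, but `GGCTA` also occurs in the excluded sequence, so only `GCTAA` survives the screen.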
Gonçalves, Vanessa F; Parra, Flavia C; Gonçalves-Dornelas, Higgor; Rodrigues-Carvalho, Claudia; Silva, Hilton P; Pena, Sergio Dj
2010-12-01
Brazilian Amerindians have experienced a drastic population decrease in the past 500 years. Indeed, many native groups from eastern Brazil have vanished. However, their mitochondrial DNA (mtDNA) haplotypes still persist in Brazilians, at least 50 million of whom carry Amerindian mitochondrial lineages. Our objective was to test whether, by analyzing extant rural populations from regions anciently occupied by specific Amerindian groups, we could identify potentially authentic mitochondrial lineages, a strategy we have named 'homopatric targeting'. We studied 173 individuals from Queixadinha, a small village located in a territory previously occupied by the now extinct Botocudo Amerindian nation. Pedigree analysis revealed 74 unrelated matrilineages, which were screened for Amerindian mtDNA lineages by restriction fragment length polymorphism. A cosmopolitan control group was composed of 100 individuals from surrounding cities. All Amerindian lineages identified had their hypervariable segment HVSI sequenced, yielding 13 Amerindian haplotypes in Queixadinha, nine of which were not present in available databanks or in the literature. Among these haplotypes, there was a significant excess of haplogroup C (70%) and absence of haplogroup A lineages, which were the most common in the control group. The novelty of the haplotypes and the excess of the C haplogroup suggested that we might indeed have identified Botocudo lineages. To validate our strategy, we studied teeth extracted from 14 ancient skulls of Botocudo Amerindians from the collection of the National Museum of Rio de Janeiro. We recovered mtDNA sequences from all the teeth, identifying only six different haplotypes (a low haplotypic diversity of 0.8352 ± 0.0617), one of which was present among the lineages observed in the extant individuals studied. 
These findings validate the technique of homopatric targeting as a useful new strategy to study the peopling and colonization of the New World, especially when direct analysis of genetic material is not possible.
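The haplotypic diversity figure can be related to haplotype counts through Nei's unbiased estimator, H = n/(n-1) × (1 - Σp²). The abstract does not say which estimator was used or how the 14 samples were distributed among the six haplotypes, so the counts below are purely hypothetical; they were chosen only because they are consistent with the reported value.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype (gene) diversity from haplotype counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq_freq)


# Hypothetical split of 14 samples among 6 haplotypes (not from the study).
assumed_counts = [5, 3, 2, 2, 1, 1]
diversity = haplotype_diversity(assumed_counts)
```

With this assumed split the estimator gives about 0.8352, matching the reported point estimate; other splits, such as 4, 4, 3, 1, 1, 1, give the same value, so the counts cannot be recovered from the statistic alone.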
2010-01-01
Background Brazilian Amerindians have experienced a drastic population decrease in the past 500 years. Indeed, many native groups from eastern Brazil have vanished. However, their mitochondrial DNA (mtDNA) haplotypes still persist in Brazilians, at least 50 million of whom carry Amerindian mitochondrial lineages. Our objective was to test whether, by analyzing extant rural populations from regions anciently occupied by specific Amerindian groups, we could identify potentially authentic mitochondrial lineages, a strategy we have named 'homopatric targeting'. Results We studied 173 individuals from Queixadinha, a small village located in a territory previously occupied by the now extinct Botocudo Amerindian nation. Pedigree analysis revealed 74 unrelated matrilineages, which were screened for Amerindian mtDNA lineages by restriction fragment length polymorphism. A cosmopolitan control group was composed of 100 individuals from surrounding cities. All Amerindian lineages identified had their hypervariable segment HVSI sequenced, yielding 13 Amerindian haplotypes in Queixadinha, nine of which were not present in available databanks or in the literature. Among these haplotypes, there was a significant excess of haplogroup C (70%) and absence of haplogroup A lineages, which were the most common in the control group. The novelty of the haplotypes and the excess of the C haplogroup suggested that we might indeed have identified Botocudo lineages. To validate our strategy, we studied teeth extracted from 14 ancient skulls of Botocudo Amerindians from the collection of the National Museum of Rio de Janeiro. We recovered mtDNA sequences from all the teeth, identifying only six different haplotypes (a low haplotypic diversity of 0.8352 ± 0.0617), one of which was present among the lineages observed in the extant individuals studied. 
Conclusions These findings validate the technique of homopatric targeting as a useful new strategy to study the peopling and colonization of the New World, especially when direct analysis of genetic material is not possible. PMID:21122100
Liu, Yang; El-Kassaby, Yousry A.
2017-01-01
While DNA methylation carries genetic signals and is instrumental in the evolution of organismal complexity, small RNAs (sRNAs), ~18–24 ribonucleotide (nt) sequences, are crucial mediators of methylation as well as gene silencing. However, few studies have examined sRNA evolution by relating sRNA expression dynamics to species of different evolutionary age. Here we report an atlas of sRNAs and microRNAs (miRNAs, single-stranded sRNAs) produced over time at seed-set of two major spermatophytes represented by populations of Picea glauca and Arabidopsis thaliana with different seed-set duration. We applied diverse profiling methods to examine sRNA and miRNA features, including size distribution, sequence conservation and reproduction-specific regulation, as well as to predict their putative targets. The 27 most abundant miRNAs overlapped substantially between the two species (e.g., miR166, miR319 and miR396), but in P. glauca, they were less abundant and significantly less correlated with seed-set phases. The most abundant sRNAs in libraries were deeply conserved miRNAs in the plant kingdom for Arabidopsis but long sRNAs (24-nt) for P. glauca. We also found a significant difference in normalized expression between populations for population-specific sRNAs but not for lineage-specific ones. Moreover, lineage-specific sRNAs were enriched in the 21-nt size class. This pattern is consistent in both species and alludes to a specific type of sRNAs (e.g., miRNA, tasiRNA) being selected for. In addition, we deemed 24 and 9 sRNAs in P. glauca and Arabidopsis, respectively, as sRNA candidates targeting known adaptive genes. Temperature had a significant influence on selected gene and miRNA expression during seed development in both species. 
This study increases our integrated understanding of sRNA evolution and its potential link to genomic architecture (e.g., sRNA derivation from genome and sRNA-mediated genomic events) and organismal complexity (e.g., association between different sRNA expression and their functionality). PMID:29046688
Rapid amplification of 5' complementary DNA ends (5' RACE).
2005-08-01
This method is used to extend partial cDNA clones by amplifying the 5' sequences of the corresponding mRNAs (1-3). The technique requires knowledge of only a small region of sequence within the partial cDNA clone. During PCR, the thermostable DNA polymerase is directed to the appropriate target RNA by a single primer derived from the region of known sequence; the second primer required for PCR is complementary to a general feature of the target: in the case of 5' RACE, a homopolymeric tail added (via terminal transferase) to the 3' termini of cDNAs transcribed from a preparation of mRNA. This synthetic tail provides a primer-binding site upstream of the unknown 5' sequence of the target mRNA. The products of the amplification reaction are cloned into a plasmid vector for sequencing and subsequent manipulation.
Inhibition in motor imagery: a novel action mode switching paradigm.
Rieger, Martina; Dahm, Stephan F; Koch, Iring
2017-04-01
Motor imagery requires that actual movements are prevented (i.e., inhibited) from execution. To investigate at what level inhibition takes place in motor imagery, we developed a novel action mode switching paradigm. Participants imagined (indicating only start and end) and executed movements from start buttons to target buttons, and we analyzed trial sequence effects. Trial sequences depended on current action mode (imagination or execution), previous action mode (pure blocks/same mode, mixed blocks/same mode, or mixed blocks/other mode), and movement sequence (action repetition, hand repetition, or hand alternation). Results provided evidence for global inhibition (indicated by switch benefits in execution-imagination (E-I)-sequences in comparison to I-I-sequences), effector-specific inhibition (indicated by hand repetition costs after an imagination trial), and target inhibition (indicated by target repetition benefits in I-I-sequences). No evidence for subthreshold motor activation or action-specific inhibition (inhibition of the movement of an effector to a specific target) was obtained. Two (global inhibition and effector-specific inhibition) of the three observed mechanisms are active inhibition mechanisms. In conclusion, motor imagery is not simply a weaker form of execution, which often is implied in views focusing on similarities between imagination and execution.
Genetic mutations in human rectal cancers detected by targeted sequencing.
Bai, Jun; Gao, Jinglong; Mao, Zhijun; Wang, Jianhua; Li, Jianhui; Li, Wensheng; Lei, Yu; Li, Shuaishuai; Wu, Zhuo; Tang, Chuanning; Jones, Lindsey; Ye, Hua; Lou, Feng; Liu, Zhiyuan; Dong, Zhishou; Guo, Baishuai; Huang, Xue F; Chen, Si-Yi; Zhang, Enke
2015-10-01
Colorectal cancer (CRC) is widespread with significant mortality. Both inherited and sporadic mutations in various signaling pathways influence the development and progression of the cancer. Identifying genetic mutations in CRC is important for optimal patient treatment and many approaches currently exist to uncover these mutations, including next-generation sequencing (NGS) and commercially available kits. In the present study, we used a semiconductor-based targeted DNA-sequencing approach to sequence and identify genetic mutations in 91 human rectal cancer samples. Analysis revealed frequent mutations in KRAS (58.2%), TP53 (28.6%), APC (16.5%), FBXW7 (9.9%) and PIK3CA (9.9%), and additional mutations in BRAF, CTNNB1, ERBB2 and SMAD4 were also detected at lesser frequencies. Thirty-eight samples (41.8%) also contained two or more mutations, with common combination mutations occurring between KRAS and TP53 (42.1%), and KRAS and APC (31.6%). DNA sequencing for individual cancers is of clinical importance for targeted drug therapy and the advantages of such targeted gene sequencing over other NGS platforms or commercially available kits in sensitivity, cost and time effectiveness may aid clinicians in treating CRC patients in the near future.
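The percentages reported above can be sanity-checked with simple arithmetic. The sketch below back-calculates the sample counts implied by each percentage; the counts themselves are inferred by rounding and are not stated in the abstract, and the function name `implied_count` is purely illustrative.

```python
# Back-calculate approximate sample counts from the percentages in the abstract.
# NOTE: the counts are inferred by rounding; they are not reported in the source.
TOTAL_SAMPLES = 91          # rectal cancer samples sequenced
MULTI_MUTATION_SAMPLES = 38  # samples with two or more mutations

reported_pct = {"KRAS": 58.2, "TP53": 28.6, "APC": 16.5, "FBXW7": 9.9, "PIK3CA": 9.9}
combination_pct = {"KRAS+TP53": 42.1, "KRAS+APC": 31.6}


def implied_count(pct, total):
    """Return the integer count whose share of `total` is closest to `pct` percent."""
    return round(pct / 100 * total)


implied = {gene: implied_count(pct, TOTAL_SAMPLES) for gene, pct in reported_pct.items()}
implied_combos = {
    pair: implied_count(pct, MULTI_MUTATION_SAMPLES) for pair, pct in combination_pct.items()
}

for gene, n in implied.items():
    print(f"{gene}: ~{n}/{TOTAL_SAMPLES} = {100 * n / TOTAL_SAMPLES:.1f}%")
for pair, n in implied_combos.items():
    print(f"{pair}: ~{n}/{MULTI_MUTATION_SAMPLES} = {100 * n / MULTI_MUTATION_SAMPLES:.1f}%")
```

Note that the combination percentages are relative to the 38 multi-mutation samples, not to all 91 samples.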
Powell, John H; Amish, Stephen J; Haynes, Gwilym D; Luikart, Gordon; Latch, Emily K
2016-09-01
Mule deer (Odocoileus hemionus) are an excellent nonmodel species for empirically testing hypotheses in landscape and population genomics due to their large population sizes (low genetic drift), relatively continuous distribution, diversity of occupied habitats and phenotypic variation. Because few genomic resources are currently available for this species, we used exon data from a cattle (Bos taurus) reference genome to direct targeted resequencing of 5935 genes in mule deer. We sequenced approximately 3.75 Mbp at minimum 20X coverage in each of the seven mule deer, identifying 23 204 single nucleotide polymorphisms (SNPs) within, or adjacent to, 6886 exons in 3559 genes. We found 91 SNP loci (from 69 genes) with putatively fixed allele frequency differences between the two major lineages of mule deer (mule deer and black-tailed deer), and our estimate of mean genetic divergence (genome-wide FST = 0.123) between these lineages was consistent with previous findings using microsatellite loci. We detected an over-representation of gamete generation and amino acid transport genes among the genes with SNPs exhibiting potentially fixed allele frequency differences between lineages. This targeted resequencing approach using exon capture techniques has identified a suite of loci that can be used in future research to investigate the genomic basis of adaptation and differentiation between black-tailed deer and mule deer. This study also highlights techniques (and an exon capture array) that will facilitate population genomic research in other cervids and nonmodel organisms. © 2016 John Wiley & Sons Ltd.
Plastid primers for angiosperm phylogenetics and phylogeography.
Prince, Linda M
2015-06-01
PCR primers are available for virtually every region of the plastid genome. Selection of which primer pairs to use is second only to selection of the genic region. This is particularly true for research at the species/population interface. Primer pairs for 130 regions of the chloroplast genome were evaluated in 12 species distributed across the angiosperms. Likelihood of amplification success was inferred based upon number and location of mismatches to target sequence. Intraspecific sequence variability was evaluated under three different criteria in four species. Many published primer pairs should work across all taxa sampled, with the exception of failure due to genomic reorganization events. Universal barcoding primers were the least likely to work (65% success). The list of most variable regions for use within species has little in common with the lists identified in prior studies. Published primer sequences should amplify a diversity of flowering plant DNAs, even those designed for specific taxonomic groups. "Universal" primers may have extremely limited utility. There was little consistency in likelihood of amplification success for any given publication across lineages or within lineage across publications.
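As a rough illustration of inferring amplification success from the number and location of mismatches, the sketch below counts mismatches between a primer and its binding site and treats mismatches near the 3' end as more damaging. The thresholds (`max_total`, `max_3prime`), the 5-base 3' window, and the function names are assumptions made for illustration only; they are not the criteria used in the study. Degenerate (IUPAC) bases are not handled.

```python
def count_mismatches(primer, target_site, window_3prime=5):
    """Return (total mismatches, mismatches within the 3'-terminal window).

    Both sequences are written 5'->3' in the primer's orientation and must
    have equal length (i.e. the binding site is already located and aligned).
    """
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must be the same length")
    mismatch_positions = [
        i for i, (p, t) in enumerate(zip(primer.upper(), target_site.upper())) if p != t
    ]
    three_prime_start = len(primer) - window_3prime
    near_3prime = sum(1 for i in mismatch_positions if i >= three_prime_start)
    return len(mismatch_positions), near_3prime


def likely_to_amplify(primer, target_site, max_total=3, max_3prime=0):
    """Hypothetical rule: few mismatches overall and none near the 3' end."""
    total, near_3prime = count_mismatches(primer, target_site)
    return total <= max_total and near_3prime <= max_3prime


# Made-up sequences for demonstration only.
primer = "ACGTACGTACGTACGTACGT"
internal_mismatch = "ACCTACGTACGTACGTACGT"  # one mismatch near the 5' end
terminal_mismatch = "ACGTACGTACGTACGTACGA"  # one mismatch at the 3' terminal base

print(likely_to_amplify(primer, internal_mismatch))
print(likely_to_amplify(primer, terminal_mismatch))
```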
Josko, Deborah
2014-01-01
The advent of DNA sequencing technologies and the various applications that can be performed will have a dramatic effect on medicine and healthcare in the near future. There are several DNA sequencing platforms available on the market for research and clinical use. Based on the medical laboratory scientist's or researcher's needs, and taking into consideration laboratory space and budget, one can choose which platform will be beneficial to their institution and their patient population. Although some of the instrument costs seem high, diagnosing a patient quickly and accurately will save hospitals money through fewer hospital stays and targeted treatment based on an individual's genetic make-up. By determining the type of disease an individual has based on the mutations present, or by being able to prescribe the appropriate antimicrobials based on knowledge of the organism's resistance patterns, the clinician will be better able to diagnose and treat a patient, which ultimately will improve patient outcomes and prognosis.
Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F
2015-07-08
Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
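To make the idea of read count-based genotype calls concrete, here is a minimal sketch that assigns a genotype from the number of reads supporting the insertion allele versus the empty (no insertion) site. The depth cutoff and allele-fraction thresholds are hypothetical values chosen for illustration; the abstract does not specify the calling rule the authors used.

```python
def call_genotype(insertion_reads, empty_site_reads,
                  min_depth=20, het_low=0.2, het_high=0.8):
    """Call a biallelic insertion genotype from read counts.

    Returns "0/0" (no insertion), "0/1" (heterozygous), "1/1" (homozygous
    insertion), or "no_call" when total depth is below `min_depth`.
    All thresholds are illustrative assumptions.
    """
    depth = insertion_reads + empty_site_reads
    if depth < min_depth:
        return "no_call"
    insertion_fraction = insertion_reads / depth
    if insertion_fraction < het_low:
        return "0/0"
    if insertion_fraction > het_high:
        return "1/1"
    return "0/1"


# Made-up read counts for three loci in one individual.
for locus, (ins, empty) in {"TE_1": (0, 120), "TE_2": (48, 52), "TE_3": (95, 5)}.items():
    print(locus, call_genotype(ins, empty))
```

In practice, thresholds of this kind would need to be calibrated against known genotypes, since amplification efficiency can differ between the insertion and empty-site amplicons.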
A polygenic burden of rare disruptive mutations in schizophrenia
Purcell, Shaun M.; Moran, Jennifer L.; Fromer, Menachem; Ruderfer, Douglas; Solovieff, Nadia; Roussos, Panos; O’Dushlaine, Colm; Chambert, Kimberly; Bergen, Sarah E.; Kähler, Anna; Duncan, Laramie; Stahl, Eli; Genovese, Giulio; Fernández, Esperanza; Collins, Mark O; Komiyama, Noboru H.; Choudhary, Jyoti S.; Magnusson, Patrik K. E.; Banks, Eric; Shakir, Khalid; Garimella, Kiran; Fennell, Tim; de Pristo, Mark; Grant, Seth G.N.; Haggarty, Stephen; Gabriel, Stacey; Scolnick, Edward M.; Lander, Eric S.; Hultman, Christina; Sullivan, Patrick F.; McCarroll, Steven A.; Sklar, Pamela
2014-01-01
By analyzing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we have demonstrated a polygenic burden primarily arising from rare (<1/10,000), disruptive mutations distributed across many genes. Especially enriched genesets included the voltage-gated calcium ion channel and the signaling complex formed by the activity-regulated cytoskeleton-associated (ARC) scaffold protein of the postsynaptic density (PSD), sets previously implicated by genome-wide association studies (GWAS) and copy-number variation (CNV) studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) were enriched for case mutations. No individual gene-based test achieved significance after correction for multiple testing and we did not detect any alleles of moderately low frequency (~0.5-1%) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene mapping paradigms in neuropsychiatric disease. PMID:24463508
Tank, David C.
2016-01-01
Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or—more importantly—recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500bp for 349 samples. PMID:26828929
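The consensus step described above (with or without ambiguities) can be sketched as a column-wise vote over pre-aligned reads for one locus in one sample. This is only an illustrative sketch and not the authors' pipeline: the function name, the `min_fraction` threshold, and the assumption that reads are already aligned and of equal length are all hypothetical. It also checks the 48 x 48 = 2,304 amplicon count quoted in the abstract.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus(reads, min_fraction=0.25, ambiguities=True):
    """Column-wise consensus of equal-length, pre-aligned reads.

    With ambiguities=True, every base reaching `min_fraction` of a column is
    kept and encoded as an IUPAC code; otherwise the most common base wins.
    Columns with no A/C/G/T (e.g. all gaps) are reported as "N".
    """
    if not reads:
        raise ValueError("at least one read is required")
    if any(len(r) != len(reads[0]) for r in reads):
        raise ValueError("reads must be aligned to the same length")
    out = []
    for column in zip(*reads):
        counts = Counter(b.upper() for b in column if b.upper() in ("A", "C", "G", "T"))
        if not counts:
            out.append("N")
        elif ambiguities:
            total = sum(counts.values())
            kept = frozenset(b for b, n in counts.items() if n / total >= min_fraction)
            out.append(IUPAC.get(kept, "N"))
        else:
            out.append(counts.most_common(1)[0][0])
    return "".join(out)


# Amplicons per Access Array as stated in the abstract: 48 targets x 48 samples.
amplicons_per_array = 48 * 48
print(amplicons_per_array)

# Made-up reads from a heterozygous site (C/T at the second position).
het_reads = ["ACGT", "ACGT", "ATGT", "ATGT"]
print(consensus(het_reads))
```

An ambiguity-coded consensus only flags heterozygous positions; recovering the separate alleles, as the abstract describes, additionally requires phasing the variants that co-occur on the same reads.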
Illusory conjunctions of pitch and duration in unfamiliar tone sequences.
Thompson, W F; Hall, M D; Pressing, J
2001-02-01
In 3 experiments, the authors examined short-term memory for pitch and duration in unfamiliar tone sequences. Participants were presented a target sequence consisting of 2 tones (Experiment 1) or 7 tones (Experiments 2 and 3) and then a probe tone. Participants indicated whether the probe tone matched 1 of the target tones in both pitch and duration. Error rates were relatively low if the probe tone matched 1 of the target tones or if it differed from target tones in pitch, duration, or both. Error rates were remarkably high, however, if the probe tone combined the pitch of 1 target tone with the duration of a different target tone. The results suggest that illusory conjunctions of these dimensions frequently occur. A mathematical model is presented that accounts for the relative contribution of pitch errors, duration errors, and illusory conjunctions of pitch and duration.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2007-12-11
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
1999-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2002-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2010-11-09
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.
2000-01-01
The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.