Sample records for targeted sequencing identified

  1. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
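    The hallmarks listed in this abstract (reproducibility across biological replicates, and filtering against an unselected, amplified library) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the enrichment threshold, the pseudo-frequency, and the toy peptide reads are all assumptions made for illustration.

```python
from collections import Counter


def frequencies(reads):
    """Convert a list of peptide reads into relative frequencies."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def candidate_binders(replicates, unselected, min_enrichment=10.0, pseudo=1e-6):
    """Keep sequences seen in every selected replicate and enriched
    relative to the unselected, amplified library.

    min_enrichment and pseudo are hypothetical tuning values.
    """
    rep_freqs = [frequencies(r) for r in replicates]
    ref_freq = frequencies(unselected)
    # Hallmark 1: the sequence must be present in all biological replicates.
    shared = set.intersection(*(set(f) for f in rep_freqs))
    hits = {}
    for seq in shared:
        mean_freq = sum(f[seq] for f in rep_freqs) / len(rep_freqs)
        # Hallmark 2: sequences that are also abundant without selection
        # (likely "parasitic" fast amplifiers) get a low enrichment score.
        enrichment = mean_freq / (ref_freq.get(seq, 0.0) + pseudo)
        if enrichment >= min_enrichment:
            hits[seq] = enrichment
    return hits


# Toy reads (made-up peptides): "AAA" is selected, "PPP" amplifies on its own.
replicates = [["AAA"] * 8 + ["PPP"] * 2, ["AAA"] * 7 + ["PPP"] * 3]
unselected = ["PPP"] * 5 + ["QQQ"] * 5
hits = candidate_binders(replicates, unselected)
```

    In this toy data only "AAA" survives, because "PPP" is already abundant in the unselected library. Motif discovery across the surviving sequences, which the abstract performs with a dedicated tool, is outside the scope of this sketch.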

  2. Intravenous phage display identifies peptide sequences that target the burn-injured intestine.

    PubMed

    Costantini, Todd W; Eliceiri, Brian P; Putnam, James G; Bansal, Vishal; Baird, Andrew; Coimbra, Raul

    2012-11-01

    The injured intestine is responsible for significant morbidity and mortality after severe trauma and burn; however, targeting the intestine with therapeutics aimed at decreasing injury has proven difficult. We hypothesized that we could use intravenous phage display technology to identify peptide sequences that target the injured intestinal mucosa in a murine model, and then confirm the cross-reactivity of this peptide sequence with ex vivo human gut. Four hours following 30% TBSA burn we performed an in vivo, intravenous systemic administration of phage library containing 10^12 phage in BALB/c mice to biopan for gut-targeting peptides. In vivo assessment of the candidate peptide sequences identified after 4 rounds of internalization was performed by injecting 1×10^12 copies of each selected phage clone into sham or burned animals. Internalization into the gut was assessed using quantitative polymerase chain reaction. We then incubated this gut-targeting peptide sequence with human intestine and visualized fluorescence using confocal microscopy. We identified 3 gut-targeting peptide sequences which caused collapse of the phage library (4-1: SGHQLLLNKMP, 4-5: ILANDLTAPGPR, 4-11: SFKPSGLPAQSL). Sequence 4-5 was internalized into the intestinal mucosa of burned animals 9.3-fold higher than sham animals injected with the same sequence (2.9×10^5 vs. 3.1×10^4 particles per mg tissue). Sequences 4-1 and 4-11 were both internalized into the gut, but did not demonstrate specificity for the injured mucosa. Phage sequence 4-11 demonstrated cross-reactivity with human intestine. In the future, this gut-targeting peptide sequence could serve as a platform for the delivery of biotherapeutics. Copyright © 2012 Elsevier Inc. All rights reserved.

  3. Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

    PubMed Central

    Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

    2005-01-01

    Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
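    The idea of distinguishing sequences by presence/absence of shared sub-sequences can be sketched with a simple greedy heuristic. This is an illustrative assumption, not the method developed in the paper, and a greedy choice does not guarantee a minimal set; the function names and the four toy sequences are hypothetical.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of sub-sequences of length k found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_distinguishing_kmers(seqs, k):
    """Greedily pick k-mers whose presence/absence separates all sequences.

    Returns (chosen k-mers, pairs of sequence indices left unresolved).
    """
    presence = {}  # k-mer -> indices of the sequences that contain it
    for idx, s in enumerate(seqs):
        for km in kmers(s, k):
            presence.setdefault(km, set()).add(idx)
    unresolved = set(combinations(range(len(seqs)), 2))
    chosen = []
    if not presence:
        return chosen, unresolved

    def split_count(km):
        have = presence[km]
        return sum((a in have) != (b in have) for a, b in unresolved)

    while unresolved:
        # Prefer the k-mer that separates the most still-confused pairs.
        best = max(sorted(presence), key=split_count)
        if split_count(best) == 0:
            break  # no k-mer of this length separates the remaining pairs
        chosen.append(best)
        have = presence[best]
        unresolved = {(a, b) for a, b in unresolved if (a in have) == (b in have)}
    return chosen, unresolved


# Four toy sequences: two presence/absence characters suffice (2^2 = 4).
seqs = ["AAAC", "AAGT", "CCGT", "CCAC"]
chosen, unresolved = greedy_distinguishing_kmers(seqs, 2)
signatures = {tuple(km in s for km in chosen) for s in seqs}
```

    Here each chosen sub-sequence splits the set roughly in half, which mirrors the binary, key-like strategy the abstract describes; one specific probe per target would instead need one sub-sequence per sequence.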

  4. Accurate and exact CNV identification from targeted high-throughput sequence data.

    PubMed

    Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

    2011-04-12

    Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
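    The first stage described here, scanning for gains and losses by comparing normalized coverage between samples, can be sketched as follows. This is a simplified illustration, not the published method: the per-sample normalization, the median baseline, the 0.7 and 1.3 ratio thresholds, and all names are assumptions, and the breakpoint-spanning confirmation step is not covered.

```python
from statistics import median


def normalize(coverage):
    """Scale per-target read depth by the sample's total depth."""
    total = sum(coverage)
    return [c / total for c in coverage]


def call_gains_losses(sample, references, loss=0.7, gain=1.3):
    """Compare one sample's normalized coverage with the median of reference
    samples, target by target. loss and gain are hypothetical cutoffs.
    """
    s = normalize(sample)
    refs = [normalize(r) for r in references]
    calls = []
    for i, value in enumerate(s):
        baseline = median(r[i] for r in refs)
        if baseline == 0:
            calls.append((i, None, "no_data"))
            continue
        ratio = value / baseline
        if ratio <= loss:
            state = "loss"
        elif ratio >= gain:
            state = "gain"
        else:
            state = "neutral"
        calls.append((i, ratio, state))
    return calls


# Toy depths for four targets; the sample has half the depth at target 1.
references = [[100, 100, 100, 100], [200, 200, 200, 200]]
sample = [100, 50, 100, 100]
calls = call_gains_losses(sample, references)
states = [state for _, _, state in calls]
```

    With these toy depths only the second target is flagged as a loss. Real data would need more careful normalization (for example for GC content or capture efficiency), which this sketch deliberately omits.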

  5. Phage display selection of peptides that target calcium-binding proteins.

    PubMed

    Vetter, Stefan W

    2013-01-01

    Phage display allows rapid identification of peptide sequences with binding affinity towards target proteins, for example, calcium-binding proteins (CBPs). Phage technology allows screening of 10^9 or more independent peptide sequences and can identify CBP-binding peptides within 2 weeks. Adjusting the screening conditions allows selection of CBP-binding peptides that are either calcium-dependent or calcium-independent. The peptide sequences obtained can be used to identify CBP target proteins based on sequence homology or to quickly obtain peptide-based CBP inhibitors to modulate CBP-target interactions. The protocol described here uses a commercially available phage display library, in which random 12-mer peptides are displayed on filamentous M13 phages. The library was screened against the calcium-binding protein S100B.

  6. A tale of two sequences: microRNA-target chimeric reads.

    PubMed

    Broughton, James P; Pasquinelli, Amy E

    2016-04-04

    In animals, a functional interaction between a microRNA (miRNA) and its target RNA requires only partial base pairing. The limited number of base pair interactions required for miRNA targeting provides miRNAs with broad regulatory potential and also makes target prediction challenging. Computational approaches to target prediction have focused on identifying miRNA target sites based on known sequence features that are important for canonical targeting and may miss non-canonical targets. Current state-of-the-art experimental approaches, such as CLIP-seq (cross-linking immunoprecipitation with sequencing), PAR-CLIP (photoactivatable-ribonucleoside-enhanced CLIP), and iCLIP (individual-nucleotide resolution CLIP), require inference of which miRNA is bound at each site. Recently, the development of methods to ligate miRNAs to their target RNAs during the preparation of sequencing libraries has provided a new tool for the identification of miRNA target sites. The chimeric, or hybrid, miRNA-target reads that are produced by these methods unambiguously identify the miRNA bound at a specific target site. The information provided by these chimeric reads has revealed extensive non-canonical interactions between miRNAs and their target mRNAs, and identified many novel interactions between miRNAs and noncoding RNAs.

  7. GRIL-seq provides a method for identifying direct targets of bacterial small regulatory RNA by in vivo proximity ligation.

    PubMed

    Han, Kook; Tjaden, Brian; Lory, Stephen

    2016-12-22

    The first step in the post-transcriptional regulatory function of most bacterial small non-coding RNAs (sRNAs) is base pairing with partially complementary sequences of targeted transcripts. We present a simple method for identifying sRNA targets in vivo and defining processing sites of the regulated transcripts. The technique, referred to as global small non-coding RNA target identification by ligation and sequencing (GRIL-seq), is based on preferential ligation of sRNAs to the ends of base-paired targets in bacteria co-expressing T4 RNA ligase, followed by sequencing to identify the chimaeras. In addition to the RNA chaperone Hfq, the GRIL-seq method depends on the activity of the pyrophosphorylase RppH. Using PrrF1, an iron-regulated sRNA in Pseudomonas aeruginosa, we demonstrated that direct regulatory targets of this sRNA can readily be identified. Therefore, GRIL-seq represents a powerful tool not only for identifying direct targets of sRNAs in a variety of environments, but also for uncovering novel roles for sRNAs and their targets in complex regulatory networks.

  8. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    DOE PAGES

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; ...

    2014-09-01

    Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  9. TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

    PubMed

    Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

    2018-04-11

    Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. 
TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions. It also allows the definition of sequence length and sequence variability of the target region as well as the less variable flanking regions for tailoring to MPS platforms. As shown in this study, TIA can be used to discover identity-linked SNP islands within the human genome, useful for differentiating individuals by targeted resequencing on MPS technologies.

  10. Targeted therapy according to next generation sequencing-based panel sequencing.

    PubMed

    Saito, Motonobu; Momma, Tomoyuki; Kono, Koji

    2018-04-17

    Targeted therapy against actionable gene mutations shows a significantly higher response rate as well as longer survival compared to conventional chemotherapy, and has become a standard therapy for many cancers. Recent progress in next-generation sequencing (NGS) has made it possible to identify a huge number of genetic aberrations. Based on sequencing results, patients are recommended to undergo targeted therapy or immunotherapy. In cases where there are no approved drugs available for the genetic mutations detected in a patient, it is recommended to facilitate registration for clinical trials. For that purpose, NGS-based sequencing panels that can simultaneously target multiple genes in a single investigation have been used in daily clinical practice. To date, various types of sequencing panels have been developed to investigate genetic aberrations with tumor somatic genome variants (gain-of-function or loss-of-function mutations, high-level copy number alterations, and gene fusions) through comprehensive bioinformatics. Because sequencing panels are efficient and cost-effective, they are quickly being adopted outside the lab, in hospitals and clinics, in order to identify personalized targeted therapy for individual cancer patients.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa

    Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  12. Mining, identification and function analysis of microRNAs and target genes in peanut (Arachis hypogaea L.).

    PubMed

    Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua

    2017-02-01

    In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified from 58 miRNAs with Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. Fifty-five miRNA sequences were verified with a poly(A) tailing test, a success rate of up to 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression level of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 groups of miRNAs and targets were found to be consistent with the model of miRNAs negatively controlling the expression of their target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  13. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  14. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  15. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  16. Design of the hairpin ribozyme for targeting specific RNA sequences.

    PubMed

    Hampel, A; DeYoung, M B; Galasinski, S; Siwkowski, A

    1997-01-01

    The following steps should be taken when designing the hairpin ribozyme to cleave a specific target sequence: 1. Select a target sequence containing BN*GUC where B is C, G, or U. 2. Select the target sequence in areas least likely to have extensive interfering structure. 3. Design the conventional hairpin ribozyme as shown in Fig. 1, such that it can form a 4 bp helix 2 and helix 1 lengths up to 10 bp. 4. Synthesize this ribozyme from single-stranded DNA templates with a double-stranded T7 promoter. 5. Prepare a series of short substrates capable of forming a range of helix 1 lengths of 5-10 bp. 6. Identify these by direct RNA sequencing. 7. Assay the extent of cleavage of each substrate to identify the optimal length of helix 1. 8. Prepare the hairpin tetraloop ribozyme to determine if catalytic efficiency can be improved.
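    Step 1 above, selecting a target containing BN*GUC, can be sketched as a simple motif scan. This is an illustrative helper rather than part of the published protocol: the function name and the example sequence are hypothetical, and the sketch assumes the asterisk marks the cleavage position, so it is not part of the matched string.

```python
import re

# B = C, G, or U; N = any base; followed by GUC.
# A lookahead is used so that overlapping candidate sites are all reported.
SITE = re.compile(r"(?=([CGU][ACGU]GUC))")


def find_target_sites(rna):
    """Return (0-based start, matched BNGUC string) for each candidate site.

    DNA-style input is accepted: T is converted to U before scanning.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SITE.finditer(rna)]


# Made-up RNA: "CAGUC" at position 2 qualifies; "AUGUC" does not (B cannot be A).
sites = find_target_sites("AACAGUCUUAUGUC")
```

    A scan like this only addresses the sequence requirement; step 2 (avoiding regions with extensive interfering structure) would need a separate structure assessment.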

  17. Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

    PubMed Central

    Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-01-01

    Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039

  18. UniDrug-target: a computational tool to identify unique drug targets in pathogenic bacteria.

    PubMed

    Chanumolu, Sree Krishna; Rout, Chittaranjan; Chauhan, Rajinder S

    2012-01-01

    Targeting conserved proteins of bacteria through antibacterial medications has resulted in both the development of resistant strains and changes to human health by destroying beneficial microbes which eventually become breeding grounds for the evolution of resistances. Despite the availability of more than 800 genome sequences, 430 pathways, 4743 enzymes, 9257 metabolic reactions and three-dimensional (3D) protein structures in bacteria, no pathogen-specific computational drug target identification tool has been developed. A web server, UniDrug-Target, which combines bacterial biological information and computational methods to stringently identify pathogen-specific proteins as drug targets, has been designed. Besides predicting pathogen-specific proteins' essentiality, chokepoint property, etc., three new algorithms were developed and implemented by using protein sequences, domains, structures, and metabolic reactions for construction of partial metabolic networks (PMNs), determination of conservation in critical residues, and variation analysis of residues forming similar cavities in protein sequences. First, PMNs are constructed to determine the extent of disturbances in metabolite production caused by targeting a protein as a drug target. Second, conservation of a pathogen-specific protein's critical residues involved in cavity formation and biological function is determined at the domain level with low-matching sequences. Last, variation analysis of residues forming similar cavities in protein sequences from pathogenic versus non-pathogenic bacteria and humans is performed. The server is capable of predicting drug targets for any sequenced pathogenic bacteria having FASTA sequences and annotated information. The utility of the UniDrug-Target server was demonstrated for Mycobacterium tuberculosis (H37Rv). UniDrug-Target identified 265 mycobacteria pathogen-specific proteins, including 17 essential proteins which can be potential drug targets.
UniDrug-Target is expected to accelerate pathogen-specific drug target identification, which will increase the success and durability of such targets, as drugs developed against them have less chance of leading to resistance and of having an adverse impact on the environment. The server is freely available at http://117.211.115.67/UDT/main.html. The standalone application (source codes) is available at http://www.bioinformatics.org/ftp/pub/bioinfojuit/UDT.rar.

  19. Next-generation sequencing for targeted discovery of rare mutations in rice

    USDA-ARS's Scientific Manuscript database

    Advances in DNA sequencing (i.e., next-generation sequencing, NGS) have greatly increased the power and efficiency of detecting rare mutations in large mutant populations. Targeting Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach for identifying gene mutations resulting fro...

  20. Dubinett - Targeted Sequencing 2012 — EDRN Public Portal

    Cancer.gov

    We propose to use targeted massively parallel DNA sequencing to identify somatic alterations within mutational hotspots in matched sets of primary lung tumors, premalignant lesions, and adjacent, histologically normal lung tissue.

  1. Labeled nucleotide phosphate (NP) probes

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2009-02-03

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  2. TP53, PIK3CA, FBXW7 and KRAS Mutations in Esophageal Cancer Identified by Targeted Sequencing.

    PubMed

    Zheng, Huili; Wang, Yan; Tang, Chuanning; Jones, Lindsey; Ye, Hua; Zhang, Guangchun; Cao, Weihai; Li, Jingwen; Liu, Lifeng; Liu, Zhencong; Zhang, Chao; Lou, Feng; Liu, Zhiyuan; Li, Yangyang; Shi, Zhenfen; Zhang, Jingbo; Zhang, Dandan; Sun, Hong; Dong, Haichao; Dong, Zhishou; Guo, Baishuai; Yan, H E; Lu, Qingyu; Huang, Xue; Chen, Si-Yi

    2016-01-01

    Esophageal cancer (EC) is a common malignancy with significant morbidity and mortality. As individual cancers exhibit unique mutation patterns, identifying and characterizing gene mutations in EC that may serve as biomarkers might help predict patient outcome and guide treatment. Traditionally, personalized cancer DNA sequencing was impractical and expensive. Recent technological advancements have made targeted DNA sequencing more cost- and time-effective with reliable results. This technology may be useful for clinicians to direct patient treatment. The Ion PGM and AmpliSeq Cancer Panel were used to identify mutations at 737 hotspot loci of 45 cancer-related genes in 64 EC samples from Chinese patients. Frequent mutations were found in TP53 and less frequent mutations in PIK3CA, FBXW7 and KRAS. These results demonstrate that targeted sequencing can reliably identify mutations in individual tumors, making this technology a possibility for clinical use. Copyright © 2016, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.

  3. TargetLink, a new method for identifying the endogenous target set of a specific microRNA in intact living cells.

    PubMed

    Xu, Yan; Chen, Yan; Li, Daliang; Liu, Qing; Xuan, Zhenyu; Li, Wen-Hong

    2017-02-01

    MicroRNAs are small non-coding RNAs acting as posttranscriptional repressors of gene expression. Identifying mRNA targets of a given miRNA remains an outstanding challenge in the field. We have developed a new experimental approach, TargetLink, that applies locked nucleic acid (LNA) as the affinity probe to enrich target genes of a specific microRNA in intact cells. TargetLink also comprises a rigorous and systematic data analysis pipeline to identify target genes by comparing LNA-enriched sequences between experimental and control samples. Using miR-21 as a test microRNA, we identified 12 target genes of miR-21 in a human colorectal cancer cell by this approach. The majority of the identified targets interacted with miR-21 via imperfect seed pairing. Target validation confirmed that miR-21 repressed the expression of the identified targets. The cellular abundance of the identified miR-21 target transcripts varied over a wide range, with some targets expressed at a rather low level, confirming that both abundant and rare transcripts are susceptible to regulation by microRNAs, and that TargetLink is an efficient approach for identifying the target set of a specific microRNA in intact cells. C20orf111, one of the novel targets identified by TargetLink, was found to reside in the nuclear speckle and to be reliably repressed by miR-21 through the interaction at its coding sequence.

  4. Prospective identification of parasitic sequences in phage display screens

    PubMed Central

    Matochko, Wadim L.; Cory Li, S.; Tang, Sindy K.Y.; Derda, Ratmir

    2014-01-01

    Phage display empowered the development of proteins with new function and ligands for clinically relevant targets. In this report, we use next-generation sequencing to analyze phage-displayed libraries and uncover a strong bias induced by amplification preferences of phage in bacteria. This bias favors fast-growing sequences that collectively constitute <0.01% of the available diversity. Specifically, a library of 10^9 random 7-mer peptides (Ph.D.-7) includes a few thousand sequences that grow quickly (the ‘parasites’), which are the sequences that are typically identified in phage display screens published to date. A similar collapse was observed in other libraries. Using Illumina and Ion Torrent sequencing and multiple biological replicates of amplification of Ph.D.-7 library, we identified a focused population of 770 ‘parasites’. In all, 197 sequences from this population have been identified in literature reports that used Ph.D.-7 library. Many of these enriched sequences have confirmed function (e.g. target binding capacity). The bias in the literature, thus, can be viewed as a selection with two different selection pressures: (i) target-binding selection, and (ii) amplification-induced selection. Enrichment of parasitic sequences could be minimized if amplification bias is removed. Here, we demonstrate that emulsion amplification in libraries of ∼10^6 diverse clones prevents the biased selection of parasitic clones. PMID:24217917

  5. RNAi screen for rapid therapeutic target identification in leukemia patients

    PubMed Central

    Tyner, Jeffrey W.; Deininger, Michael W.; Loriaux, Marc M.; Chang, Bill H.; Gotlib, Jason R.; Willis, Stephanie G.; Erickson, Heidi; Kovacsovics, Tibor; O'Hare, Thomas; Heinrich, Michael C.; Druker, Brian J.

    2009-01-01

    Targeted therapy has vastly improved outcomes in certain types of cancer. Extension of this paradigm across a broad spectrum of malignancies will require an efficient method to determine the molecular vulnerabilities of cancerous cells. Improvements in sequencing technology will soon enable high-throughput sequencing of entire genomes of cancer patients; however, determining the relevance of identified sequence variants will require complementary functional analyses. Here, we report an RNAi-assisted protein target identification (RAPID) technology that individually assesses targeting of each member of the tyrosine kinase gene family. We demonstrate that RAPID screening of primary leukemia cells from 30 patients identifies targets that are critical to survival of the malignant cells from 10 of these individuals. We identify known, activating mutations in JAK2 and K-RAS, as well as patient-specific sensitivity to down-regulation of FLT1, CSF1R, PDGFR, ROR1, EPHA4/5, JAK1/3, LMTK3, LYN, FYN, PTK2B, and N-RAS. We also describe a previously undescribed, somatic, activating mutation in the thrombopoietin receptor that is sensitive to down-stream pharmacologic inhibition. Hence, the RAPID technique can quickly identify molecular vulnerabilities in malignant cells. Combination of this technique with whole-genome sequencing will represent an ideal tool for oncogenic target identification such that specific therapies can be matched with individual patients. PMID:19433805

  6. Identification of miRNAs and their targets in wild tomato at moderately and acutely elevated temperatures by high-throughput sequencing and degradome analysis

    PubMed Central

    Zhou, Rong; Wang, Qian; Jiang, Fangling; Cao, Xue; Sun, Mintao; Liu, Min; Wu, Zhen

    2016-01-01

    MicroRNAs (miRNAs) are 19–24 nucleotide (nt) noncoding RNAs that play important roles in abiotic stress responses in plants. High temperatures have been the subject of considerable attention due to their negative effects on plant growth and development. Heat-responsive miRNAs have been identified in some plants. However, there have been no reports on the global identification of miRNAs and their targets in tomato at high temperatures, especially at different elevated temperatures. Here, three small-RNA libraries and three degradome libraries were constructed from the leaves of the heat-tolerant tomato at normal, moderately and acutely elevated temperatures (26/18 °C, 33/33 °C and 40/40 °C, respectively). Following high-throughput sequencing, 662 conserved and 97 novel miRNAs were identified in total with 469 conserved and 91 novel miRNAs shared in the three small-RNA libraries. Of these miRNAs, 96 and 150 miRNAs were responsive to the moderately and acutely elevated temperature, respectively. Following degradome sequencing, 349 sequences were identified as targets of 138 conserved miRNAs, and 13 sequences were identified as targets of eight novel miRNAs. The expression levels of seven miRNAs and six target genes obtained by quantitative real-time PCR (qRT-PCR) were largely consistent with the sequencing results. This study enriches the number of heat-responsive miRNAs and lays a foundation for the elucidation of the miRNA-mediated regulatory mechanism in tomatoes at elevated temperatures. PMID:27653374

  7. RISC RNA sequencing for context-specific identification of in vivo microRNA targets.

    PubMed

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2011-01-07

    MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. The objective was to develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1645 mRNAs consistently targeted to mouse cardiac RISCs. We used this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing "seed" sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context and is applicable to any tissue and any disease state.

  8. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

    PubMed

    Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

    2018-03-01

    Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

  9. Discovery of Influenza A Virus Sequence Pairs and Their Combinations for Simultaneous Heterosubtypic Targeting that Hedge against Antiviral Resistance

    PubMed Central

    Lin, Jing; Pramono, Zacharias Aloysius Dwi; Maurer-Stroh, Sebastian

    2016-01-01

    The multiple circulating human influenza A virus subtypes, coupled with perpetual genomic mutations and segment reassortment events, challenge the development of effective therapeutics. The capacity to drug most RNAs motivates the investigation of viral RNA targets. 123,060 segment sequences from 35,938 strains of the most prevalent subtypes also infecting humans (H1N1, 2009 pandemic H1N1, H3N2, H5N1 and H7N9) were used to identify 1,183 conserved RNA target sequences (≥15-mer) in the internal segments. 100% theoretical coverage in simultaneous heterosubtypic targeting is achieved by pairing specific sequences from the same segment (“Duals”) or from two segments (“Doubles”); 1,662 Duals and 28,463 Doubles were identified. By combining specific Duals and/or Doubles to form a target graph wherein an edge connecting two vertices (target sequences) represents a Dual or Double, it is possible to hedge against antiviral resistance while maintaining 100% heterosubtypic coverage. To evaluate the hedging potential, we define the hedge-factor as the minimum number of resistant target sequences that will render the graph resistant, i.e. eliminate all the edges therein; a target sequence or a graph is considered resistant when it cannot achieve 100% heterosubtypic coverage. In an n-vertex graph (n ≥ 3), the hedge-factor is maximal (= n − 1) when it is a complete graph, i.e. every distinct pair in the graph is either a Dual or a Double. Computational analyses uncover an extensive number of complete graphs of different sizes. Monte Carlo simulations show that the mutation counts and time elapsed for a target graph to become resistant increase with the hedge-factor. Incidentally, target sequences which were reported to reduce virus titre in experiments are included in our target graphs. The identity of target sequence pairs for heterosubtypic targeting and their combinations for hedging antiviral resistance are useful toolkits to construct target graphs for different therapeutic objectives. PMID:26771381
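
    The hedge-factor defined in the abstract above can be illustrated with a small brute-force sketch. It is not the authors' code. It assumes that an edge (a Dual or Double) is eliminated as soon as either of its two target sequences becomes resistant, an interpretation that is consistent with the stated maximum of n − 1 for a complete graph. The function names `hedge_factor` and `complete_graph` are hypothetical.

```python
from itertools import combinations


def hedge_factor(vertices, edges):
    """Smallest number of vertices whose loss eliminates every edge.

    Assumption (not stated explicitly in the abstract): an edge disappears
    as soon as either of its endpoints becomes resistant. Brute force, so
    only suitable for small illustrative graphs.
    """
    edge_sets = [frozenset(edge) for edge in edges]
    if not edge_sets:
        return 0  # a graph without edges is already resistant
    for size in range(1, len(vertices) + 1):
        for lost in combinations(vertices, size):
            lost_set = set(lost)
            # Every edge must touch at least one resistant vertex.
            if all(edge & lost_set for edge in edge_sets):
                return size
    return len(vertices)


def complete_graph(n):
    """Vertices 0..n-1 with an edge between every distinct pair."""
    vertices = list(range(n))
    return vertices, list(combinations(vertices, 2))


if __name__ == "__main__":
    vertices, edges = complete_graph(5)
    print(hedge_factor(vertices, edges))  # complete graph: n - 1
    print(hedge_factor(["a", "b", "c"], [("a", "b"), ("b", "c")]))  # path graph
```

    Under this reading, a path of three target sequences is defeated by a single resistant sequence (the middle one), whereas a complete graph on the same number of vertices needs all but one to fail.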

  10. Microfluidic droplet enrichment for targeted sequencing

    PubMed Central

    Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

    2015-01-01

    Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629

  11. Nucleic acid analysis using terminal-phosphate-labeled nucleotides

    DOEpatents

    Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

    2008-04-22

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  12. Identifying transposon insertions and their effects from RNA-sequencing data.

    PubMed

    de Ruiter, Julian R; Kas, Sjors M; Schut, Eva; Adams, David J; Koudijs, Marco J; Wessels, Lodewyk F A; Jonkers, Jos

    2017-07-07

    Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Identification of MicroRNA Targets of Capsicum spp. Using MiRTrans—a Trans-Omics Approach

    PubMed Central

    Zhang, Lu; Qin, Cheng; Mei, Junpu; Chen, Xiaocui; Wu, Zhiming; Luo, Xirong; Cheng, Jiaowen; Tang, Xiangqun; Hu, Kailin; Li, Shuai C.

    2017-01-01

    MicroRNAs (miRNAs) can regulate the transcripts that are involved in eukaryotic cell proliferation, differentiation, and metabolism. Especially for plants, our understanding of miRNA targets is still limited. Early attempts at prediction based on sequence alignments have been plagued by enormous numbers of false positives. Target prediction specificity can be improved by incorporating other data sources, such as the dependency between miRNA and transcript expression or even transcripts cleaved through miRNA regulation, which are referred to as trans-omics data. In this paper, we developed MiRTrans (Prediction of MiRNA targets by Trans-omics data) to explore miRNA targets by incorporating miRNA sequencing, transcriptome sequencing, and degradome sequencing. MiRTrans consisted of three major steps. First, the target transcripts of miRNAs were predicted by scrutinizing their sequence characteristics and collected as an initial pool of potential targets. Second, false positive targets were eliminated if the expression of the miRNA and its targets was weakly correlated by lasso regression. Third, degradome sequencing was utilized to capture the miRNA targets by examining the cleaved transcripts that are regulated by miRNAs. Finally, the predicted targets from the second and third steps were combined by Fisher's combination test. MiRTrans was applied to identify the miRNA targets for Capsicum spp. (i.e., pepper). It can generate more functional miRNA targets than sequence-based predictions, as evaluated by functional enrichment. MiRTrans identified 58 miRNA-transcript pairs with high confidence from 18 miRNA families conserved in eudicots. Most of these targets were transcription factors; this lent support to the role of miRNA as a key regulator in pepper. To the best of our knowledge, this work is the first attempt to investigate the miRNA targets of pepper, as well as their regulatory networks. Surprisingly, only a small proportion of miRNA-transcript pairs were shared between degradome sequencing and expression dependency predictions, suggesting that miRNA targets predicted by a single technology alone may be prone to report false negatives. PMID:28443105
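
    The final combination step mentioned in the abstract above, Fisher's combination test, can be sketched in a few lines. This is a generic implementation of Fisher's method and not MiRTrans code. How MiRTrans derives the per-step p-values is not described in this record, so the inputs in the usage lines are made-up illustrative numbers, and the function name `fisher_combined_p` is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0   # current term (X/2)^i / i!, starting at i = 0
    total = 1.0  # running sum of the terms
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Illustrative inputs only: one p-value from an expression-dependency
    # step and one from a degradome step for the same miRNA-transcript pair.
    print(fisher_combined_p([0.05, 0.05]))
    print(fisher_combined_p([0.20, 0.01]))
```

    Two moderately small p-values combine into a smaller one, which is why pairs supported by both lines of evidence rank highest. The method assumes the combined tests are independent.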

  14. Screening of broad spectrum natural pesticides against conserved target arginine kinase in cotton pests by molecular modeling.

    PubMed

    Sakthivel, Seethalakshmi; Habeeb, S K M; Raman, Chandrasekar

    2018-03-12

    Cotton is an economically important crop, and its production is challenged by the diversity of pests and related insecticide resistance. Identifying a target conserved across cotton pests will help in designing broad-spectrum insecticides. In this study, we have identified conserved sequences by Expressed Sequence Tag profiling from three cotton pests, namely Aphis gossypii, Helicoverpa armigera, and Spodoptera exigua. One target protein, arginine kinase, which has a key role in insect physiology and energy metabolism, was studied further using homology modeling, virtual screening, molecular docking, and molecular dynamics simulation to identify potential biopesticide compounds from the Zinc natural database. We have identified four compounds having excellent inhibitor potential against the identified broad spectrum target which are highly specific to invertebrates.

  15. GRIL-Seq, a method for identifying direct targets of bacterial small regulatory RNA by in vivo proximity ligation

    PubMed Central

    Han, Kook; Tjaden, Brian; Lory, Stephen

    2017-01-01

    The first step in the post-transcriptional regulatory function of most bacterial small non-coding RNAs (sRNAs) is base-pairing with partially complementary sequences of targeted transcripts. We present a simple method for identifying sRNA targets in vivo and defining processing sites of the regulated transcripts. The technique (referred to as GRIL-Seq) is based on preferential ligation of sRNAs to ends of base-paired targets in bacteria co-expressing T4 RNA ligase, followed by sequencing to identify the chimeras. In addition to the RNA chaperone Hfq, the GRIL-Seq method depends on the activity of the pyrophosphorylase RppH. Using PrrF1, an iron-regulated sRNA in Pseudomonas aeruginosa, we demonstrate that direct regulatory targets of this sRNA can be readily identified. Therefore, GRIL-Seq represents a powerful tool not only for identifying direct targets of sRNAs in a variety of environments, but can also result in uncovering novel roles for sRNAs and their targets in complex regulatory networks. PMID:28005055

  16. Genetic mutations in human rectal cancers detected by targeted sequencing.

    PubMed

    Bai, Jun; Gao, Jinglong; Mao, Zhijun; Wang, Jianhua; Li, Jianhui; Li, Wensheng; Lei, Yu; Li, Shuaishuai; Wu, Zhuo; Tang, Chuanning; Jones, Lindsey; Ye, Hua; Lou, Feng; Liu, Zhiyuan; Dong, Zhishou; Guo, Baishuai; Huang, Xue F; Chen, Si-Yi; Zhang, Enke

    2015-10-01

    Colorectal cancer (CRC) is widespread with significant mortality. Both inherited and sporadic mutations in various signaling pathways influence the development and progression of the cancer. Identifying genetic mutations in CRC is important for optimal patient treatment and many approaches currently exist to uncover these mutations, including next-generation sequencing (NGS) and commercially available kits. In the present study, we used a semiconductor-based targeted DNA-sequencing approach to sequence and identify genetic mutations in 91 human rectal cancer samples. Analysis revealed frequent mutations in KRAS (58.2%), TP53 (28.6%), APC (16.5%), FBXW7 (9.9%) and PIK3CA (9.9%), and additional mutations in BRAF, CTNNB1, ERBB2 and SMAD4 were also detected at lesser frequencies. Thirty-eight samples (41.8%) also contained two or more mutations, with common combination mutations occurring between KRAS and TP53 (42.1%), and KRAS and APC (31.6%). DNA sequencing for individual cancers is of clinical importance for targeted drug therapy and the advantages of such targeted gene sequencing over other NGS platforms or commercially available kits in sensitivity, cost and time effectiveness may aid clinicians in treating CRC patients in the near future.

  17. Uncovering Small RNA-Mediated Responses to Cold Stress in a Wheat Thermosensitive Genic Male-Sterile Line by Deep Sequencing

    PubMed Central

    Tang, Zhonghui; Zhang, Liping; Xu, Chenguang; Yuan, Shaohua; Zhang, Fengting; Zheng, Yonglian; Zhao, Changping

    2012-01-01

    The male sterility of thermosensitive genic male sterile (TGMS) lines of wheat (Triticum aestivum) is strictly controlled by temperature. The early phase of anther development is especially susceptible to cold stress. MicroRNAs (miRNAs) play an important role in plant development and in responses to environmental stress. In this study, deep sequencing of small RNA (smRNA) libraries obtained from spike tissues of the TGMS line under cold and control conditions identified a total of 78 unique miRNA sequences from 30 families and trans-acting small interfering RNAs (tasiRNAs) derived from two TAS3 genes. To identify smRNA targets in the wheat TGMS line, we applied the degradome sequencing method, which globally and directly identifies the remnants of smRNA-directed target cleavage. We identified 26 targets of 16 miRNA families and three targets of tasiRNAs. Comparing smRNA sequencing data sets and TaqMan quantitative polymerase chain reaction results, we identified six miRNAs and one tasiRNA (tasiRNA-ARF [for Auxin-Responsive Factor]) as cold stress-responsive smRNAs in spike tissues of the TGMS line. We also determined the expression profiles of target genes that encode transcription factors in response to cold stress. Interestingly, the expression of cold stress-responsive smRNAs integrated in the auxin-signaling pathway and their target genes was largely noncorrelated. We investigated the tissue-specific expression of smRNAs using a tissue microarray approach. Our data indicated that miR167 and tasiRNA-ARF play roles in regulating the auxin-signaling pathway and possibly in the developmental response to cold stress. These data provide evidence that smRNA regulatory pathways are linked with male sterility in the TGMS line during cold stress. PMID:22508932

  18. GWASeq: targeted re-sequencing follow up to GWAS.

    PubMed

    Salomon, Matthew P; Li, Wai Lok Sibon; Edlund, Christopher K; Morrison, John; Fortini, Barbara K; Win, Aung Ko; Conti, David V; Thomas, Duncan C; Duggan, David; Buchanan, Daniel D; Jenkins, Mark A; Hopper, John L; Gallinger, Steven; Le Marchand, Loïc; Newcomb, Polly A; Casey, Graham; Marjoram, Paul

    2016-03-03

    For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases and 899 controls. Using marginal tests of association, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.

  19. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment

    PubMed Central

    Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.

    2016-01-01

    Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175

  20. Interactions between the R2R3-MYB Transcription Factor, AtMYB61, and Target DNA Binding Sites

    PubMed Central

    Prouse, Michael B.; Campbell, Malcolm M.

    2013-01-01

    Despite the prominent roles played by R2R3-MYB transcription factors in the regulation of plant gene expression, little is known about the details of how these proteins interact with their DNA targets. For example, while Arabidopsis thaliana R2R3-MYB protein AtMYB61 is known to alter transcript abundance of a specific set of target genes, little is known about the specific DNA sequences to which AtMYB61 binds. To address this gap in knowledge, DNA sequences bound by AtMYB61 were identified using cyclic amplification and selection of targets (CASTing). The DNA targets identified using this approach corresponded to AC elements, sequences enriched in adenosine and cytosine nucleotides. The preferred target sequence that bound with the greatest affinity to AtMYB61 recombinant protein was ACCTAC, the AC-I element. Mutational analyses based on the AC-I element showed that ACC nucleotides in the AC-I element served as the core recognition motif, critical for AtMYB61 binding. Molecular modelling predicted interactions between AtMYB61 amino acid residues and corresponding nucleotides in the DNA targets. The affinity between AtMYB61 and specific target DNA sequences did not correlate with AtMYB61-driven transcriptional activation with each of the target sequences. CASTing-selected motifs were found in the regulatory regions of genes previously shown to be regulated by AtMYB61. Taken together, these findings are consistent with the hypothesis that AtMYB61 regulates transcription from specific cis-acting AC elements in vivo. The results shed light on the specifics of DNA binding by an important family of plant-specific transcriptional regulators. PMID:23741471
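
    The search for CASTing-selected motifs in regulatory regions, mentioned in the abstract above, can be illustrated with a minimal sketch. It scans a sequence for exact matches to the AC-I element (ACCTAC) on both strands. The function names and the example sequence are hypothetical and are not taken from the study; real promoter scans typically allow degenerate positions, which this sketch does not.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "ACCTAC"):
    """List (position, strand) for every exact motif match.

    Positions are 0-based and always refer to the forward strand;
    a "-" hit means the motif lies on the reverse strand at that window.
    """
    seq = seq.upper()
    motif = motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == motif_rc:
            hits.append((i, "-"))
    return hits


# Hypothetical promoter fragment, made up for illustration only.
example_promoter = "TTACCTACGGGTAGGTAA"
print(find_motif(example_promoter))
```

    Reporting the strand separately keeps forward and reverse-strand sites distinguishable, which matters when motif orientation relative to the gene is of interest.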

  1. Next-generation sequencing in schizophrenia and other neuropsychiatric disorders.

    PubMed

    Schreiber, Matthew; Dorschner, Michael; Tsuang, Debby

    2013-10-01

    Schizophrenia is a debilitating lifelong illness that lacks a cure and poses a worldwide public health burden. The disease is characterized by a heterogeneous clinical and genetic presentation that complicates research efforts to identify causative genetic variations. This review examines the potential of current findings in schizophrenia and in other related neuropsychiatric disorders for application in next-generation technologies, particularly whole-exome sequencing (WES) and whole-genome sequencing (WGS). These approaches may lead to the discovery of underlying genetic factors for schizophrenia and may thereby identify and target novel therapeutic targets for this devastating disorder. © 2013 Wiley Periodicals, Inc.

  2. The GENCODE exome: sequencing the complete human exome

    PubMed Central

    Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno

    2011-01-01

    Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695

  3. Modified Cross-Linking, Ligation, and Sequencing of Hybrids (qCLASH) Identifies Kaposi's Sarcoma-Associated Herpesvirus MicroRNA Targets in Endothelial Cells.

    PubMed

    Gay, Lauren A; Sethuraman, Sunantha; Thomas, Merin; Turner, Peter C; Renne, Rolf

    2018-04-15

    Kaposi's sarcoma (KS) tumors are derived from endothelial cells and express Kaposi's sarcoma-associated herpesvirus (KSHV) microRNAs (miRNAs). Although miRNA targets have been identified in B cell lymphoma-derived cells and epithelial cells, little has been done to characterize the KSHV miRNA targetome in endothelial cells. A recent innovation in the identification of miRNA targetomes, cross-linking, ligation, and sequencing of hybrids (CLASH), unambiguously identifies miRNAs and their targets by ligating the two species while both species are still bound within the RNA-induced silencing complex (RISC). We developed a streamlined quick CLASH (qCLASH) protocol that requires a lower cell input than the original method and therefore has the potential to be used on patient biopsy samples. Additionally, we developed a fast-growing, KSHV-negative endothelial cell line derived from telomerase-immortalized vein endothelial long-term culture (TIVE-LTC) cells. qCLASH was performed on uninfected cells and cells infected with either wild-type KSHV or a mutant virus lacking miR-K12-11/11*. More than 1,400 cellular targets of KSHV miRNAs were identified. Many of the targets identified by qCLASH lacked a canonical seed sequence match. Additionally, most target regions in mRNAs originated from the coding DNA sequence (CDS) rather than the 3' untranslated region (UTR). This set of genes includes some that were previously identified in B cells and some new genes that warrant further study. Pathway analysis of endothelial cell targets showed enrichment in cell cycle control, apoptosis, and glycolysis pathways, among others. Characterization of these new targets and the functional consequences of their repression will be important in furthering our understanding of the role of KSHV miRNAs in oncogenesis. IMPORTANCE KS lesions consist of endothelial cells latently infected with KSHV. Cells that make up these lesions express KSHV miRNAs. Identification of the targets of KSHV miRNAs will help us understand their role in viral oncogenesis. The cross-linking and sequencing of hybrids (CLASH) protocol is a method for unambiguously identifying miRNA targetomes. We developed a streamlined version of CLASH, called quick CLASH (qCLASH). qCLASH requires a lower initial input of cells than its parent protocol. Additionally, a new fast-growing KSHV-negative endothelial cell line, named TIVE-EX-LTC cells, was established. qCLASH was performed on TIVE-EX-LTC cells latently infected with wild-type (WT) KSHV or a mutant virus lacking miR-K12-11/11*. A number of novel targets of KSHV miRNAs were identified, including targets of miR-K12-11, the ortholog of the cellular oncogenic miRNA (oncomiR) miR-155. Many of the miRNA targets were involved in processes related to oncogenesis, such as glycolysis, apoptosis, and cell cycle control. Copyright © 2018 American Society for Microbiology.

  4. [Detection of pathogenic mutations in Marfan syndrome by targeted next-generation semiconductor sequencing].

    PubMed

    Lu, Chaoxia; Wu, Wei; Xiao, Jifang; Meng, Yan; Zhang, Shuyang; Zhang, Xue

    2013-06-01

    To detect pathogenic mutations in Marfan syndrome (MFS) using an Ion Torrent Personal Genome Machine (PGM) and to validate the result of targeted next-generation semiconductor sequencing for the diagnosis of genetic disorders. Peripheral blood samples were collected from three MFS patients and a normal control with informed consent. Genomic DNA was isolated by a standard method and then subjected to targeted sequencing using an Ion Ampliseq(TM) Inherited Disease Panel. Three multiplex PCR reactions were carried out to amplify the coding exons of 328 genes including FBN1, TGFBR1 and TGFBR2. DNA fragments from different samples were ligated with barcoded sequencing adaptors. Template preparation, emulsion PCR, and Ion Sphere Particle enrichment were carried out using an Ion One Touch system. The Ion Sphere Particles were sequenced on a 318 chip using the PGM platform. Data from the PGM runs were processed using Ion Torrent Suite 3.2 software to generate sequence reads. After sequence alignment and extraction of SNPs and indels, all the variants were filtered against dbSNP137. DNA sequences were visualized with the Integrative Genomics Viewer. The most likely disease-causing variants were analyzed by Sanger sequencing. PGM sequencing yielded an output of 855.80 Mb, with a median sequencing depth of >100× and coverage of >98% of the targeted regions in all four samples. After data analysis and database filtering, one known missense mutation (p.E1811K) and two novel premature termination mutations (p.E2264X and p.L871FfsX23) in the FBN1 gene were identified in the three MFS patients. All mutations were verified by conventional Sanger sequencing. Pathogenic FBN1 mutations were identified in all three patients with MFS, indicating that targeted next-generation sequencing on PGM sequencers can be applied for accurate and high-throughput testing of genetic disorders.

  5. Identification of tissue-specific targeting peptide

    NASA Astrophysics Data System (ADS)

    Jung, Eunkyoung; Lee, Nam Kyung; Kang, Sang-Kee; Choi, Seung-Hoon; Kim, Daejin; Park, Kisoo; Choi, Kihang; Choi, Yun-Jaie; Jung, Dong Hyun

    2012-11-01

    Using phage display technique, we identified tissue-targeting peptide sets that recognize specific tissues (bone-marrow dendritic cell, kidney, liver, lung, spleen and visceral adipose tissue). In order to rapidly evaluate tissue-specific targeting peptides, we performed machine learning studies for predicting the tissue-specific targeting activity of peptides on the basis of peptide sequence information using four machine learning models and isolated the groups of peptides capable of mediating selective targeting to specific tissues. As a representative liver-specific targeting sequence, the peptide "DKNLQLH" was selected by the sequence similarity analysis. This peptide has a high degree of homology with protein ligands which can interact with corresponding membrane counterparts. We anticipate that our models will be applicable to the prediction of tissue-specific targeting peptides which can recognize the endothelial markers of target tissues.

  6. Identification of a Novel De Novo Heterozygous Deletion in the SOX10 Gene in Waardenburg Syndrome Type II Using Next-Generation Sequencing.

    PubMed

    Li, Haonan; Jin, Peng; Hao, Qian; Zhu, Wei; Chen, Xia; Wang, Ping

    2017-11-01

    Waardenburg syndrome (WS) is a rare autosomal dominant disorder associated with pigmentation abnormalities and sensorineural hearing loss. In this study, we investigated the genetic cause of WSII in a patient and evaluated the reliability of the targeted next-generation exome sequencing method for the genetic diagnosis of WS. Clinical evaluations were conducted on the patient and targeted next-generation sequencing (NGS) was used to identify the candidate genes responsible for WSII. Multiplex ligation-dependent probe amplification (MLPA) and real-time quantitative polymerase chain reaction (qPCR) were performed to confirm the targeted NGS results. Targeted NGS detected the entire deletion of the coding sequence (CDS) of the SOX10 gene in the WSII patient. MLPA results indicated that all exons of the SOX10 heterozygous deletion were detected; no aberrant copy number in the PAX3 and microphthalmia-associated transcription factor (MITF) genes was found. Real-time qPCR results identified the mutation as a de novo heterozygous deletion. This is the first report of using a targeted NGS method for WS candidate gene sequencing; its accuracy was verified by using the MLPA and qPCR methods. Our research provides a valuable method for the genetic diagnosis of WS.

  7. Detection of Somatic Mutations in Gastroenteropancreatic Neuroendocrine Tumors Using Targeted Deep Sequencing.

    PubMed

    Backman, Samuel; Norlén, Olov; Eriksson, Barbro; Skogseid, Britt; Stålberg, Peter; Crona, Joakim

    2017-02-01

    Mutations affecting the mechanistic target of rapamycin (MTOR) signalling pathway are frequent in human cancer and have been identified in up to 15% of pancreatic neuroendocrine tumours (NETs). Grade A evidence supports the efficacy of MTOR inhibition with everolimus in pancreatic NETs. Although a significant proportion of patients experience disease stabilization, only a minority will show objective tumour responses. It has been proposed that genomic mutations resulting in activation of MTOR signalling could be used to predict sensitivity to everolimus. Patients with NETs who underwent treatment with everolimus at our institution were identified, and those with available tumour tissue were selected for further analysis. Targeted next-generation sequencing (NGS) was used to re-sequence 22 genes that were selected on the basis of documented involvement in the MTOR signalling pathway or in the tumourigenesis of gastroenteropancreatic NETs. Radiological responses were documented using Response Evaluation Criteria in Solid Tumours. Six patients were identified; one had a partial response and four had stable disease. Sequencing of tumour tissue resulted in a median sequence depth of 667.1 (range=404-1301) with 1-fold coverage of 95.9-96.5% and 10-fold coverage of 87.6-92.2%. A total of 494 genetic variants were discovered, four of which were identified as pathogenic. All pathogenic variants were validated using Sanger sequencing and were found exclusively in menin 1 (MEN1) and death domain associated protein (DAXX) genes. No mutations in the MTOR pathway-related genes were observed. Targeted NGS is a feasible method with high diagnostic yield for genetic characterization of pancreatic NETs. A potential association between mutations in NETs and response to everolimus should be investigated by future studies. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
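
    Depth and coverage figures of the kind reported above (median depth, 1-fold and 10-fold coverage) can be derived from per-base read depths over the targeted regions. The sketch below is a minimal illustration, assuming the depths are already available as a list of integers; the function name and the toy input are hypothetical, and the study's own pipeline is not described in the abstract.

```python
from statistics import median


def coverage_summary(depths, thresholds=(1, 10)):
    """Summarize per-base depths over a target region.

    Returns the median depth and, for each threshold t, the percentage
    of target bases covered by at least t reads.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    summary = {"median_depth": median(depths)}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * covered / total
    return summary


# Toy per-base depths for a 10-base target (illustrative values only).
toy_depths = [0, 5, 12, 30, 30, 0, 15, 9, 10, 100]
print(coverage_summary(toy_depths))
```

    In practice the per-base depths would come from an alignment file over the panel's target intervals; only the summary arithmetic is shown here.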

  8. PHASTpep: Analysis Software for Discovery of Cell-Selective Peptides via Phage Display and Next-Generation Sequencing

    PubMed Central

    Dasa, Siva Sai Krishna; Kelly, Kimberly A.

    2016-01-01

    Next-generation sequencing has enhanced the phage display process, allowing for the quantification of millions of sequences resulting from the biopanning process. In response, many valuable analysis programs focused on specificity and finding targeted motifs or consensus sequences were developed. For targeted drug delivery and molecular imaging, it is also necessary to find peptides that are selective—targeting only the cell type or tissue of interest. We present a new analysis strategy and accompanying software, PHage Analysis for Selective Targeted PEPtides (PHASTpep), which identifies highly specific and selective peptides. Using this process, we discovered and validated, both in vitro and in vivo in mice, two sequences (HTTIPKV and APPIMSV) targeted to pancreatic cancer-associated fibroblasts that escaped identification using previously existing software. Our selectivity analysis makes it possible to discover peptides that target a specific cell type and avoid other cell types, enhancing clinical translatability by circumventing complications with systemic use. PMID:27186887

  9. Deep sequencing and in silico analysis of small RNA library reveals novel miRNA from leaf Persicaria minor transcriptome.

    PubMed

    Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan

    2018-03-01

    In the current era, the majority of microRNAs (miRNAs) are being discovered through computational approaches, which are largely confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on previously well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of unannotated sequence, representing 11 unique putative miRNA sequences. The PsRobot target prediction tool was used to identify the target transcripts of the putative novel miRNAs. Most of the predicted target transcripts (mRNAs) are known to be involved in plant development and stress responses. Gene ontology analysis showed that the majority of the putative novel miRNA targets were involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Of the 11 unique putative miRNAs, 7 were validated through semi-quantitative PCR. These novel miRNA discoveries in P. minor may help expand and update the current public miRNA database.

  10. Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

    PubMed Central

    Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter

    2017-01-01

    The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. PMID:29048594
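
    The core idea of finding content common to targets and absent from non-targets can be sketched with plain set operations on k-mers. This is a deliberately simplified illustration and not Neptune's algorithm: it requires a k-mer in every target and in no non-target, ignores reverse complements, and omits the mismatch tolerance and probabilistic models the abstract describes. All names and sequences below are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(targets, non_targets, k: int) -> set:
    """k-mers present in every target and in no non-target (exact matching)."""
    if not targets:
        raise ValueError("at least one target sequence is required")
    # Inclusion: k-mers shared by all target sequences.
    shared = set.intersection(*(kmers(s, k) for s in targets))
    # Exclusion: any k-mer observed in a non-target sequence.
    excluded = set().union(*(kmers(s, k) for s in non_targets))
    return shared - excluded


# Toy sequences, made up for illustration only.
toy_targets = ["AAACGTTT", "CCACGTGG"]
toy_non_targets = ["AAATTT"]
print(signature_kmers(toy_targets, toy_non_targets, k=4))
```

    Because the comparison works on k-mer sets rather than alignments, it avoids multiple sequence alignment entirely, which is the property the abstract emphasizes.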

  11. Experience of targeted Usher exome sequencing as a clinical test

    PubMed Central

    Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2014-01-01

    We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes which is laborious to reach with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered as pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627

  12. SNP discovery by high-throughput sequencing in soybean

    PubMed Central

    2010-01-01

    Background With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD) have been widely applied to identify genome-wide sequence variations. However, it still remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost-effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty-four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 kbp. Conclusions We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population, and can be applied to other crops. PMID:20701770

  13. Recurrent Targeted Genes of Hepatitis B Virus in the Liver Cancer Genomes Identified by a Next-Generation Sequencing–Based Approach

    PubMed Central

    Ding, Dong; Lou, Xiaoyan; Hua, Dasong; Yu, Wei; Li, Lisha; Wang, Jun; Gao, Feng; Zhao, Na; Ren, Guoping; Li, Lanjuan; Lin, Biaoyang

    2012-01-01

    Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV)–related hepatocellular carcinomas (HCCs). Here we devised a massive anchored parallel sequencing (MAPS) method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV–related HCC tissues (cancer and adjacent tissues), we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs) with precise HBV–Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1) containing IPR003961 (Fibronectin, type III domain), 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1) containing IPR013032 (EGF-like region, conserved site), and three genes (PDE7A, PDE4B, PDE11A) containing IPR002073 (3′, 5′-cyclic-nucleotide phosphodiesterase). Enriched pathways include hsa04512 (ECM-receptor interaction), hsa04510 (Focal adhesion), and hsa04012 (ErbB signaling pathway). Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1) and telomerase reverse transcriptase (TERT1), two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5), phosphatase and actin regulator 4 (PHACTR4), and RNA binding protein fox-1 homolog (C. elegans) 1 (RBFOX1). Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list. This global survey of HBV integration events, together with recently published whole-genome sequencing analyses, furthered our understanding of the HBV–related HCC. PMID:23236287

  14. RUCS: rapid identification of PCR primers for unique core sequences.

    PubMed

    Thomsen, Martin Christen Frølund; Hasman, Henrik; Westh, Henrik; Kaya, Hülya; Lund, Ole

    2017-12-15

    Designing PCR primers to target a specific selection of whole genome sequenced strains can be a long, arduous and sometimes impractical task. Such tasks would benefit greatly from an automated tool to both identify unique targets, and to validate the vast number of potential primer pairs for the targets in silico. Here we present RUCS, a program that will find PCR primer pairs and probes for the unique core sequences of a positive genome dataset complement to a negative genome dataset. The resulting primer pairs and probes are in addition to simple selection also validated through a complex in silico PCR simulation. We compared our method, which identifies the unique core sequences, against an existing tool called ssGeneFinder, and found that our method was 6.5-20 times more sensitive. We used RUCS to design primer pairs that would target a set of genomes known to contain the mcr-1 colistin resistance gene. Three of the predicted pairs were chosen for experimental validation using PCR and gel electrophoresis. All three pairs successfully produced an amplicon with the target length for the samples containing mcr-1 and no amplification products were produced for the negative samples. The novel methods presented in this manuscript can reduce the time needed to identify target sequences, and provide a quick virtual PCR validation to eliminate time wasted on ambiguously binding primers. Source code is freely available on https://bitbucket.org/genomicepidemiology/rucs. Web service is freely available on https://cge.cbs.dtu.dk/services/RUCS. mcft@cbs.dtu.dk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
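
    The notion of an in silico PCR check can be illustrated with a minimal exact-match sketch: locate the forward primer on the template, locate the binding site of the reverse primer (its reverse complement) downstream, and report the amplicon length. This is not the RUCS simulation, whose details are not given in the abstract; the function names, the length cutoff, and the sequences are hypothetical, and mismatches and melting temperatures are ignored.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, fwd: str, rev: str, max_len: int = 2000):
    """Return (start, amplicon_length) for each exact-match primer pairing.

    Both primers are given 5'->3'. The reverse primer anneals to the
    forward strand's complement, so its site on the template reads as
    the reverse complement of the primer.
    """
    template = template.upper()
    fwd = fwd.upper()
    rev_site = reverse_complement(rev)
    products = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        while end != -1:
            length = end + len(rev_site) - start
            if length <= max_len:
                products.append((start, length))
            end = template.find(rev_site, end + 1)
        start = template.find(fwd, start + 1)
    return products


# Toy template and primers, made up for illustration only.
toy_template = "GGATCGATCGAAAAACTTGGCATTT"
print(in_silico_pcr(toy_template, fwd="ATCGATCG", rev="ATGCCAAG"))
```

    An empty result for a negative genome and a single product of the expected length for a positive genome would correspond to the outcome the abstract reports for the gel validation.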

  15. Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

    PubMed

    Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

    2003-07-04

    The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.

  16. Genome-wide characterization of microRNA in foxtail millet (Setaria italica)

    PubMed Central

    2013-01-01

    Background MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. Results In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. Conclusions We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally. PMID:24330712

  17. Genome-wide characterization of microRNA in foxtail millet (Setaria italica).

    PubMed

    Yi, Fei; Xie, Shaojun; Liu, Yuwei; Qi, Xin; Yu, Jingjuan

    2013-12-13

    MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally.

  18. RISC RNA sequencing for context-specific identification of in vivo miR targets

    PubMed Central

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2010-01-01

    Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex (RISC)-associated RNAs (the RISCome), called RISC sequencing. We developed methods that did not require cross-linking of RNAs to RISCs or amplification of mRNA prior to sequencing, making it possible to rapidly perform RISC sequencing from intact tissue while avoiding amplification bias. Comparison of RISCome with transcriptome expression defined the degree of RISC enrichment for each mRNA. The majority of the mRNAs enriched in wild-type cardiac RISComes compared to transcriptomes were bioinformatically predicted to be targets of at least 1 of 139 cardiac-expressed miRs. Programming cardiomyocyte RISCs via transgenic overexpression in adult hearts of miR-133a or miR-499, two miRs that contain entirely different ‘seed’ sequences, elicited differing profiles of RISC-targeted mRNAs. Thus, RISC sequencing represents a highly sensitive method for general RISC profiling and individual miR target identification in biological context. PMID:21030712

  19. Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1)

    PubMed Central

    Lingner, Thomas; Kataya, Amr R. A.; Reumann, Sigrun

    2012-01-01

    We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals. PMID:22415050

  20. Experimental and statistical post-validation of positive example EST sequences carrying peroxisome targeting signals type 1 (PTS1).

    PubMed

    Lingner, Thomas; Kataya, Amr R A; Reumann, Sigrun

    2012-02-01

    We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals.
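    The abstract does not name the three outlier methods it compares, so the following is only a minimal sketch of the general idea: fit a normal distribution to the prediction scores of one orthologous group and flag scores that sit far out in the lower tail. The leave-one-out fit, the function name `flag_low_score_outliers`, the cutoff `alpha`, and the example scores are all assumptions made for illustration, not the authors' procedure or data.

```python
from statistics import NormalDist, mean, stdev


def flag_low_score_outliers(scores, alpha=0.01):
    """Return indices of scores lying in the extreme lower tail.

    Each score is compared against a normal distribution fitted to the
    *other* scores of the same group (leave-one-out), so a single extreme
    value cannot inflate the fitted spread and hide itself.
    `scores` is expected to be a list of floats.
    """
    flagged = []
    for i, score in enumerate(scores):
        others = scores[:i] + scores[i + 1:]
        if len(others) < 2:
            continue  # too few values to estimate a spread
        sigma = stdev(others)
        if sigma == 0:
            continue  # degenerate fit: tail probability is undefined
        fitted = NormalDist(mean(others), sigma)
        # Lower-tail probability of seeing a score this small or smaller.
        if fitted.cdf(score) < alpha:
            flagged.append(i)
    return flagged


# Made-up prediction scores for one hypothetical orthologous group.
group_scores = [0.82, 0.79, 0.85, 0.81, 0.80, 0.83, 0.78, 0.84, 0.10]
outliers = flag_low_score_outliers(group_scores)
print(outliers)
```

    In this toy group only the last score is flagged. Fitting each group separately mirrors the abstract's observation that score distributions have group-specific means.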

  1. Silent genetic alterations identified by targeted next-generation sequencing in pheochromocytoma/paraganglioma: A clinicopathological correlations.

    PubMed

    Pillai, Suja; Gopalan, Vinod; Lo, Chung Y; Liew, Victor; Smith, Robert A; Lam, Alfred King Y

    2017-02-01

    The goal of this pilot study was to develop a customized, cost-effective amplicon panel (Ampliseq) for target sequencing in a cohort of patients with sporadic phaeochromocytoma/paraganglioma. Phaeochromocytoma/paragangliomas from 25 patients were analysed by a targeted next-generation sequencing approach using an Ion Torrent PGM instrument. Primers for 15 target genes (NF1, RET, VHL, SDHA, SDHB, SDHC, SDHD, SDHAF2, TMEM127, MAX, MEN1, KIF1Bβ, EPAS1, CDKN2 & PHD2) were designed using Ion AmpliSeq Designer. Ion Reporter software and Ingenuity® Variant Analysis™ software (www.ingenuity.com/variants) from Ingenuity Systems were used to analyse these results. Overall, 713 variants were identified. The variants identified from the Ion Reporter ranged from 64 to 161 per patient. Single nucleotide variants (SNV) were the most common. Further annotation with the help of Ingenuity variant analysis revealed 29 of these 713 variants were deletions. Of these, six variants were non-pathogenic and four were likely to be pathogenic. The remaining 19 variants were of uncertain significance. The most frequently altered gene in the cohort was KIF1B followed by NF1. Novel KIF1B pathogenic variant c.3375+1G>A was identified. The mutation was noted in a patient with clinically confirmed neurofibromatosis. Chromosome 1 showed the presence of the maximum number of variants. Use of targeted next-generation sequencing is a sensitive method for detecting genetic changes in patients with phaeochromocytoma/paraganglioma. The precise detection of these genetic changes helps in understanding the pathogenesis of these tumours. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Personalized genomic analyses for cancer mutation discovery and interpretation

    PubMed Central

    Jones, Siân; Anagnostou, Valsamo; Lytle, Karli; Parpart-Li, Sonya; Nesselbush, Monica; Riley, David R.; Shukla, Manish; Chesnick, Bryan; Kadan, Maura; Papp, Eniko; Galens, Kevin G.; Murphy, Derek; Zhang, Theresa; Kann, Lisa; Sausen, Mark; Angiuoli, Samuel V.; Diaz, Luis A.; Velculescu, Victor E.

    2015-01-01

    Massively parallel sequencing approaches are beginning to be used clinically to characterize individual patient tumors and to select therapies based on the identified mutations. A major question in these analyses is the extent to which these methods identify clinically actionable alterations and whether the examination of the tumor tissue alone is sufficient or whether matched normal DNA should also be analyzed to accurately identify tumor-specific (somatic) alterations. To address these issues, we comprehensively evaluated 815 tumor-normal paired samples from patients of 15 tumor types. We identified genomic alterations using next-generation sequencing of whole exomes or 111 targeted genes that were validated with sensitivities >95% and >99%, respectively, and specificities >99.99%. These analyses revealed an average of 140 and 4.3 somatic mutations per exome and targeted analysis, respectively. More than 75% of cases had somatic alterations in genes associated with known therapies or current clinical trials. Analyses of matched normal DNA identified germline alterations in cancer-predisposing genes in 3% of patients with apparently sporadic cancers. In contrast, a tumor-only sequencing approach could not definitively identify germline changes in cancer-predisposing genes and led to additional false-positive findings comprising 31% and 65% of alterations identified in targeted and exome analyses, respectively, including in potentially actionable genes. These data suggest that matched tumor-normal sequencing analyses are essential for precise identification and interpretation of somatic and germline alterations and have important implications for the diagnostic and therapeutic management of cancer patients. PMID:25877891

  3. Cryopyrin-associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation

    PubMed Central

    Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M.; Walts, Avram D.; Hoffmann, Patrycja; Remmers, Elaine F.; Kastner, Daniel L.; Ombrello, Amanda K.

    2015-01-01

    Objective To identify the cause of disease in an adult patient presenting with recent onset fevers, chills, urticaria, fatigue, and profound myalgia, who was negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. Methods We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient’s whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively-selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. Results We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3%–16.8% in monocytes and 15.2%–18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, buccal cells, and in the patient’s cultured fibroblasts. Conclusion These data document the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively-parallel sequencing in clinical diagnosis. PMID:25988971

  4. The spectrum and clinical impact of epigenetic modifier mutations in myeloma

    PubMed Central

    Pawlyn, Charlotte; Kaiser, Martin F; Heuck, Christoph; Melchor, Lorenzo; Wardell, Christopher P; Murison, Alex; Chavan, Shweta; Johnson, David C; Begum, Dil; Dahir, Nasrin; Proszek, Paula; Cairns, David A; Boyle, Eileen M; Jones, John R; Cook, Gordon; Drayson, Mark T; Owen, Roger G; Gregory, Walter M; Jackson, Graham H; Barlogie, Bart; Davies, Faith E; Walker, Brian A; Morgan, Gareth J

    2016-01-01

    Purpose Epigenetic dysregulation is known to be an important contributor to myeloma pathogenesis but, unlike in other B cell malignancies, the full spectrum of somatic mutations in epigenetic modifiers has not been previously reported. We sought to address this using results from whole-exome sequencing in the context of a large prospective clinical trial of newly diagnosed patients and targeted sequencing in a cohort of previously treated patients for comparison. Experimental Design Whole-exome sequencing analysis of 463 presenting myeloma cases entered in the UK NCRI Myeloma XI study and targeted sequencing analysis of 156 previously treated cases from the University of Arkansas for Medical Sciences. We correlated the presence of mutations with clinical outcome from diagnosis and compared the mutations found at diagnosis with later stages of disease. Results In diagnostic myeloma patient samples we identify significant mutations in genes encoding the histone 1 linker protein, previously identified in other B-cell malignancies. Our data suggest an adverse prognostic impact from the presence of lesions in genes encoding DNA methylation modifiers and the histone demethylase KDM6A/UTX. The frequency of mutations in epigenetic modifiers appears to increase following treatment most notably in genes encoding histone methyltransferases and DNA methylation modifiers. Conclusions Numerous mutations identified raise the possibility of targeted treatment strategies for patients either at diagnosis or relapse supporting the use of sequencing-based diagnostics in myeloma to help guide therapy as more epigenetic targeted agents become available. PMID:27235425

  5. An evolution based biosensor receptor DNA sequence generation algorithm.

    PubMed

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.

  6. Widespread long noncoding RNAs as endogenous target mimics for microRNAs in plants.

    PubMed

    Wu, Hua-Jun; Wang, Zhi-Min; Wang, Meng; Wang, Xiu-Jie

    2013-04-01

    Target mimicry is a recently identified regulatory mechanism for microRNA (miRNA) functions in plants in which the decoy RNAs bind to miRNAs via complementary sequences and therefore block the interaction between miRNAs and their authentic targets. Both endogenous decoy RNAs (miRNA target mimics) and engineered artificial RNAs can induce target mimicry effects. Yet until now, only the Induced by Phosphate Starvation1 RNA has been proven to be a functional endogenous microRNA target mimic (eTM). In this work, we developed a computational method and systematically identified intergenic or noncoding gene-originated eTMs for 20 conserved miRNAs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). The predicted miRNA binding sites were well conserved among eTMs of the same miRNA, whereas sequences outside of the binding sites varied a lot. We proved that the eTMs of miR160 and miR166 are functional target mimics and identified their roles in the regulation of plant development. The effectiveness of eTMs for three other miRNAs was also confirmed by transient agroinfiltration assay.

  7. CRISPR/Cas9-mediated gene knockout screens and target identification via whole-genome sequencing uncover host genes required for picornavirus infection.

    PubMed

    Kim, Heon Seok; Lee, Kyungjin; Bae, Sangsu; Park, Jeongbin; Lee, Chong-Kyo; Kim, Meehyein; Kim, Eunji; Kim, Minju; Kim, Seokjoong; Kim, Chonsaeng; Kim, Jin-Soo

    2017-06-23

    Several groups have used genome-wide libraries of lentiviruses encoding small guide RNAs (sgRNAs) for genetic screens. In most cases, sgRNA expression cassettes are integrated into cells by using lentiviruses, and target genes are statistically estimated by the readout of sgRNA sequences after targeted sequencing. We present a new virus-free method for human gene knockout screens using a genome-wide library of CRISPR/Cas9 sgRNAs based on plasmids and target gene identification via whole-genome sequencing (WGS) confirmation of authentic mutations rather than statistical estimation through targeted amplicon sequencing. We used 30,840 pairs of individually synthesized oligonucleotides to construct the genome-scale sgRNA library, collectively targeting 10,280 human genes (i.e., three sgRNAs per gene). These plasmid libraries were co-transfected with a Cas9-expression plasmid into human cells, which were then treated with cytotoxic drugs or viruses. Only cells lacking key factors essential for cytotoxic drug metabolism or viral infection were able to survive. Genomic DNA isolated from cells that survived these challenges was subjected to WGS to directly identify CRISPR/Cas9-mediated causal mutations essential for cell survival. With this approach, we were able to identify known and novel genes essential for viral infection in human cells. We propose that genome-wide sgRNA screens based on plasmids coupled with WGS are powerful tools for forward genetics studies and drug target discovery. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  8. Targeted next-generation sequencing reveals that a compound heterozygous mutation in phosphodiesterase 6a gene leads to retinitis pigmentosa in a Chinese family.

    PubMed

    Zhang, Shanshan; Li, Jie; Li, Shujin; Yang, Yeming; Yang, Mu; Yang, Zhenglin; Zhu, Xianjun; Zhang, Lin

    2018-04-25

    Retinitis pigmentosa (RP) is a genetically heterogeneous disease with over 70 causative genes identified to date. However, approximately 40% of RP cases remain genetically unsolved, suggesting that many novel disease-causing mutations are yet to be identified. The purpose of this study is to identify the causative mutations of a Chinese RP family. Targeted next-generation sequencing (NGS) of a total of 163 genes involved in inherited retinal disorders was used to screen for possible causative mutations. Sanger sequencing was used to verify the mutations. As a result, we identified two heterozygous mutations: a splice-site mutation c.1407+1G>C and a nonsense mutation c.1957C>T (p.R653X) in the phosphodiesterase 6A (PDE6A) gene in the RP patient. These two mutations are inherited from his father and mother, respectively. Furthermore, these mutations are unique in our in-house database and are rare in human genome databases, implying that these two mutations are pathological. Using a targeted NGS method, we identified a compound heterozygous mutation in the PDE6A gene that is associated with RP in a Chinese family.

  9. Application of industrial scale genomics to discovery of therapeutic targets in heart failure.

    PubMed

    Mehraban, F; Tomlinson, J E

    2001-12-01

    In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.

  10. Artificial neural network study on organ-targeting peptides

    NASA Astrophysics Data System (ADS)

    Jung, Eunkyoung; Kim, Junhyoung; Choi, Seung-Hoon; Kim, Minkyoung; Rhee, Hokyoung; Shin, Jae-Min; Choi, Kihang; Kang, Sang-Kee; Lee, Nam Kyung; Choi, Yun-Jaie; Jung, Dong Hyun

    2010-01-01

    We report a new approach to studying organ targeting of peptides on the basis of peptide sequence information. The positive control data sets consist of organ-targeting peptide sequences identified by the peroral phage-display technique for four organs, and the negative control data are prepared from random sequences. The capacity of our models to make appropriate predictions is validated by statistical indicators including sensitivity, specificity, enrichment curve, and the area under the receiver operating characteristic (ROC) curve (the ROC score). VHSE descriptor produces statistically significant training models and the models with simple neural network architectures show slightly greater predictive power than those with complex ones. The training and test set statistics indicate that our models could discriminate between organ-targeting and random sequences. We anticipate that our models will be applicable to the selection of organ-targeting peptides for generating peptide drugs or peptidomimetics.
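    The validation statistics named in this abstract (sensitivity, specificity, and the area under the ROC curve) can be illustrated with a short sketch. The labels, scores, threshold, and function names below are hypothetical and are not taken from the study. The AUC is computed here by pairwise comparison of positive and negative scores (the Mann-Whitney formulation), and both classes are assumed to be present.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = organ-targeting peptide, 0 = random sequence.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

    Sensitivity and specificity depend on the chosen threshold, whereas the ROC score summarises ranking quality across all thresholds, which is why the two are usually reported together.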

  11. A screen of chemical modifications identifies position-specific modification by UNA to most potently reduce siRNA off-target effects

    PubMed Central

    Bramsen, Jesper B.; Pakula, Malgorzata M.; Hansen, Thomas B.; Bus, Claus; Langkjær, Niels; Odadzic, Dalibor; Smicius, Romualdas; Wengel, Suzy L.; Chattopadhyaya, Jyoti; Engels, Joachim W.; Herdewijn, Piet; Wengel, Jesper; Kjems, Jørgen

    2010-01-01

    Small interfering RNAs (siRNAs) are now established as the preferred tool to inhibit gene function in mammalian cells yet trigger unintended gene silencing due to their inherent miRNA-like behavior. Such off-target effects are primarily mediated by the sequence-specific interaction between the siRNA seed regions (position 2–8 of either siRNA strand counting from the 5′-end) and complementary sequences in the 3′UTR of (off-) targets. It was previously shown that chemical modification of siRNAs can reduce off-targeting but only very few modifications have been tested leaving more to be identified. Here we developed a luciferase reporter-based assay suitable to monitor siRNA off-targeting in a high throughput manner using stable cell lines. We investigated the impact of chemically modifying single nucleotide positions within the siRNA seed on siRNA function and off-targeting using 10 different types of chemical modifications, three different target sequences and three siRNA concentrations. We found several differently modified siRNAs to exercise reduced off-targeting yet incorporation of the strongly destabilizing unlocked nucleic acid (UNA) modification into position 7 of the siRNA most potently reduced off-targeting for all tested sequences. Notably, such position-specific destabilization of siRNA–target interactions did not significantly reduce siRNA potency and is therefore well suited for future siRNA designs especially for applications in vivo where siRNA concentrations, expectedly, will be low. PMID:20453030
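    The seed definition given above (positions 2-8 of a strand, counted from the 5′-end) and its pairing with complementary 3′UTR sequences can be sketched in a few lines. The sequences and function names are arbitrary illustrations rather than material from the study, and the sketch assumes strict Watson-Crick pairing only (no G:U wobble, no mismatches).

```python
# RNA complement table for strict Watson-Crick pairing.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_region(strand):
    """Nucleotides at positions 2-8 (1-based) from the 5' end: 7 nt."""
    return strand[1:8]


def seed_match_site(strand):
    """Target-site sequence (5'->3') fully complementary to the seed."""
    return seed_region(strand).translate(COMPLEMENT)[::-1]


def find_seed_matches(strand, utr):
    """0-based start positions of perfect seed-match sites in a 3'UTR."""
    site = seed_match_site(strand)
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)  # allow overlapping sites
    return hits


# Arbitrary illustrative sequences, written 5'->3'.
guide = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCGGGCUACCUCA"

print(seed_region(guide))
print(seed_match_site(guide))
print(find_seed_matches(guide, utr))
```

    Because either siRNA strand can act through its seed, the same scan would in practice be run once for the guide strand and once for the passenger strand.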

  12. Identifying transcription factor functions and targets by phenotypic activation

    PubMed Central

    Chua, Gordon; Morris, Quaid D.; Sopko, Richelle; Robinson, Mark D.; Ryan, Owen; Chan, Esther T.; Frey, Brendan J.; Andrews, Brenda J.; Boone, Charles; Hughes, Timothy R.

    2006-01-01

    Mapping transcriptional regulatory networks is difficult because many transcription factors (TFs) are activated only under specific conditions. We describe a generic strategy for identifying genes and pathways induced by individual TFs that does not require knowledge of their normal activation cues. Microarray analysis of 55 yeast TFs that caused a growth phenotype when overexpressed showed that the majority caused increased transcript levels of genes in specific physiological categories, suggesting a mechanism for growth inhibition. Induced genes typically included established targets and genes with consensus promoter motifs, if known, indicating that these data are useful for identifying potential new target genes and binding sites. We identified the sequence 5′-TCACGCAA as a binding sequence for Hms1p, a TF that positively regulates pseudohyphal growth and previously had no known motif. The general strategy outlined here presents a straightforward approach to discovery of TF activities and mapping targets that could be adapted to any organism with transgenic technology. PMID:16880382

  13. Sequencing of intraductal biopsies is feasible and potentially impacts clinical management of patients with indeterminate biliary stricture and cholangiocarcinoma.

    PubMed

    Bankov, Katrin; Döring, Claudia; Schneider, Markus; Hartmann, Sylvia; Winkelmann, Ria; Albert, Joerg G; Bechstein, Wolf Otto; Zeuzem, Stefan; Hansmann, Martin Leo; Peveling-Oberhag, Jan; Walter, Dirk

    2018-04-30

    Definite diagnosis and therapeutic management of cholangiocarcinoma (CCA) remains a challenge. The aim of the current study was to investigate feasibility and potential impact on clinical management of targeted sequencing of intraductal biopsies. Intraductal biopsies with suspicious findings from 16 patients with CCA in later clinical course were analyzed with targeted sequencing including tumor and control benign tissue (n = 55 samples). A CCA-specific sequencing panel containing 41 genes was designed and a dual strand targeted enrichment was applied. Sequencing was successfully performed for all samples. In total, 79 mutations were identified and a mean of 1.7 mutations per tumor sample (range 0-4) as well as 2.3 per biopsy (0-6) were detected and potentially therapeutically relevant genes were identified in 6/16 cases. In 14/18 (78%) biopsies with dysplasia or inconclusive findings at least one mutation was detected. The majority of mutations were found in both surgical specimen and biopsy (68%), while 28% were only present in biopsies in contrast to 4% being only present in the surgical tumor specimen. Targeted sequencing from intraductal biopsies is feasible and potentially improves the diagnostic yield. A profound genetic heterogeneity in biliary dysplasia needs to be considered in clinical management and warrants further investigation. The current study is the first to demonstrate the feasibility of sequencing of intraductal biopsies which holds the potential to impact diagnostic and therapeutical management of patients with biliary dysplasia and neoplasia.

  14. In vivo therapeutic potential of Dicer-hunting siRNAs targeting infectious hepatitis C virus.

    PubMed

    Watanabe, Tsunamasa; Hatakeyama, Hiroto; Matsuda-Yasui, Chiho; Sato, Yusuke; Sudoh, Masayuki; Takagi, Asako; Hirata, Yuichi; Ohtsuki, Takahiro; Arai, Masaaki; Inoue, Kazuaki; Harashima, Hideyoshi; Kohara, Michinori

    2014-04-23

    The development of RNA interference (RNAi)-based therapy faces two major obstacles: selecting small interfering RNA (siRNA) sequences with strong activity, and identifying a carrier that allows efficient delivery to target organs. Additionally, a region conserved at the nucleotide level must be targeted when applying RNAi to a virus, because hepatitis C virus (HCV) can escape therapeutic pressure through genome mutations. Dicer-generated siRNAs prepared in vitro that target a conserved, highly ordered HCV 5' untranslated region are capable of inducing strong RNAi activity. By dissecting the 5'-end of an RNAi-mediated cleavage site in the HCV genome, we identified potent siRNA sequences, which we designate as Dicer-hunting siRNAs (dh-siRNAs). Furthermore, formulation of the dh-siRNAs in an optimized multifunctional envelope-type nano device inhibited ongoing infectious HCV replication in human hepatocytes in vivo. Our efforts, combining identification of optimal siRNA sequences with delivery to human hepatocytes, suggest the therapeutic potential of siRNA against a virus.

  15. Identification of novel microRNAs in Hevea brasiliensis and computational prediction of their targets

    PubMed Central

    2012-01-01

    Background Plants respond to external stimuli through fine regulation of gene expression partially ensured by small RNAs. Of these, microRNAs (miRNAs) play a crucial role. They negatively regulate gene expression by targeting the cleavage or translational inhibition of target messenger RNAs (mRNAs). In Hevea brasiliensis, environmental and harvesting stresses are known to affect natural rubber production. This study set out to identify abiotic stress-related miRNAs in Hevea using next-generation sequencing and bioinformatic analysis. Results Deep sequencing of small RNAs was carried out on plantlets subjected to severe abiotic stress using the Solexa technique. By combining the LeARN pipeline, data from the Plant microRNA database (PMRD) and Hevea EST sequences, we identified 48 conserved miRNA families already characterized in other plant species, and 10 putatively novel miRNA families. The results showed the most abundant size for miRNAs to be 24 nucleotides, except for seven families. Several MIR genes produced both 20-22 nucleotides and 23-27 nucleotides. The two miRNA class sizes were detected for both conserved and putative novel miRNA families, suggesting their functional duality. The EST databases were scanned with conserved and novel miRNA sequences. MiRNA targets were computationally predicted and analysed. The predicted targets involved in "responses to stimuli" and to "antioxidant" and "transcription activities" are presented. Conclusions Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs when the complete genome is not yet available. Our study provided additional information for evolutionary studies and revealed potentially specific regulation of the control of redox status in Hevea. PMID:22330773

  16. In-silico and in-vivo analyses of EST databases unveil conserved miRNAs from Carthamus tinctorius and Cynara cardunculus

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are small RNAs (21-24 bp) providing an RNA-based system of gene regulation highly conserved in plants and animals. In plants, miRNAs control mRNA degradation or restrain translation, affecting development and responses to stresses. Plant miRNAs show imperfect but extensive complementarity to mRNA targets, making their computational prediction possible, which is useful when data mining is applied to different species. In this study we used a comparative approach to identify both miRNAs and their targets in artichoke and safflower. Results Two complete expressed sequence tag (EST) datasets from artichoke (3.6·10^4 entries) and safflower (4.2·10^4) were analysed with a bioinformatic pipeline and in vitro experiments, identifying 17 potential miRNAs. For each EST, using the RNAhybrid program and 953 non-redundant mature miRNA sequences available in miRBase as reference, we searched for matching putative targets. 8730 out of 42011 ESTs from safflower and 7145 of 36323 ESTs from artichoke showed at least one predicted miRNA target. BLAST analysis showed that 75% of all ESTs shared at least a common homologous region (E-value < 10^-4) and about 50% of these displayed 400 bp or longer aligned sequences as conserved homologous/orthologous (COS) regions. 960 and 890 ESTs of safflower and artichoke organized in COS shared 79 different miRNA targets, considered functionally conserved, and statistically significant when compared with random sequences (signal to noise ratio > 2 and specificity ≥ 0.85). Four highly significant miRNAs selected from in silico data were experimentally validated in globe artichoke leaves. Conclusions Mature miRNAs and targets were predicted within EST sequences of safflower and artichoke. Most of the miRNA targets appeared highly/moderately conserved, highlighting an important and conserved function. In this study we introduce a stringent parameter for the comparative sequence analysis, represented by the identification of the same target in the COS region. After statistical analysis 79 targets, found on the COS regions and belonging to 60 miRNA families, have a signal to noise ratio > 2, with ≥ 0.85 specificity. The putative miRNAs identified belong to 55 dicotyledon plants and to 24 families only in monocotyledon. PMID:22536958
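
    The complementarity-based target search described in this record can be illustrated with a minimal sketch. It is not the RNAhybrid method used by the authors (which scores hybridization energy); it simply counts mismatches against the reverse complement of the miRNA and ignores G:U wobble pairs. The function names, the mismatch threshold and the toy sequences are assumptions made for illustration.

```python
# Hypothetical helper names; a simplified stand-in for energy-based tools.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(mirna, transcript, max_mismatches=3):
    """Return (position, mismatches) for every window of the transcript
    that pairs with the miRNA with at most max_mismatches mismatches."""
    site = reverse_complement(mirna)
    n = len(site)
    hits = []
    for i in range(len(transcript) - n + 1):
        window = transcript[i:i + n]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


if __name__ == "__main__":
    toy_mirna = "UGACAGAAGA"             # invented 10-nt sequence
    toy_est = "GGUCUUCUGUCACC"           # contains the perfect site at index 2
    print(find_target_sites(toy_mirna, toy_est, max_mismatches=0))
```

    Raising max_mismatches loosens the search toward the "imperfect but extensive" pairing the abstract mentions, at the cost of more spurious hits.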

  17. Identification of MicroRNAs in Helicoverpa armigera and Spodoptera litura Based on Deep Sequencing and Homology Analysis

    PubMed Central

    Ge, Xie; Zhang, Yong; Jiang, Jianhao; Zhong, Yi; Yang, Xiaonan; Li, Zhiqian; Huang, Yongping; Tan, Anjiang

    2013-01-01

    The current identification of microRNAs (miRNAs) in insects is largely dependent on genome sequences. However, the lack of available genome sequences inhibits the identification of miRNAs in various insect species. In this study, we used a miRNA database of the silkworm Bombyx mori as a reference to identify miRNAs in Helicoverpa armigera and Spodoptera litura using deep sequencing and homology analysis. Because all three species belong to the Lepidoptera, the experiment produced reliable results. Our study identified 97 and 91 conserved miRNAs in H. armigera and S. litura, respectively. Using the genome of B. mori and BAC sequences of H. armigera as references, 1 novel miRNA and 8 novel miRNA candidates were identified in H. armigera, and 4 novel miRNA candidates were identified in S. litura. An evolutionary analysis revealed that most of the identified miRNAs were insect-specific, and more than 20 miRNAs were Lepidoptera-specific. The investigation of the expression patterns of miR-2a, miR-34, miR-2796-3p and miR-11 revealed their potential roles in insect development. miRNA target prediction revealed that conserved miRNA target sites exist in various genes in the 3 species. Conserved miRNA target sites for the Hsp90 gene among the 3 species were validated in the mammalian 293T cell line using a dual-luciferase reporter assay. Our study provides a new approach with which to identify miRNAs in insects lacking genome information and contributes to the functional analysis of insect miRNAs. PMID:23289012

  18. Large scale RNAi screen in Tribolium reveals novel target genes for pest control and the proteasome as prime target.

    PubMed

    Ulrich, Julia; Dao, Van Anh; Majumdar, Upalparna; Schmitt-Engel, Christian; Schwirz, Jonas; Schultheis, Dorothea; Ströhlein, Nadi; Troelenberg, Nicole; Grossmann, Daniela; Richter, Tobias; Dönitz, Jürgen; Gerischer, Lizzy; Leboulle, Gérard; Vilcinskas, Andreas; Stanke, Mario; Bucher, Gregor

    2015-09-03

    Insect pest control is challenged by insecticide resistance and negative impact on ecology and health. One promising pest-specific alternative is the generation of transgenic plants, which express double stranded RNAs targeting essential genes of a pest species. Upon feeding, the dsRNA induces gene silencing in the pest resulting in its death. However, the identification of efficient RNAi target genes remains a major challenge as genomic tools and breeding capacity are limited in most pest insects, impeding whole-animal high-throughput screening. We use the red flour beetle Tribolium castaneum as a screening platform in order to identify the most efficient RNAi target genes. From about 5,000 randomly screened genes of the iBeetle RNAi screen we identify 11 novel and highly efficient RNAi targets. Our data allowed us to determine GO term combinations that are predictive for efficient RNAi target genes with proteasomal genes being most predictive. Finally, we show that RNAi target genes do not appear to act synergistically and that protein sequence conservation does not correlate with the number of potential off target sites. Our results will aid the identification of RNAi target genes in many pest species by providing a manageable number of excellent candidate genes to be tested and the proteasome as prime target. Further, the identified GO term combinations will help to identify efficient target genes from organ specific transcriptomes. Our off target analysis is relevant for the sequence selection used in transgenic plants.

  19. Development of a real-time PCR for detection of Staphylococcus pseudintermedius using a novel automated comparison of whole-genome sequences.

    PubMed

    Verstappen, Koen M; Huijbregts, Loes; Spaninks, Mirlin; Wagenaar, Jaap A; Fluit, Ad C; Duim, Birgitta

    2017-01-01

    Staphylococcus pseudintermedius is an opportunistic pathogen in dogs and cats and occasionally causes infections in humans. S. pseudintermedius is often resistant to multiple classes of antimicrobials. Reliable detection is required so that it is not misidentified as S. aureus. Phenotypic and currently-used molecular-based diagnostic assays lack specificity or are labour-intensive using multiplex PCR or nucleic acid sequencing. The aim of this study was to identify a specific target for real-time PCR by comparing whole genome sequences of S. pseudintermedius and non-pseudintermedius. Genome sequences were downloaded from public repositories and supplemented by isolates that were sequenced in this study. A Perl script was written that analysed 300-nt fragments from a reference genome sequence of S. pseudintermedius and checked if this sequence was present in other S. pseudintermedius genomes (n = 74) and non-pseudintermedius genomes (n = 138). Six sequences specific for S. pseudintermedius were identified (sequence length 300-500 nt). One sequence, which was located in the spsJ gene, was used to develop primers and a probe. The real-time PCR showed 100% specificity when testing for S. pseudintermedius isolates (n = 54), and eight other staphylococcal species (n = 43). In conclusion, a novel approach by comparing whole genome sequences identified a sequence that is specific for S. pseudintermedius and provided a real-time PCR target for rapid and reliable detection of S. pseudintermedius.
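
    The fragment comparison in this record can be sketched roughly as follows. The authors' Perl script is not reproduced in the abstract, so this Python version is only an illustration: it assumes exact substring matching (the original may well have used an alignment tool), non-overlapping windows, and invented function and parameter names.

```python
def specific_fragments(reference, targets, non_targets, size=300, step=300):
    """Return (start, fragment) pairs from the reference that occur in
    every target genome and in none of the non-target genomes.

    Assumption: presence is tested by exact substring match.
    """
    hits = []
    for start in range(0, len(reference) - size + 1, step):
        fragment = reference[start:start + size]
        in_all_targets = all(fragment in genome for genome in targets)
        in_no_others = not any(fragment in genome for genome in non_targets)
        if in_all_targets and in_no_others:
            hits.append((start, fragment))
    return hits


if __name__ == "__main__":
    # Toy sequences with 4-nt windows instead of 300-nt fragments.
    reference = "AAAACCCCGGGGTTTT"
    targets = ["TTCCCCAA", "GGCCCCGG"]
    non_targets = ["AAAATTTT", "GGGGAAAA"]
    print(specific_fragments(reference, targets, non_targets, size=4, step=4))
```

    Adjacent specific windows could then be merged into longer candidates, which would be one way to arrive at regions longer than the 300-nt window size.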

  20. Identifying mRNA sequence elements for target recognition by human Argonaute proteins

    PubMed Central

    Li, Jingjing; Kim, TaeHyung; Nutiu, Razvan; Ray, Debashish; Hughes, Timothy R.; Zhang, Zhaolei

    2014-01-01

    It is commonly known that mammalian microRNAs (miRNAs) guide the RNA-induced silencing complex (RISC) to target mRNAs through the seed-pairing rule. However, recent experiments that coimmunoprecipitate the Argonaute proteins (AGOs), the central catalytic component of RISC, have consistently revealed extensive AGO-associated mRNAs that lack seed complementarity with miRNAs. We herein test the hypothesis that AGO has its own binding preference within target mRNAs, independent of guide miRNAs. By systematically analyzing the data from in vivo cross-linking experiments with human AGOs, we have identified a structurally accessible and evolutionarily conserved region (∼10 nucleotides in length) that alone can accurately predict AGO–mRNA associations, independent of the presence of miRNA binding sites. Within this region, we further identified an enriched motif that was replicable on independent AGO-immunoprecipitation data sets. We used RNAcompete to enumerate the RNA-binding preference of human AGO2 to all possible 7-mer RNA sequences and validated the AGO motif in vitro. These findings reveal a novel function of AGOs as sequence-specific RNA-binding proteins, which may aid miRNAs in recognizing their targets with high specificity. PMID:24663241
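
    For scale, "all possible 7-mer RNA sequences" is a space of 4^7 = 16,384 sequences. A short sketch of the enumeration (the function name is an assumption; this is not part of RNAcompete):

```python
from itertools import product


def all_kmers(k=7, alphabet="ACGU"):
    """Enumerate every RNA sequence of length k over the given alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


if __name__ == "__main__":
    kmers = all_kmers()
    print(len(kmers), kmers[0], kmers[-1])
```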

  1. CRISPRTarget

    PubMed Central

    Biswas, Ambarish; Gagnon, Joshua N.; Brouns, Stan J.J.; Fineran, Peter C.; Brown, Chris M.

    2013-01-01

    The bacterial and archaeal CRISPR/Cas adaptive immune system targets specific protospacer nucleotide sequences in invading organisms. This requires base pairing between processed CRISPR RNA and the target protospacer. For type I and II CRISPR/Cas systems, protospacer adjacent motifs (PAM) are essential for target recognition, and for type III, mismatches in the flanking sequences are important in the antiviral response. In this study, we examine the properties of each class of CRISPR. We use this information to provide a tool (CRISPRTarget) that predicts the most likely targets of CRISPR RNAs (http://bioanalysis.otago.ac.nz/CRISPRTarget). This can be used to discover targets in newly sequenced genomic or metagenomic data. To test its utility, we discover features and targets of well-characterized Streptococcus thermophilus and Sulfolobus solfataricus type II and III CRISPR/Cas systems. Finally, in Pectobacterium species, we identify new CRISPR targets and propose a model of temperate phage exposure and subsequent inhibition by the type I CRISPR/Cas systems. PMID:23492433
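
    The idea of matching a CRISPR spacer against candidate protospacers and inspecting the flanking sequence can be sketched as below. This is not the CRISPRTarget implementation: it scans one strand only, scores by plain mismatch count, and merely reports both flanks, since which flank carries the PAM and what the motif is depend on the system. All names and the toy sequences are assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(spacer, target, max_mismatches=0, flank=3):
    """Scan one strand of target for windows matching the spacer and
    report the flanking bases where a PAM could be checked."""
    n = len(spacer)
    hits = []
    for i in range(len(target) - n + 1):
        mismatches = hamming(spacer, target[i:i + n])
        if mismatches <= max_mismatches:
            hits.append({
                "start": i,
                "mismatches": mismatches,
                "left_flank": target[max(0, i - flank):i],
                "right_flank": target[i + n:i + n + flank],
            })
    return hits


if __name__ == "__main__":
    # Invented toy sequences, far shorter than real spacers.
    print(find_protospacers("ACGTAC", "TTGACGTACAGGTT"))
```

    A caller would typically filter the hits by testing the relevant flank against the expected motif for the CRISPR/Cas type under study.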

  2. A Multidimensional Strategy to Detect Polypharmacological Targets in the Absence of Structural and Sequence Homology

    PubMed Central

    Durrant, Jacob D.; Amaro, Rommie E.; Xie, Lei; Urbaniak, Michael D.; Ferguson, Michael A. J.; Haapalainen, Antti; Chen, Zhijun; Di Guilmi, Anne Marie; Wunder, Frank; Bourne, Philip E.; McCammon, J. Andrew

    2010-01-01

    Conventional drug design embraces the “one gene, one drug, one disease” philosophy. Polypharmacology, which focuses on multi-target drugs, has emerged as a new paradigm in drug discovery. The rational design of drugs that act via polypharmacological mechanisms can produce compounds that exhibit increased therapeutic potency and against which resistance is less likely to develop. Additionally, identifying multiple protein targets is also critical for side-effect prediction. One third of potential therapeutic compounds fail in clinical trials or are later removed from the market due to unacceptable side effects often caused by off-target binding. In the current work, we introduce a multidimensional strategy for the identification of secondary targets of known small-molecule inhibitors in the absence of global structural and sequence homology with the primary target protein. To demonstrate the utility of the strategy, we identify several targets of 4,5-dihydroxy-3-(1-naphthyldiazenyl)-2,7-naphthalenedisulfonic acid, a known micromolar inhibitor of Trypanosoma brucei RNA editing ligase 1. As it is capable of identifying potential secondary targets, the strategy described here may play a useful role in future efforts to reduce drug side effects and/or to increase polypharmacology. PMID:20098496

  3. A multidimensional strategy to detect polypharmacological targets in the absence of structural and sequence homology.

    PubMed

    Durrant, Jacob D; Amaro, Rommie E; Xie, Lei; Urbaniak, Michael D; Ferguson, Michael A J; Haapalainen, Antti; Chen, Zhijun; Di Guilmi, Anne Marie; Wunder, Frank; Bourne, Philip E; McCammon, J Andrew

    2010-01-22

    Conventional drug design embraces the "one gene, one drug, one disease" philosophy. Polypharmacology, which focuses on multi-target drugs, has emerged as a new paradigm in drug discovery. The rational design of drugs that act via polypharmacological mechanisms can produce compounds that exhibit increased therapeutic potency and against which resistance is less likely to develop. Additionally, identifying multiple protein targets is also critical for side-effect prediction. One third of potential therapeutic compounds fail in clinical trials or are later removed from the market due to unacceptable side effects often caused by off-target binding. In the current work, we introduce a multidimensional strategy for the identification of secondary targets of known small-molecule inhibitors in the absence of global structural and sequence homology with the primary target protein. To demonstrate the utility of the strategy, we identify several targets of 4,5-dihydroxy-3-(1-naphthyldiazenyl)-2,7-naphthalenedisulfonic acid, a known micromolar inhibitor of Trypanosoma brucei RNA editing ligase 1. As it is capable of identifying potential secondary targets, the strategy described here may play a useful role in future efforts to reduce drug side effects and/or to increase polypharmacology.

  4. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing day by day. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides, which contribute to its high-index sweetening property with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. Results To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has previously been described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets, building a foundation for future studies to understand their roles in key stevia traits. PMID:23116282
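
    The step from total reads to distinct sequences (30,472,534 reads collapsing to 2,509,190 distinct sequences, about 12 reads per distinct sequence on average) amounts to counting identical reads. A minimal sketch, with an assumed function name and invented toy reads:

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into (sequence, count) pairs,
    most abundant first."""
    return Counter(reads).most_common()


if __name__ == "__main__":
    toy_reads = ["UGGA", "UGGA", "ACCU", "UGGA", "GGCA"]  # invented reads
    collapsed = collapse_reads(toy_reads)
    total = sum(count for _, count in collapsed)
    print(collapsed)
    print(total, "reads,", len(collapsed), "distinct sequences")
```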

  5. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni.

    PubMed

    Mandhan, Vibha; Kaur, Jagdeep; Singh, Kashmir

    2012-11-01

    MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing day by day. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides, which contribute to its high-index sweetening property with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has previously been described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes regulating essential plant metabolic and signaling pathways. This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets, building a foundation for future studies to understand their roles in key stevia traits.

  6. Brief Report: Cryopyrin-Associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation.

    PubMed

    Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M; Walts, Avram D; Hoffmann, Patrycja; Remmers, Elaine F; Kastner, Daniel L; Ombrello, Amanda K

    2015-09-01

    To identify the cause of disease in an adult patient presenting with recent-onset fevers, chills, urticaria, fatigue, and profound myalgia, who was found to be negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient's whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3-16.8% in monocytes and 15.2-18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, in buccal cells, and in the patient's cultured fibroblasts. Our findings indicate the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively parallel sequencing in clinical diagnosis. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
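
    The mutant allele frequencies quoted in this record are ratios of mutation-supporting reads to total reads at the site. A small sketch of that arithmetic; the read counts below are invented for illustration and are not taken from the study.

```python
def mutant_allele_frequency(mutant_reads, total_reads):
    """Fraction of reads at a position that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


if __name__ == "__main__":
    # Hypothetical counts: 30 mutant reads out of 200 covering the site.
    print(f"{mutant_allele_frequency(30, 200):.1%}")
```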

  7. Targeting the Atypical Chemokine Receptor ACKR3/CXCR7: Phase 1 - Phage Display Peptide Identification and Characterization.

    PubMed

    Vestal, R D; LaJeunesse, D R; Taylor, E W

    2016-01-01

    One of the greatest challenges in fighting cancer is cell targeting and biomarker selection. The Atypical Chemokine Receptor ACKR3/CXCR7 is expressed on many cancer cell types, including breast cancer and glioblastoma, and binds the endogenous ligands SDF1/CXCL12 and ITAC/CXCL11. A 20 amino acid region of the ACKR3/CXCR7 N-terminus was synthesized and targeted with the NEB PhD-7 Phage Display Peptide Library. Twenty-nine phages were isolated and heptapeptide inserts sequenced; of these, 23 sequences were unique. A 3D molecular model was created for the ACKR3/CXCR7 N-terminus by mutating the corresponding region of the crystal structure of CXCR4 with bound SDF1/CXCL12. A ClustalW alignment was performed on each peptide sequence using the entire SDF1/CXCL12 sequence as the template. The 23 peptide sequences showed similarity to three distinct regions of the SDF1/CXCL12 molecule. A 3D molecular model was made for each of the phage peptide inserts to visually identify potential areas of steric interference of peptides that simulated CXCL12 regions not in contact with the receptor's N-terminus. An ELISA analysis of the relative binding affinity between the peptides identified 9 peptides with statistically significant results. The candidate pool of 9 peptides was further reduced to 3 peptides based on their affinity for the targeted N-terminus region peptide versus no target peptide present or a scrambled negative control peptide. The results clearly show the Phage Display protocol can be used to target a synthesized region of the ACKR3/CXCR7 N-terminus. The 3 peptides chosen, P20, P3, and P9, will be the basis for further targeting studies.

  8. High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

    PubMed

    Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

    2015-08-01

    Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been quantitated in absolute terms. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
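
    The general principle behind barcode-based error suppression, grouping reads that share a molecular barcode and calling one consensus per original molecule, can be sketched as follows. This is a generic majority-vote illustration, not the authors' pipeline (which relies on linear amplification and removal of erroneous barcode tags); the function name, the min_reads threshold and the toy reads are assumptions.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(tagged_reads, min_reads=2):
    """Group (barcode, sequence) pairs by barcode and return a per-barcode
    majority-vote consensus. Barcodes with fewer than min_reads reads, or
    with reads of unequal length, are skipped."""
    groups = defaultdict(list)
    for barcode, sequence in tagged_reads:
        groups[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        if len({len(s) for s in sequences}) != 1:
            continue
        # Most common base at each column across the reads of one molecule.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus


if __name__ == "__main__":
    toy = [("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),
           ("CCG", "ACGA"), ("CCG", "ACGA"), ("GGA", "TTTT")]
    print(consensus_by_barcode(toy))
```

    Counting distinct barcodes that carry a variant, rather than raw reads, is what makes an absolute count of mutated molecules possible in schemes of this kind.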

  9. Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets

    USDA-ARS's Scientific Manuscript database

    Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...

  10. Whole-Genome Thermodynamic Analysis Reduces siRNA Off-Target Effects

    PubMed Central

    Chen, Xi; Liu, Peng; Chou, Hui-Hsien

    2013-01-01

    Small interfering RNAs (siRNAs) are important tools for knocking down targeted genes, and have been widely applied to biological and biomedical research. To design siRNAs, two important aspects must be considered: the potency in knocking down target genes and the off-target effect on any nontarget genes. Although many studies have produced useful tools to design potent siRNAs, off-target prevention has mostly been delegated to sequence-level alignment tools such as BLAST. We hypothesize that whole-genome thermodynamic analysis can identify potential off-targets with higher precision and help us avoid siRNAs that may have strong off-target effects. To validate this hypothesis, two siRNA sets were designed to target three human genes IDH1, ITPR2 and TRIM28. They were selected from the output of two popular siRNA design tools, siDirect and siDesign. Both siRNA design tools have incorporated sequence-level screening to avoid off-targets, thus their output is believed to be optimal. However, one of the sets we tested has off-target genes predicted by Picky, a whole-genome thermodynamic analysis tool. Picky can identify off-target genes that may hybridize to a siRNA within a user-specified melting temperature range. Our experiments validated that some off-target genes predicted by Picky can indeed be inhibited by siRNAs. Similar experiments were performed using commercially available siRNAs and a few off-target genes were also found to be inhibited as predicted by Picky. In summary, we demonstrate that whole-genome thermodynamic analysis can identify off-target genes that are missed in sequence-level screening. Because Picky prediction is deterministic according to thermodynamics, if a siRNA candidate has no Picky predicted off-targets, it is unlikely to cause off-target effects. Therefore, we recommend including Picky as an additional screening step in siRNA design. PMID:23484018

  11. PAT: predictor for structured units and its application for the optimization of target molecules for the generation of synthetic antibodies.

    PubMed

    Jeon, Jouhyun; Arnold, Roland; Singh, Fateh; Teyra, Joan; Braun, Tatjana; Kim, Philip M

    2016-04-01

    The identification of structured units in a protein sequence is an important first step for most biochemical studies. Importantly for this study, the identification of stable structured regions is a crucial first step to generate novel synthetic antibodies. While many approaches to find domains or predict structured regions exist, important limitations remain, such as the optimization of domain boundaries and the lack of identification of non-domain structured units. Moreover, no integrated tool exists to find and optimize structural domains within protein sequences. Here, we describe a new tool, PAT (http://www.kimlab.org/software/pat) that can efficiently identify both domains (with optimized boundaries) and non-domain putative structured units. PAT automatically analyzes various structural properties, evaluates the folding stability, and reports possible structural domains in a given protein sequence. For reliability evaluation of PAT, we applied PAT to identify antibody target molecules based on the notion that soluble and well-defined protein secondary and tertiary structures are appropriate target molecules for synthetic antibodies. PAT is an efficient and sensitive tool to identify structured units. A performance analysis shows that PAT can characterize structurally well-defined regions in a given sequence and outperforms other efforts to define reliable boundaries of domains. Specifically, PAT successfully identifies experimentally confirmed target molecules for antibody generation. PAT also offers the pre-calculated results of 20,210 human proteins to accelerate common queries. PAT can therefore help to investigate large-scale structured domains and improve the success rate for synthetic antibody generation.

  12. Target enrichment and high-throughput sequencing of 80 ribosomal protein genes to identify mutations associated with Diamond-Blackfan anaemia.

    PubMed

    Gerrard, Gareth; Valgañón, Mikel; Foong, Hui En; Kasperaviciute, Dalia; Iskander, Deena; Game, Laurence; Müller, Michael; Aitman, Timothy J; Roberts, Irene; de la Fuente, Josu; Foroni, Letizia; Karadimitris, Anastasios

    2013-08-01

    Diamond-Blackfan anaemia (DBA) is caused by inactivating mutations in ribosomal protein (RP) genes, with mutations in 13 of the 80 RP genes accounting for 50-60% of cases. The remaining 40-50% of cases may harbour mutations in one of the remaining RP genes, but the very low frequencies render conventional genetic screening challenging. We, therefore, applied custom enrichment technology combined with high-throughput sequencing to screen all 80 RP genes. Using this approach, we identified and validated inactivating mutations in 15/17 (88%) DBA patients. Target enrichment combined with high-throughput sequencing is a robust and improved methodology for the genetic diagnosis of DBA. © 2013 John Wiley & Sons Ltd.

  13. The Value of DNA Sequencing - TCGA

    Cancer.gov

    DNA sequencing: what it tells us about DNA changes in cancer, how looking across many tumors will help to identify meaningful changes and potential drug targets, and how genomics is changing the way we think about cancer.

  14. Reducing animal sequencing redundancy by preferentially selecting animals with low-frequency haplotypes

    USDA-ARS's Scientific Manuscript database

    Many studies leverage targeted whole genome sequencing (WGS) experiments in order to identify rare and causal variants within populations. As a natural consequence of experimental design, many of these surveys tend to sequence redundant haplotype segments due to high frequency in the base population...

  15. Widespread Long Noncoding RNAs as Endogenous Target Mimics for MicroRNAs in Plants

    PubMed Central

    Wu, Hua-Jun; Wang, Zhi-Min; Wang, Meng; Wang, Xiu-Jie

    2013-01-01

    Target mimicry is a recently identified regulatory mechanism for microRNA (miRNA) functions in plants in which the decoy RNAs bind to miRNAs via complementary sequences and therefore block the interaction between miRNAs and their authentic targets. Both endogenous decoy RNAs (miRNA target mimics) and engineered artificial RNAs can induce target mimicry effects. Yet until now, only the Induced by Phosphate Starvation1 RNA has been proven to be a functional endogenous microRNA target mimic (eTM). In this work, we developed a computational method and systematically identified intergenic or noncoding gene-originated eTMs for 20 conserved miRNAs in Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). The predicted miRNA binding sites were well conserved among eTMs of the same miRNA, whereas sequences outside of the binding sites varied a lot. We proved that the eTMs of miR160 and miR166 are functional target mimics and identified their roles in the regulation of plant development. The effectiveness of eTMs for three other miRNAs was also confirmed by transient agroinfiltration assay. PMID:23429259

  16. The genome sequence of a widespread apex predator, the golden eagle (Aquila chrysaetos)

    Treesearch

    Jacqueline M. Doyle; Todd E. Katzner; Peter H. Bloom; Yanzhu Ji; Bhagya K. Wijayawardena; J. Andrew DeWoody; Ludovic Orlando

    2014-01-01

    Biologists routinely use molecular markers to identify conservation units, to quantify genetic connectivity, to estimate population sizes, and to identify targets of selection. Many imperiled eagle populations require such efforts and would benefit from enhanced genomic resources. We sequenced, assembled, and annotated the first eagle genome using DNA from a male...

  17. Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris. Results Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs, 114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies. Conclusions This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes. PMID:22394504

  18. Genome-wide localization and expression profiling establish Sp2 as a sequence-specific transcription factor regulating vitally important genes

    PubMed Central

    Terrados, Gloria; Finkernagel, Florian; Stielow, Bastian; Sadic, Dennis; Neubert, Juliane; Herdt, Olga; Krause, Michael; Scharfe, Maren; Jarek, Michael; Suske, Guntram

    2012-01-01

    The transcription factor Sp2 is essential for early mouse development and for proliferation of mouse embryonic fibroblasts in culture. Yet its mechanisms of action and its target genes are largely unknown. In this study, we have combined RNA interference, in vitro DNA binding, chromatin immunoprecipitation sequencing and global gene-expression profiling to investigate the role of Sp2 for cellular functions, to define target sites and to identify genes regulated by Sp2. We show that Sp2 is important for cellular proliferation, that it binds to GC-boxes, and that it occupies proximal promoters of genes essential for vital cellular processes including gene expression, replication, metabolism and signalling. Moreover, we identified important key target genes and cellular pathways that are directly regulated by Sp2. Most significantly, Sp2 binds and activates numerous sequence-specific transcription factor and co-activator genes, and represses the whole battery of cholesterol synthesis genes. Our results establish Sp2 as a sequence-specific regulator of vitally important genes. PMID:22684502

  19. Novel mutations in CRB1 gene identified in a Chinese pedigree with retinitis pigmentosa by targeted capture and next generation sequencing

    PubMed Central

    Lo, David; Weng, Jingning; Liu, Xiaohong; Yang, Juhua; He, Fen; Wang, Yun; Liu, Xuyang

    2016-01-01

    PURPOSE To detect the disease-causing gene in a Chinese pedigree with autosomal-recessive retinitis pigmentosa (ARRP). METHODS All subjects in this family underwent a complete ophthalmic examination. Targeted-capture next generation sequencing (NGS) was performed on the proband to detect variants. All variants were verified in the remaining family members by PCR amplification and Sanger sequencing. RESULTS All the affected subjects in this pedigree were diagnosed with retinitis pigmentosa (RP). The compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations in the Crumbs homolog 1 (CRB1) gene were identified in all the affected patients but not in the unaffected individuals in this family. These mutations were inherited from their parents, respectively. CONCLUSION The novel compound heterozygous mutations in CRB1 were identified in a Chinese pedigree with ARRP using targeted-capture next generation sequencing. After evaluating the significant heredity and impaired protein function, the compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations are the causal genes of early onset ARRP in this pedigree. To the best of our knowledge, there is no previous report regarding the compound mutations. PMID:27806333

  20. ARID1B alterations identify aggressive tumors in neuroblastoma.

    PubMed

    Lee, Soo Hyun; Kim, Jung-Sun; Zheng, Siyuan; Huse, Jason T; Bae, Joon Seol; Lee, Ji Won; Yoo, Keon Hee; Koo, Hong Hoe; Kyung, Sungkyu; Park, Woong-Yang; Sung, Ki W

    2017-07-11

    Targeted panel sequencing was performed to determine molecular targets and biomarkers in 72 children with neuroblastoma. Frequent genetic alterations were detected in ALK (16.7%), BRCA1 (13.9%), ATM (12.5%), and PTCH1 (11.1%) in an 83-gene panel. Molecular targets for targeted therapy were identified in 16 of 72 patients (22.2%). Two-thirds of ALK mutations were known to increase sensitivity to ALK inhibitors. Sequence alterations in ARID1B were identified in 5 of 72 patients (6.9%). Four of five ARID1B alterations were detected in tumors of high-risk patients. Two of five patients with ARID1B alterations died of disease progression. Relapse-free survival was lower in patients with ARID1B alterations than in those without (p = 0.01). In analysis confined to high-risk patients, 3-year overall survival was lower in patients with an ARID1B alteration (33.3 ± 27.2%) or MYCN amplification (30.0 ± 23.9%) than in those with neither ARID1B alteration nor MYCN amplification (90.5 ± 6.4%, p = 0.05). These results provide possibilities for targeted therapy and a new biomarker identifying a subgroup of neuroblastoma patients with poor prognosis.
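
    The patient-level percentages quoted in this abstract can be checked against the cohort size of 72. The following is a minimal sketch in Python; the helper name `pct` is an illustrative assumption and does not come from the record.

```python
def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


COHORT = 72  # children with neuroblastoma, as stated in the abstract

# Patients with an identified molecular target for targeted therapy.
print(pct(16, COHORT))
# Patients with sequence alterations in ARID1B.
print(pct(5, COHORT))
```

    Both values agree with the 22.2% and 6.9% reported above.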

  1. P41 IDENTIFICATION OF GLIOMA SPECIFIC APTAMER TARGETS

    PubMed Central

    Arora, Mohit; Alder, Jane; Lawrence, Clare; Davis, Charles; Dawson, Tim; Hall, Greg; Shaw, Lisa

    2014-01-01

    INTRODUCTION: Aptamers are in vitro generated DNA and RNA sequences which are randomly created as a library, with multiple permutations and combinations. These are then exposed to the target structure against which we want an aptamer ‘selected’ using Systematic Evolution of Ligands by Exponential enrichment (SELEX). METHOD: Commercially available glioma and glial cell lines and in-house generated primary glioma cultures were used. Modified aptamers based on published sequences against glioma cell lines and newly generated sequences were used in the project to identify their binding targets. Cy3- or biotin-conjugated aptamers were incubated with live glioma cell cultures and imaged using confocal or light microscopy. To determine the target ligand, aptamers were then reacted with glial cell lysate and subjected to precipitation using streptavidin agarose beads and SDS polyacrylamide electrophoresis. Proteins were analysed by mass spectroscopy. RESULTS: Known and unknown aptamer protein ligands were co-precipitated. Ku70, Ku80 were precipitated along with nucleolin and related proteins. CONCLUSION: The aptamer has shown preferential binding to glioma cells and could act as a delivery system for therapeutic payloads. The aptamer targets Ku70 and Ku80, which are known to be over expressed in other forms of cancer but their role in gliomagenesis has not been fully elucidated. Other novel proteins have also been identified. Thus the aptamer co-precipitation technique has identified potential glioma biomarkers that may be of clinical significance.

  2. On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution

    NASA Astrophysics Data System (ADS)

    Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein

    2018-07-01

    One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.
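
    The idea of calling a mutation only when its read count rises above a negative binomial noise background can be sketched as follows. This is a simplified illustration, not the authors' model: it uses a single negative binomial distribution rather than an aggregate of several, and the function names, the dispersion parameter `r`, the error rate, and the threshold `alpha` are all assumptions chosen for the example.

```python
import math


def nb_pmf(k, mu, r):
    """Negative binomial pmf with mean mu and dispersion (size) r."""
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu))
             + k * math.log(mu / (r + mu)))
    return math.exp(log_p)


def tail_prob(k_obs, mu, r):
    """P(X >= k_obs) under the background noise model."""
    below = sum(nb_pmf(k, mu, r) for k in range(k_obs))
    return max(0.0, 1.0 - below)


def above_noise(alt_reads, depth, error_rate, r, alpha=1e-6):
    """True if alt_reads is unlikely to arise from sequencing noise alone."""
    mu = depth * error_rate  # expected number of erroneous reads at this site
    return tail_prob(alt_reads, mu, r) < alpha


# Hypothetical site: 10,000x depth, assumed error rate of 1e-4 per read.
print(above_noise(50, 10_000, 1e-4, r=2.0))
print(above_noise(1, 10_000, 1e-4, r=2.0))
```

    In practice the mean and dispersion would be estimated from control data at matching depth; the values here are placeholders.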

  3. On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution

    NASA Astrophysics Data System (ADS)

    Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein

    2017-12-01

    One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.

  4. Next-generation sequencing identifies a novel compound heterozygous mutation in MYO7A in a Chinese patient with Usher Syndrome 1B.

    PubMed

    Wei, Xiaoming; Sun, Yan; Xie, Jiansheng; Shi, Quan; Qu, Ning; Yang, Guanghui; Cai, Jun; Yang, Yi; Liang, Yu; Wang, Wei; Yi, Xin

    2012-11-20

    Targeted enrichment and next-generation sequencing (NGS) have been employed for detection of genetic diseases. The purpose of this study was to validate the accuracy and sensitivity of our method for comprehensive mutation detection of hereditary hearing loss, and identify inherited mutations involved in human deafness accurately and economically. To make genetic diagnosis of hereditary hearing loss simple and timesaving, we designed a 0.60 MB array-based chip containing 69 nuclear genes and mitochondrial genome responsible for human deafness and conducted NGS toward ten patients with five known mutations and a Chinese family with hearing loss (never genetically investigated). Ten patients with five known mutations were sequenced using next-generation sequencing to validate the sensitivity of the method. We identified four known mutations in two nuclear deafness causing genes (GJB2 and SLC26A4), one in mitochondrial DNA. We then performed this method to analyze the variants in a Chinese family with hearing loss and identified compound heterozygosity for two novel mutations in gene MYO7A. The compound heterozygosity identified in gene MYO7A causes Usher Syndrome 1B with severe phenotypes. The results support that the combination of enrichment of targeted genes and next-generation sequencing is a valuable molecular diagnostic tool for hereditary deafness and suitable for clinical application. Copyright © 2012 Elsevier B.V. All rights reserved.

  5. Comparative Analysis of Fruit Ripening-Related miRNAs and Their Targets in Blueberry Using Small RNA and Degradome Sequencing

    PubMed Central

    Hou, Yanming; Zhai, Lulu; Li, Xuyan; Xue, Yu; Wang, Jingjing; Yang, Pengjie; Cao, Chunmei; Li, Hongxue; Cui, Yuhai; Bian, Shaomin

    2017-01-01

    MicroRNAs (miRNAs) play vital roles in the regulation of fruit development and ripening. Blueberry is an important small berry fruit crop with economical and nutritional value. However, nothing is known about the miRNAs and their targets involved in blueberry fruit ripening. In this study, using high-throughput sequencing of small RNAs, 84 known miRNAs belonging to 28 families and 16 novel miRNAs were identified in white fruit (WF) and blue fruit (BF) libraries, which represent fruit ripening onset and in progress, respectively. Among them, 41 miRNAs were shown to be differentially expressed during fruit maturation, and 16 miRNAs representing 16 families were further chosen to validate the sRNA sequencing data by stem-loop qRT-PCR. Meanwhile, 178 targets were identified for 41 known and 7 novel miRNAs in WF and BF libraries using degradome sequencing, and targets of miR160 were validated using RLM-RACE (RNA Ligase-Mediated (RLM)-Rapid Amplification of cDNA Ends) approach. Moreover, the expression patterns of 6 miRNAs and their targets were examined during fruit development and ripening. Finally, integrative analysis of miRNAs and their targets revealed a complex miRNA-mRNA regulatory network involving a wide variety of biological processes. The findings will facilitate future investigations of the miRNA-mediated mechanisms that regulate fruit development and ripening in blueberry. PMID:29257112

  6. Designing pH induced fold switch in proteins

    NASA Astrophysics Data System (ADS)

    Baruah, Anupaul; Biswas, Parbati

    2015-05-01

    This work investigates the computational design of a pH induced protein fold switch based on a self-consistent mean-field approach by identifying the ensemble averaged characteristics of sequences that encode a fold switch. The primary challenge to balance the alternative sets of interactions present in both target structures is overcome by simultaneously optimizing two foldability criteria corresponding to two target structures. The change in pH is modeled by altering the residual charge on the amino acids. The energy landscape of the fold switch protein is found to be double funneled. The fold switch sequences stabilize the interactions of the sites with similar relative surface accessibility in both target structures. Fold switch sequences have low sequence complexity and hence lower sequence entropy. The pH induced fold switch is mediated by attractive electrostatic interactions rather than hydrophobic-hydrophobic contacts. This study may provide valuable insights to the design of fold switch proteins.

  7. Global Identification of MicroRNAs and Their Targets in Barley under Salinity Stress

    PubMed Central

    Cui, Licao; Feng, Kewei; Liu, Fuyan; Du, Xianghong; Tong, Wei; Nie, Xiaojun; Ji, Wanquan; Weining, Song

    2015-01-01

    Salinity is a major limiting factor for agricultural production worldwide. A better understanding of the mechanisms of salinity stress response will aid efforts to improve plant salt tolerance. In this study, a combination of small RNA and mRNA degradome sequencing was used to identify salinity responsive-miRNAs and their targets in barley. A total of 152 miRNAs belonging to 126 families were identified, of which 44 were found to be salinity responsive with 30 up-regulated and 25 down-regulated respectively. The majority of the salinity-responsive miRNAs were up-regulated at the 8h time point, while down-regulated at the 3h and 27h time points. The targets of these miRNAs were further detected by degradome sequencing coupled with bioinformatics prediction. Finally, qRT-PCR was used to validate the identified miRNA and their targets. Our study systematically investigated the expression profile of miRNA and their targets in barley during salinity stress phase, which can contribute to understanding how miRNAs respond to salinity stress in barley and other cereal crops. PMID:26372557

  8. Identification of miRNAs Involved in Stolon Formation in Tulipa edulis by High-Throughput Sequencing

    PubMed Central

    Zhu, Zaibiao; Miao, Yuanyuan; Guo, Qiaosheng; Zhu, Yunhao; Yang, Xiaohua; Sun, Yuan

    2016-01-01

    MicroRNAs (miRNAs) are a class of endogenous, non-coding small RNAs that play an important role in transcriptional and post-transcriptional gene regulation. However, the sequence information and functions of miRNAs are still unexplored in Tulipa edulis. In this study, high-throughput sequencing was used to identify small RNAs in stolon formation stages (stage 1, 2, and 3) in T. edulis. A total of 12,890,912, 12,182,122, and 12,061,434 clean reads were obtained from stage 1, 2, and 3, respectively. Among the reads, 88 conserved miRNAs and 70 novel miRNAs were identified. Target prediction of 122 miRNAs resulted in 531 potential target genes. Nr, Swiss-Prot, GO, COG, and KEGG annotations revealed that these target genes participate in many biologic and metabolic processes. Moreover, qRT-PCR was performed to analyze the expression levels of the miRNAs and target genes in stolon formation. The results revealed that miRNAs play a key role in T. edulis stolon formation. PMID:27446103

  9. Identification of miRNAs Involved in Stolon Formation in Tulipa edulis by High-Throughput Sequencing.

    PubMed

    Zhu, Zaibiao; Miao, Yuanyuan; Guo, Qiaosheng; Zhu, Yunhao; Yang, Xiaohua; Sun, Yuan

    2016-01-01

    MicroRNAs (miRNAs) are a class of endogenous, non-coding small RNAs that play an important role in transcriptional and post-transcriptional gene regulation. However, the sequence information and functions of miRNAs are still unexplored in Tulipa edulis. In this study, high-throughput sequencing was used to identify small RNAs in stolon formation stages (stage 1, 2, and 3) in T. edulis. A total of 12,890,912, 12,182,122, and 12,061,434 clean reads were obtained from stage 1, 2, and 3, respectively. Among the reads, 88 conserved miRNAs and 70 novel miRNAs were identified. Target prediction of 122 miRNAs resulted in 531 potential target genes. Nr, Swiss-Prot, GO, COG, and KEGG annotations revealed that these target genes participate in many biologic and metabolic processes. Moreover, qRT-PCR was performed to analyze the expression levels of the miRNAs and target genes in stolon formation. The results revealed that miRNAs play a key role in T. edulis stolon formation.

  10. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets

    DOE PAGES

    Schulze, Kornelius; Imbeaud, Sandrine; Letouzé, Eric; ...

    2015-03-30

    Our genomic analyses promise to improve tumor characterization to optimize personalized treatment for patients with hepatocellular carcinoma (HCC). Exome sequencing analysis of 243 liver tumors identified mutational signatures associated with specific risk factors, mainly combined alcohol and tobacco consumption and exposure to aflatoxin B1. We identified 161 putative driver genes associated with 11 recurrently altered pathways. Associations of mutations defined 3 groups of genes related to risk factors and centered on CTNNB1 (alcohol), TP53 (hepatitis B virus, HBV) and AXIN1. These analyses according to tumor stage progression identified TERT promoter mutation as an early event, whereas FGF3, FGF4, FGF19 or CCND1 amplification and TP53 and CDKN2A alterations appeared at more advanced stages in aggressive tumors. In 28% of the tumors, we identified genetic alterations potentially targetable by US Food and Drug Administration (FDA)–approved drugs. Finally, we identified risk factor–specific mutational signatures and defined the extensive landscape of altered genes and pathways in HCC, which will be useful to design clinical trials for targeted therapy.

  11. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schulze, Kornelius; Imbeaud, Sandrine; Letouzé, Eric

    Our genomic analyses promise to improve tumor characterization to optimize personalized treatment for patients with hepatocellular carcinoma (HCC). Exome sequencing analysis of 243 liver tumors identified mutational signatures associated with specific risk factors, mainly combined alcohol and tobacco consumption and exposure to aflatoxin B1. We identified 161 putative driver genes associated with 11 recurrently altered pathways. Associations of mutations defined 3 groups of genes related to risk factors and centered on CTNNB1 (alcohol), TP53 (hepatitis B virus, HBV) and AXIN1. These analyses according to tumor stage progression identified TERT promoter mutation as an early event, whereas FGF3, FGF4, FGF19 or CCND1 amplification and TP53 and CDKN2A alterations appeared at more advanced stages in aggressive tumors. In 28% of the tumors, we identified genetic alterations potentially targetable by US Food and Drug Administration (FDA)–approved drugs. Finally, we identified risk factor–specific mutational signatures and defined the extensive landscape of altered genes and pathways in HCC, which will be useful to design clinical trials for targeted therapy.

  12. Design and Construction of a Single-Tube, LATE-PCR, Multiplex Endpoint Assay with Lights-On/Lights-Off Probes for the Detection of Pathogens Associated with Sepsis

    PubMed Central

    Carver-Brown, Rachel K.; Reis, Arthur H.; Rice, Lisa M.; Czajka, John W.; Wangh, Lawrence J.

    2012-01-01

    Aims. The goal of this study was to construct a single tube molecular diagnostic multiplex assay for the detection of microbial pathogens commonly associated with septicemia, using LATE-PCR and Lights-On/Lights-Off probe technology. Methods and Results. The assay described here identified pathogens associated with sepsis by amplification and analysis of the 16S ribosomal DNA gene sequence for bacteria and specific gene sequences for fungi. A sequence from an unidentified gene in Lactococcus lactis subsp. cremoris served as a positive control for assay function. LATE-PCR was used to generate single-stranded amplicons that were then analyzed at endpoint over a wide temperature range in a specific fluorescent color. Each bacterial target was identified by its pattern of hybridization to Lights-On/Lights-Off probes derived from molecular beacons. Complex mixtures of targets were also detected. Conclusions. All microbial targets were identified in samples containing low starting copy numbers of pathogen genomic DNA, both as individual targets and in complex mixtures. Significance and Impact of the Study. This assay uses new technology to achieve an advance in the field of molecular diagnostics: a single-tube multiplex assay for identification of pathogens commonly associated with sepsis. PMID:23326668

  13. Method of Identifying a Base in a Nucleic Acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    1999-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  14. Identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2005-02-08

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  15. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites.

    PubMed

    Betel, Doron; Koppal, Anjali; Agius, Phaedra; Sander, Chris; Leslie, Christina

    2010-01-01

    mirSVR is a new machine learning method for ranking microRNA target sites by a down-regulation score. The algorithm trains a regression model on sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites.

  16. In-silico identification of miRNAs and their regulating target functions in Ocimum basilicum.

    PubMed

    Singh, Noopur; Sharma, Ashok

    2014-12-01

    microRNA is known to play an important role in growth and development of the plants and also in environmental stress. Ocimum basilicum (Basil) is a well known herb for its medicinal properties. In this study, we used in-silico approaches to identify miRNAs and their targets regulating different functions in O. basilicum using EST approach. Additionally, functional annotation, gene ontology and pathway analysis of identified target transcripts were also done. Seven miRNA families were identified. Meaningful regulations of target transcript by identified miRNAs were computationally evaluated. Four miRNA families have been reported by us for the first time from the Lamiaceae. Our results further confirmed that uracil was the predominant base in the first positions of identified mature miRNA sequence, while adenine and uracil were predominant in pre-miRNA sequences. Phylogenetic analysis was carried out to determine the relation between O. basilicum and other plant pre-miRNAs. Thirteen potential targets were evaluated for 4 miRNA families. Majority of the identified target transcripts regulated by miRNAs showed response to stress. miRNA 5021 was also indicated for playing an important role in the amino acid metabolism and co-factor metabolism in this plant. To the best of our knowledge this is the first in silico study describing miRNAs and their regulation in different metabolic pathways of O. basilicum. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. A new comprehensive method for detection of livestock-related pathogenic viruses using a target enrichment system.

    PubMed

    Oba, Mami; Tsuchiaka, Shinobu; Omatsu, Tsutomu; Katayama, Yukie; Otomaru, Konosuke; Hirata, Teppei; Aoki, Hiroshi; Murata, Yoshiteru; Makino, Shinji; Nagai, Makoto; Mizutani, Tetsuya

    2018-01-08

    We tested usefulness of a target enrichment system SureSelect, a comprehensive viral nucleic acid detection method, for rapid identification of viral pathogens in feces samples of cattle, pigs and goats. This system enriches nucleic acids of target viruses in clinical/field samples by using a library of biotinylated RNAs with sequences complementary to the target viruses. The enriched nucleic acids are amplified by PCR and subjected to next generation sequencing to identify the target viruses. In many samples, SureSelect target enrichment method increased efficiencies for detection of the viruses listed in the biotinylated RNA library. Furthermore, this method enabled us to determine nearly full-length genome sequence of porcine parainfluenza virus 1 and greatly increased Breadth, a value indicating the ratio of the mapping consensus length in the reference genome, in pig samples. Our data showed usefulness of SureSelect target enrichment system for comprehensive analysis of genomic information of various viruses in field samples. Copyright © 2017 Elsevier Inc. All rights reserved.
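
    The Breadth value mentioned above, the fraction of the reference genome covered by mapped consensus sequence, can be illustrated with a short sketch. The function name, the half-open interval convention, and the example coordinates are assumptions made for illustration; the record does not describe how the authors computed the value.

```python
def breadth(intervals, reference_length):
    """Fraction of the reference covered by the union of [start, end) intervals."""
    covered = 0
    current_end = 0  # rightmost reference position counted so far
    for start, end in sorted(intervals):
        start = max(start, current_end)   # skip the part already counted
        end = min(end, reference_length)  # clip to the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / reference_length


# Hypothetical mapped regions on a 1,000-base reference; the first two overlap.
regions = [(0, 100), (50, 150), (300, 400)]
print(breadth(regions, 1000))
```

    Overlapping regions are counted once, so the value stays between 0 and 1 regardless of sequencing depth.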

  18. Molecular testing for familial hypercholesterolaemia-associated mutations in a UK-based cohort: development of an NGS-based method and comparison with multiplex polymerase chain reaction and oligonucleotide arrays.

    PubMed

    Reiman, Anne; Pandey, Sarojini; Lloyd, Kate L; Dyer, Nigel; Khan, Mike; Crockard, Martin; Latten, Mark J; Watson, Tracey L; Cree, Ian A; Grammatopoulos, Dimitris K

    2016-11-01

    Background Detection of disease-associated mutations in patients with familial hypercholesterolaemia is crucial for early interventions to reduce risk of cardiovascular disease. Screening for these mutations represents a methodological challenge since more than 1200 different causal mutations in the low-density lipoprotein receptor have been identified. A number of methodological approaches have been developed for screening by clinical diagnostic laboratories. Methods Using primers targeting the low-density lipoprotein receptor, apolipoprotein B, and proprotein convertase subtilisin/kexin type 9, we developed a novel Ion Torrent-based targeted re-sequencing method. We validated this in a small West Midlands (UK) cohort of 58 patients screened in parallel with other mutation-targeting methods, such as multiplex polymerase chain reaction (Elucigene FH20), oligonucleotide arrays (Randox familial hypercholesterolaemia array) or the Illumina next-generation sequencing platform. Results In this small cohort, the next-generation sequencing method achieved excellent analytical performance characteristics and showed 100% and 89% concordance with the Randox array and the Elucigene FH20 assay, respectively. Investigation of the discrepant results identified two cases of mutation misclassification of the Elucigene FH20 multiplex polymerase chain reaction assay. A number of novel mutations not previously reported were also identified by the next-generation sequencing method. Conclusions Ion Torrent-based next-generation sequencing can deliver a suitable alternative for the molecular investigation of familial hypercholesterolaemia patients, especially when comprehensive mutation screening for rare or unknown mutations is required.

  19. Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

    PubMed Central

    Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

    2012-01-01

    High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309

  20. Analysis of the ergosterol biosynthesis pathway cloning, molecular characterization and phylogeny of lanosterol 14 α-demethylase (ERG11) gene of Moniliophthora perniciosa.

    PubMed

    de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles

    2014-10-01

    The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches' broom disease of cocoa, causes extensive damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied, and the lanosterol 14α-demethylase gene (ERG11), which encodes the main enzyme of this pathway and is a target for fungicides, was cloned and characterized molecularly, and its phylogeny was analyzed. ERG11 genomic DNA and cDNA were characterized, and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea.

  1. Analysis of the ergosterol biosynthesis pathway cloning, molecular characterization and phylogeny of lanosterol 14 α-demethylase (ERG11) gene of Moniliophthora perniciosa

    PubMed Central

    de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles

    2014-01-01

    The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches’ broom disease of cocoa, causes extensive damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied, and the lanosterol 14α-demethylase gene (ERG11), which encodes the main enzyme of this pathway and is a target for fungicides, was cloned and characterized molecularly, and its phylogeny was analyzed. ERG11 genomic DNA and cDNA were characterized, and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea. PMID:25505843

  2. SIMBAD : a sequence-independent molecular-replacement pipeline

    DOE PAGES

    Simpkin, Adam J.; Simkovic, Felix; Thomas, Jens M. H.; ...

    2018-06-08

    The conventional approach to finding structurally similar search models for use in molecular replacement (MR) is to use the sequence of the target to search against those of a set of known structures. Sequence similarity often correlates with structure similarity. Given sufficient similarity, a known structure correctly positioned in the target cell by the MR process can provide an approximation to the unknown phases of the target. An alternative approach to identifying homologous structures suitable for MR is to exploit the measured data directly, comparing the lattice parameters or the experimentally derived structure-factor amplitudes with those of known structures. Here, SIMBAD, a new sequence-independent MR pipeline which implements these approaches, is presented. SIMBAD can identify cases of contaminant crystallization and other mishaps such as mistaken identity (swapped crystallization trays), as well as solving unsequenced targets and providing a brute-force approach where sequence-dependent search-model identification may be nontrivial, for example because of conformational diversity among identifiable homologues. The program implements a three-step pipeline to efficiently identify a suitable search model in a database of known structures. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether or not a homologue exists in the same crystal form. The second step is designed to screen the target data for the presence of a crystallized contaminant, a not uncommon occurrence in macromolecular crystallography. Solving structures with MR in such cases can remain problematic for many years, since the search models, which are assumed to be similar to the structure of interest, are not necessarily related to the structures that have actually crystallized. To cater for this eventuality, SIMBAD rapidly screens the data against a database of known contaminant structures. Where the first two steps fail to yield a solution, a final step in SIMBAD can be invoked to perform a brute-force search of a nonredundant PDB database provided by the MoRDa MR software. Through early-access usage of SIMBAD, this approach has solved novel cases that have otherwise proved difficult to solve.
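
    The first step of the pipeline described above, a lattice-parameter search, can be illustrated with a minimal sketch. This is not SIMBAD's implementation: it assumes a plain comparison of the six raw cell parameters within a relative tolerance, whereas a real search would first bring cells to a comparable reduced setting. The UnitCell class, the function names, the 5% tolerance and the database entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCell:
    """Cell edges in angstroms and angles in degrees (hypothetical container)."""
    a: float
    b: float
    c: float
    alpha: float
    beta: float
    gamma: float

    def as_tuple(self):
        return (self.a, self.b, self.c, self.alpha, self.beta, self.gamma)


def cell_distance(target, candidate):
    """Largest relative deviation across the six cell parameters."""
    return max(
        abs(t - c) / t
        for t, c in zip(target.as_tuple(), candidate.as_tuple())
    )


def lattice_search(target, database, tolerance=0.05):
    """Return (entry_id, distance) pairs within the tolerance, closest first."""
    hits = []
    for entry_id, cell in database.items():
        distance = cell_distance(target, cell)
        if distance <= tolerance:
            hits.append((entry_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up entries standing in for a database of known structures.
database = {
    "entry_A": UnitCell(78.9, 78.9, 37.0, 90.0, 90.0, 90.0),
    "entry_B": UnitCell(50.0, 60.0, 70.0, 90.0, 105.0, 90.0),
    "entry_C": UnitCell(79.5, 79.5, 37.8, 90.0, 90.0, 90.0),
}
target = UnitCell(79.0, 79.0, 37.2, 90.0, 90.0, 90.0)
hits = lattice_search(target, database)
for entry_id, distance in hits:
    print(f"{entry_id}: max relative deviation {distance:.4f}")
```

    Entries that pass this cheap filter would then be tried as MR search models; the contaminant screen and the brute-force search described in the abstract are only needed when this step returns nothing useful.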

  3. SIMBAD : a sequence-independent molecular-replacement pipeline

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simpkin, Adam J.; Simkovic, Felix; Thomas, Jens M. H.

    The conventional approach to finding structurally similar search models for use in molecular replacement (MR) is to use the sequence of the target to search against those of a set of known structures. Sequence similarity often correlates with structure similarity. Given sufficient similarity, a known structure correctly positioned in the target cell by the MR process can provide an approximation to the unknown phases of the target. An alternative approach to identifying homologous structures suitable for MR is to exploit the measured data directly, comparing the lattice parameters or the experimentally derived structure-factor amplitudes with those of known structures. Here, SIMBAD, a new sequence-independent MR pipeline which implements these approaches, is presented. SIMBAD can identify cases of contaminant crystallization and other mishaps such as mistaken identity (swapped crystallization trays), as well as solving unsequenced targets and providing a brute-force approach where sequence-dependent search-model identification may be nontrivial, for example because of conformational diversity among identifiable homologues. The program implements a three-step pipeline to efficiently identify a suitable search model in a database of known structures. The first step performs a lattice-parameter search against the entire Protein Data Bank (PDB), rapidly determining whether or not a homologue exists in the same crystal form. The second step is designed to screen the target data for the presence of a crystallized contaminant, a not uncommon occurrence in macromolecular crystallography. Solving structures with MR in such cases can remain problematic for many years, since the search models, which are assumed to be similar to the structure of interest, are not necessarily related to the structures that have actually crystallized. To cater for this eventuality, SIMBAD rapidly screens the data against a database of known contaminant structures. Where the first two steps fail to yield a solution, a final step in SIMBAD can be invoked to perform a brute-force search of a nonredundant PDB database provided by the MoRDa MR software. Through early-access usage of SIMBAD, this approach has solved novel cases that have otherwise proved difficult to solve.

  4. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy.

    PubMed

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo; Xu, Ge-Zhi

    2017-01-01

    Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). In the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as causative mutations for FEVR. These deletion variants are associated with a more severe form of FEVR, with tractional retinal detachments, than other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype-phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling.

  5. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    PubMed Central

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose: Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods: To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results: In the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions: We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as causative mutations for FEVR. These deletion variants are associated with a more severe form of FEVR, with tractional retinal detachments, than other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  6. Identification of distant drug off-targets by direct superposition of binding pocket surfaces.

    PubMed

    Schumann, Marcel; Armen, Roger S

    2013-01-01

    Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target ("distant off-targets"). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target ("distant off-target").

  7. In vitro selection using a dual RNA library that allows primerless selection

    PubMed Central

    Jarosch, Florian; Buchner, Klaus; Klussmann, Sven

    2006-01-01

    High affinity target-binding aptamers are identified from random oligonucleotide libraries by an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX). Since the SELEX process includes a PCR amplification step, the randomized region of the oligonucleotide libraries needs to be flanked by two fixed primer binding sequences. These primer binding sites are often difficult to truncate because they may be necessary to maintain the structure of the aptamer or may even be part of the target binding motif. We designed a novel type of RNA library that carries fixed sequences which constrain the oligonucleotides into a partly double-stranded structure, thereby minimizing the risk that the primer binding sequences become part of the target-binding motif. Moreover, the specific design of the library, including the use of tandem RNA Polymerase promoters, allows the selection of oligonucleotides without any primer binding sequences. The library was used to select aptamers to the mirror-image peptide of ghrelin. Ghrelin is a potent stimulator of growth-hormone release and food intake. After selection, the identified aptamer sequences were directly synthesized in their mirror-image configuration. The final 44 nt-Spiegelmer, named NOX-B11-3, blocks ghrelin action in a cell culture assay displaying an IC50 of 4.5 nM at 37°C. PMID:16855281

  8. Identification of microRNA-like RNAs from Curvularia lunata associated with maize leaf spot by bioinformation analysis and deep sequencing.

    PubMed

    Liu, Tong; Hu, John; Zuo, Yuhu; Jin, Yazhong; Hou, Jumei

    2016-04-01

    Deep sequencing of small RNAs is a useful tool to identify novel small RNAs that may be involved in fungal growth and pathogenesis. In this study, we used HiSeq deep sequencing to identify 747,487 unique small RNAs from Curvularia lunata. Among these small RNAs were 1012 microRNA-like RNAs (milRNAs) similar to known microRNAs and 48 potential novel milRNAs without homologs in other organisms, identified using the miRBase© database. We used quantitative PCR to analyze the expression of four of these milRNAs from C. lunata at different developmental stages. The analysis revealed several changes associated with germinating conidia and mycelial growth, suggesting that these milRNAs may play a role in pathogen infection and mycelial growth. A total of 8334 target mRNAs for the 1012 milRNAs and 256 target mRNAs for the 48 novel milRNAs were predicted by computational analysis. These target mRNAs were also subjected to Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis. To our knowledge, this study is the first report of C. lunata's milRNA profiles. This information will provide a better understanding of pathogen development and infection mechanisms.

  9. Implementing targeted region capture sequencing for the clinical detection of Alagille syndrome: An efficient and cost‑effective method.

    PubMed

    Huang, Tianhong; Yang, Guilin; Dang, Xiao; Ao, Feijian; Li, Jiankang; He, Yizhou; Tang, Qiyuan; He, Qing

    2017-11-01

    Alagille syndrome (AGS) is a highly variable, autosomal dominant disease that affects multiple structures including the liver, heart, eyes, bones and face. Targeted region capture sequencing focuses on a panel of known pathogenic genes and provides a rapid, cost‑effective and accurate method for molecular diagnosis. In a Chinese family, this method was used on the proband and Sanger sequencing was applied to validate the candidate mutation. A de novo heterozygous mutation (c.3254_3255insT p.Leu1085PhefsX24) of the jagged 1 gene was identified as the potential disease‑causing gene mutation. In conclusion, the present study suggested that target region capture sequencing is an efficient, reliable and accurate approach for the clinical diagnosis of AGS. Furthermore, these results expand on the understanding of the pathogenesis of AGS.

  10. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    PubMed

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To dissect miRNA-guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 corresponding miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. This study provides a framework for understanding the molecular mechanisms of drought resistance in mulberry.

  11. Sequencing Needs for Viral Diagnostics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, S N; Lam, M; Mulakken, N J

    2004-01-26

    We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
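
    The selection rule described above (conserved among target strains, unique relative to other species) can be sketched with exact k-mer matching. This is a simplified illustration, not the authors' pipeline: it assumes fixed-length exact matches on one strand only, and the function names, the toy sequences and the choice of k = 5 are hypothetical.

```python
def kmers(sequence, k):
    """All distinct substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def candidate_signatures(target_genomes, neighbor_genomes, k):
    """k-mers present in every target genome and absent from all neighbors."""
    if not target_genomes:
        raise ValueError("at least one target genome is required")
    # Conservation: keep only k-mers shared by every target strain.
    conserved = set.intersection(*(kmers(g, k) for g in target_genomes))
    # Uniqueness: discard anything seen in a near-neighbor genome.
    background = set().union(*(kmers(g, k) for g in neighbor_genomes))
    return conserved - background


# Toy sequences, far shorter than any real genome.
targets = ["ACGTACGGA", "TTACGGACC", "GACGGATTT"]
neighbors = ["CCACGGTTT"]
signatures = candidate_signatures(targets, neighbors, k=5)
print(sorted(signatures))
```

    Adding target genomes can only shrink the conserved set, and adding near-neighbor genomes can only shrink the unique set, which is why the number of available genomes of each kind limits how trustworthy a predicted signature is.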

  12. Identification of human microRNA targets from isolated argonaute protein complexes.

    PubMed

    Beitzinger, Michaela; Peters, Lasse; Zhu, Jia Yun; Kremmer, Elisabeth; Meister, Gunter

    2007-06-01

    MicroRNAs (miRNAs) constitute a class of small non-coding RNAs that regulate gene expression on the level of translation and/or mRNA stability. Mammalian miRNAs associate with members of the Argonaute (Ago) protein family and bind to partially complementary sequences in the 3' untranslated region (UTR) of specific target mRNAs. Computer algorithms based on factors such as free binding energy or sequence conservation have been used to predict miRNA target mRNAs. Based on such predictions, up to one third of all mammalian mRNAs seem to be under miRNA regulation. However, due to the low degree of complementarity between the miRNA and its target, such computer programs are often imprecise and therefore not very reliable. Here we report the first biochemical identification approach of miRNA targets from human cells. Using highly specific monoclonal antibodies against members of the Ago protein family, we co-immunoprecipitate Ago-bound mRNAs and identify them by cloning. Interestingly, most of the identified targets are also predicted by different computer programs. Moreover, we randomly analyzed six different target candidates and were able to experimentally validate five as miRNA targets. Our data clearly indicate that miRNA targets can be experimentally identified from Ago complexes and therefore provide a new tool to directly analyze miRNA function.

  13. Plastoglobule-Targeting Competence of a Putative Transit Peptide Sequence from Rice Phytoene Synthase 2 in Plastids.

    PubMed

    You, Min Kyoung; Kim, Jin Hwa; Lee, Yeo Jin; Jeong, Ye Sol; Ha, Sun-Hwa

    2016-12-22

    Plastoglobules (PGs) are thylakoid membrane microdomains within plastids that are known as specialized locations of carotenogenesis. Three rice phytoene synthase proteins (OsPSYs) involved in carotenoid biosynthesis have been identified. Here, the N-terminal 80-amino-acid portion of OsPSY2 (PTp) was demonstrated to be a chloroplast-targeting peptide by displaying cytosolic localization of OsPSY2(ΔPTp):mCherry in rice protoplast, in contrast to chloroplast localization of OsPSY2:mCherry in a punctate pattern. The peptide sequence of PTp was predicted to harbor two transmembrane domains eligible for a putative PG-targeting signal. To assess and enhance the PG-targeting ability of PTp, the original PTp DNA sequence (PTp) was modified to a synthetic DNA sequence (stPTp), which had 84.4% similarity to the original sequence. The aim of this modification was to reduce the GC ratio from 75% to 65% and to disentangle the hairpin loop structures of PTp. These two DNA sequences were fused to the sequence of the synthetic green fluorescent protein (sGFP) and drove GFP expression with different efficiencies. In particular, the RNA and protein levels of stPTp-sGFP were slightly improved to 1.4-fold and 1.3-fold more than those of sGFP, respectively. The green fluorescent signals of their mature proteins were all observed as speckle-like patterns with slightly blurred stromal signals in chloroplasts. These discrete green speckles of PTp-sGFP and stPTp-sGFP corresponded exactly to the red fluorescent signal displayed by OsPSY2:mCherry in both etiolated and greening protoplasts and are presumed to correspond to distinct PGs. In conclusion, we identified PTp as a transit peptide sequence facilitating preferential translocation of foreign proteins to PGs, and developed an improved PTp sequence, stPTp, which is expected to be very useful for applications in plant biotechnologies requiring precise micro-compartmental localization in plastids.
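
    The GC ratio mentioned above is the fraction of G and C bases in a DNA sequence. A minimal sketch of that calculation follows; the function name is hypothetical, and the two 20-base sequences are toy inputs built to hit 75% and 65%, not the actual PTp or stPTp sequences.

```python
def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Toy 20-base sequences, constructed only to illustrate the two ratios.
toy_original = "GC" * 7 + "GATATA"    # 15 of 20 bases are G or C
toy_modified = "GC" * 6 + "GATATATA"  # 13 of 20 bases are G or C

print(f"original: {gc_fraction(toy_original):.0%}")
print(f"modified: {gc_fraction(toy_modified):.0%}")
```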

  14. The long tail of molecular alterations in non-small cell lung cancer: a single-institution experience of next-generation sequencing in clinical molecular diagnostics.

    PubMed

    Fumagalli, Caterina; Vacirca, Davide; Rappa, Alessandra; Passaro, Antonio; Guarize, Juliana; Rafaniello Raviele, Paola; de Marinis, Filippo; Spaggiari, Lorenzo; Casadio, Chiara; Viale, Giuseppe; Barberis, Massimo; Guerini-Rocco, Elena

    2018-03-13

    Molecular profiling of advanced non-small cell lung cancers (NSCLC) is essential to identify patients who may benefit from targeted treatments. In recent years, the number of potentially actionable molecular alterations has rapidly increased. Next-generation sequencing allows for the analysis of multiple genes simultaneously. This study aimed to evaluate the feasibility and the throughput of next-generation sequencing in clinical molecular diagnostics of advanced NSCLC. A single-institution cohort of 535 non-squamous NSCLC was profiled using a next-generation sequencing panel targeting 22 actionable and cancer-related genes. 441 non-squamous NSCLC (82.4%) harboured at least one gene alteration, including 340 cases (63.6%) with clinically relevant molecular aberrations. Mutations were detected in all but one gene (FGFR1) of the panel. Recurrent alterations were observed in the KRAS, TP53, EGFR, STK11 and MET genes, whereas the remaining genes were mutated in <5% of the cases. Concurrent mutations were detected in 183 tumours (34.2%), mostly affecting KRAS or EGFR in association with TP53 alterations. The study highlights the feasibility of targeted next-generation sequencing in a clinical setting. The majority of NSCLC harboured mutations in clinically relevant genes, thus identifying patients who might benefit from different targeted therapies. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  15. Identification, characterization and expression analysis of pigeonpea miRNAs in response to Fusarium wilt.

    PubMed

    Hussain, Khalid; Mungikar, Kanak; Kulkarni, Abhijeet; Kamble, Avinash

    2018-05-05

    Upon confrontation with unfavourable conditions, plants invoke a very complex set of biochemical and physiological reactions and alter gene expression patterns to combat the situation. MicroRNAs (miRNAs), a class of small non-coding RNA, contribute extensively to the regulation of gene expression through translation inhibition or degradation of their target mRNAs during such conditions. Therefore, identification of miRNAs and their targets is important for understanding the regulatory networks triggered during stress. Structure and sequence similarity based in silico prediction of miRNAs in the Cajanus cajan L. (pigeonpea) draft genome sequence has been carried out earlier. These annotations also appear in related GenBank genome sequence entries. However, there are no reports available on context dependent miRNA expression and their targets in pigeonpea. Therefore, in the present study we addressed these questions computationally, using pigeonpea EST sequence information. We identified five novel pigeonpea miRNA precursors, their mature forms and targets. Interestingly, only one of these miRNAs (miR169i-3p) was identified earlier in the draft genome sequence. We then validated expression of these miRNAs experimentally. It was also observed that these miRNAs show differential expression patterns in response to Fusarium inoculation, indicating their biotic stress responsive nature. Overall, these results will help towards a better understanding of the regulatory network of defense during pigeonpea-pathogen interactions and the role of miRNAs in the process. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. Probe kit for identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  17. Massively Parallel Sequencing of Patients with Intellectual Disability, Congenital Anomalies and/or Autism Spectrum Disorders with a Targeted Gene Panel

    PubMed Central

    Brett, Maggie; McPherson, John; Zang, Zhi Jiang; Lai, Angeline; Tan, Ee-Shien; Ng, Ivy; Ong, Lai-Choo; Cham, Breana; Tan, Patrick; Rozen, Steve; Tan, Ene-Choo

    2014-01-01

    Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism. PMID:24690944
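
    The coverage figures quoted above (the share of targeted regions reaching at least 20× and the average depth over targets) can be illustrated with a minimal sketch. The function names and the toy per-position depths are assumptions made for illustration only; they are not taken from the study or from any particular pipeline.

```python
from typing import Sequence


def fraction_at_depth(depths: Sequence[int], min_depth: int = 20) -> float:
    """Return the fraction of targeted positions covered by >= min_depth reads."""
    if not depths:
        raise ValueError("no targeted positions supplied")
    return sum(d >= min_depth for d in depths) / len(depths)


def mean_depth(depths: Sequence[int]) -> float:
    """Return the average read depth across the targeted positions."""
    if not depths:
        raise ValueError("no targeted positions supplied")
    return sum(depths) / len(depths)


# Hypothetical per-position depths for ten targeted bases (illustrative only).
toy_depths = [0, 5, 19, 20, 25, 300, 450, 800, 1000, 21]

print(f"fraction at >=20x: {fraction_at_depth(toy_depths):.2f}")
print(f"mean depth: {mean_depth(toy_depths):.1f}x")
```

    In practice the per-position depths would come from an alignment file restricted to the capture targets; the sketch only shows how the two summary statistics relate to that input.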

  18. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    PubMed

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
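
    The demultiplexing step described above (sorting reads by their 10-bp multiplexing identifier and writing FASTA per sample) might be sketched as follows. This is not the BarcodeCrucher implementation: the function names, the exact-match rule for identifiers and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict

MID_LENGTH = 10  # length of the multiplexing identifier, as stated in the abstract


def demultiplex(reads, mid_to_sample):
    """Group reads by the sample whose identifier matches their first 10 bases.

    Assumption: identifiers must match exactly; reads with an unknown
    identifier are collected under the key "unassigned" with the prefix kept.
    """
    by_sample = defaultdict(list)
    for read in reads:
        mid, insert = read[:MID_LENGTH], read[MID_LENGTH:]
        sample = mid_to_sample.get(mid)
        if sample is None:
            by_sample["unassigned"].append(read)
        else:
            by_sample[sample].append(insert)
    return dict(by_sample)


def to_fasta(sample, sequences):
    """Format the trimmed reads of one sample as FASTA records."""
    return "".join(f">{sample}_{i}\n{seq}\n" for i, seq in enumerate(sequences, 1))


# Hypothetical identifiers and reads (illustrative only).
toy_mids = {"ACGTACGTAC": "taxon_A", "TTGGCCAATT": "taxon_B"}
toy_reads = ["ACGTACGTACGGGAAA", "TTGGCCAATTCCCTTT", "NNNNNNNNNNAAAA"]

toy_result = demultiplex(toy_reads, toy_mids)
print(to_fasta("taxon_A", toy_result["taxon_A"]), end="")
```

    A production tool would also need quality filtering and some tolerance for sequencing errors in the identifier, which this sketch leaves out.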

  19. Targeted Amplicon Sequencing (TAS): A Scalable Next-Gen Approach to Multilocus, Multitaxa Phylogenetics

    PubMed Central

    Bybee, Seth M.; Bracken-Grissom, Heather; Haynes, Benjamin D.; Hermansen, Russell A.; Byers, Robert L.; Clement, Mark J.; Udall, Joshua A.; Wilcox, Edward R.; Crandall, Keith A.

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach. PMID:22002916

  20. Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionary conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionary conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr. PMID:23163954

  1. Characterisation of IS153, an IS3-family insertion sequence isolated from Lactobacillus sanfranciscensis and its use for strain differentiation.

    PubMed

    Ehrmann, M A; Vogel, R E

    2001-11-01

    An insertion sequence has been identified in the genome of Lactobacillus sanfranciscensis DSM 20451T as a segment of 1351 nucleotides containing 37-bp imperfect terminal inverted repeats. The sequence of this element encodes two out-of-phase, overlapping open reading frames, orfA and orfB, from which three putative proteins are produced. OrfAB is a transframe protein produced by -1 translational frameshifting between orfA and orfB that is presumed to be the transposase. The large orfAB of this element encodes a 342 amino acid protein that displays similarities with transposases encoded by bacterial insertion sequences belonging to the IS3 family. In L. sanfranciscensis type strain DSM 20451T multiple truncated IS elements were identified. Inverse PCR was used to analyze target sites of four of these elements, but apart from their highly AT-rich character, no sequence specificity has been identified so far. Moreover, no flanking direct repeats were identified. Multiple copies of IS153 were detected by hybridization in other strains of L. sanfranciscensis. The resulting hybridization patterns were shown to differentiate between organisms at strain level, unlike a probe targeted against the 16S rDNA. With a PCR-based approach, IS153 or highly similar sequences were detected in L. acidophilus, L. casei, L. malefermentans, L. plantarum, L. hilgardii, L. collinoides, L. farciminis, L. sakei, L. salivarius and L. reuteri, as well as in Enterococcus faecium, Pediococcus acidilactici and P. pentosaceus.

  2. A Public Health Model for the Molecular Surveillance of HIV Transmission in San Diego, California

    PubMed Central

    May, Susanne; Tweeten, Samantha; Drumright, Lydia; Pacold, Mary E.; Kosakovsky Pond, Sergei L.; Pesano, Rick L.; Lie, Yolanda S.; Richman, Douglas D.; Frost, Simon D.W.; Woelk, Christopher H.; Little, Susan J.

    2009-01-01

    Background Current public health efforts often use molecular technologies to identify and contain communicable disease networks, but not for HIV. Here, we investigate how molecular epidemiology can be used to identify highly-related HIV networks within a population and how voluntary contact tracing of sexual partners can be used to selectively target these networks. Methods We evaluated the use of HIV-1 pol sequences obtained from participants of a community-recruited cohort (n=268) and a primary infection research cohort (n=369) to define highly related transmission clusters and the use of contact tracing to link other individuals (n=36) within these clusters. The presence of transmitted drug resistance was interpreted from the pol sequences (Calibrated Population Resistance v3.0). Results Phylogenetic clustering was conservatively defined when the genetic distance between any two pol sequences was <1%, which identified 34 distinct transmission clusters within the combined community-recruited and primary infection research cohorts containing 160 individuals. Although sequences from the epidemiologically-linked partners represented approximately 5% of the total sequences, they clustered with 60% of the sequences that clustered from the combined cohorts (O.R. 21.7; p<0.01). Major resistance to at least one class of antiretroviral medication was found in 19% of clustering sequences. Conclusions Phylogenetic methods can be used to identify individuals who are within highly related transmission groups, and contact tracing of epidemiologically-linked partners of recently infected individuals can be used to link into previously-defined transmission groups. These methods could be used to implement selectively targeted prevention interventions. PMID:19098493
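
    The clustering rule stated in the Results (two sequences are linked when their genetic distance is below 1%) can be sketched as a simple threshold clustering. The sketch makes several assumptions that are not in the abstract: it uses the uncorrected proportion of differing sites (p-distance) on pre-aligned sequences as a stand-in for whatever distance model the authors used, it joins linked pairs into connected groups, and all names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_clusters(seqs: dict, threshold: float = 0.01) -> list:
    """Link sequences whose pairwise distance is below the threshold and
    return the connected groups that contain at least two sequences."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


def mutate(seq: str, positions) -> str:
    """Toy helper: change the base at each given position to a different base."""
    bases = list(seq)
    for p in positions:
        bases[p] = "C" if bases[p] == "A" else "A"
    return "".join(bases)


# Hypothetical 200-bp aligned sequences (illustrative only).
base = "ACGT" * 50
toy_seqs = {
    "s1": base,
    "s2": mutate(base, [0]),               # 0.5% from s1: linked
    "s3": mutate(base, [10, 20, 30, 40]),  # 2.0% from s1: not linked
}
toy_clusters = find_clusters(toy_seqs)
print(toy_clusters)
```

    With real data the sequences would first need to be aligned, and a model-corrected distance would normally replace the raw p-distance.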

  3. Application and comparison of large-scale solution-based DNA capture-enrichment methods on ancient DNA

    PubMed Central

    Ávila-Arcos, María C.; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Moreno-Mayar, J. Víctor; Rasmussen, Morten; Fordyce, Sarah L.; Montiel, Rafael; Vielle-Calzada, Jean-Philippe; Willerslev, Eske; Gilbert, M. Thomas P.

    2011-01-01

    The development of second-generation sequencing technologies has greatly benefitted the field of ancient DNA (aDNA). Its application can be further exploited by the use of targeted capture-enrichment methods to overcome restrictions posed by low endogenous and contaminating DNA in ancient samples. We tested the performance of Agilent's SureSelect and Mycroarray's MySelect in-solution capture systems on Illumina sequencing libraries built from ancient maize to identify key factors influencing aDNA capture experiments. High levels of clonality as well as the presence of multiple-copy sequences in the capture targets led to biases in the data regardless of the capture method. Neither method consistently outperformed the other in terms of average target enrichment, and no obvious difference was observed either when two tiling designs were compared. In addition to demonstrating the plausibility of capturing aDNA from ancient plant material, our results also enable us to provide useful recommendations for those planning targeted-sequencing on aDNA. PMID:22355593

  4. E-RNAi: a web application for the multi-species design of RNAi reagents—2010 update

    PubMed Central

    Horn, Thomas; Boutros, Michael

    2010-01-01

    The design of RNA interference (RNAi) reagents is an essential step for performing loss-of-function studies in many experimental systems. The availability of sequenced and annotated genomes greatly facilitates RNAi experiments in an increasing number of organisms that were previously not genetically tractable. The E-RNAi web-service, accessible at http://www.e-rnai.org/, provides a computational resource for the optimized design and evaluation of RNAi reagents. The 2010 update of E-RNAi now covers 12 genomes, including Drosophila, Caenorhabditis elegans, human, emerging model organisms such as Schmidtea mediterranea and Acyrthosiphon pisum, as well as the medically relevant vectors Anopheles gambiae and Aedes aegypti. The web service calculates RNAi reagents based on the input of target sequences, sequence identifiers or by visual selection of target regions through a genome browser interface. It identifies optimized RNAi target-sites by ranking sequences according to their predicted specificity, efficiency and complexity. E-RNAi also facilitates the design of secondary RNAi reagents for validation experiments, evaluation of pooled siRNA reagents and batch design. Results are presented online, as a downloadable HTML report and as tab-delimited files. PMID:20444868

  5. NEBNext Direct: A Novel, Rapid, Hybridization-Based Approach for the Capture and Library Conversion of Genomic Regions of Interest.

    PubMed

    Emerman, Amy B; Bowman, Sarah K; Barry, Andrew; Henig, Noa; Patel, Kruti M; Gardner, Andrew F; Hendrickson, Cynthia L

    2017-07-05

    Next-generation sequencing (NGS) is a powerful tool for genomic studies, translational research, and clinical diagnostics that enables the detection of single nucleotide polymorphisms, insertions and deletions, copy number variations, and other genetic variations. Target enrichment technologies improve the efficiency of NGS by only sequencing regions of interest, which reduces sequencing costs while increasing coverage of the selected targets. Here we present NEBNext Direct®, a hybridization-based, target-enrichment approach that addresses many of the shortcomings of traditional target-enrichment methods. This approach features a simple, 7-hr workflow that uses enzymatic removal of off-target sequences to achieve a high specificity for regions of interest. Additionally, unique molecular identifiers are incorporated for the identification and filtering of PCR duplicates. The same protocol can be used across a wide range of input amounts, input types, and panel sizes, enabling NEBNext Direct to be broadly applicable across a wide variety of research and diagnostic needs. © 2017 by John Wiley & Sons, Inc.
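
    The use of unique molecular identifiers (UMIs) to filter PCR duplicates can be illustrated generically. The sketch below is not the NEBNext Direct analysis workflow: it assumes that reads sharing chromosome, start position and UMI are copies of one original molecule, keeps the first read of each such group, and uses hypothetical names and toy records throughout.

```python
from typing import NamedTuple


class Read(NamedTuple):
    chrom: str   # reference sequence the read aligned to
    start: int   # alignment start position
    umi: str     # unique molecular identifier attached before amplification
    name: str    # read name


def deduplicate(reads):
    """Keep the first read seen for each (chrom, start, umi) combination."""
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.start, read.umi)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Hypothetical aligned reads (illustrative only).
toy_reads = [
    Read("chr1", 100, "AACGT", "r1"),
    Read("chr1", 100, "AACGT", "r2"),  # same position and UMI: PCR duplicate
    Read("chr1", 100, "GGTCA", "r3"),  # same position, different UMI: kept
    Read("chr2", 500, "AACGT", "r4"),
]
toy_kept = deduplicate(toy_reads)
print([r.name for r in toy_kept])
```

    Without the UMI, reads r1 and r3 would be indistinguishable by position alone; the identifier is what allows distinct molecules that start at the same base to be retained.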

  6. Phylogenetic distribution of plant snoRNA families.

    PubMed

    Patra Bhattacharya, Deblina; Canzler, Sebastian; Kehr, Stephanie; Hertel, Jana; Grosse, Ivo; Stadler, Peter F

    2016-11-24

    Small nucleolar RNAs (snoRNAs) are one of the most ancient families amongst non-protein-coding RNAs. They are ubiquitous in Archaea and Eukarya but absent in bacteria. Their main function is to target chemical modifications of ribosomal RNAs. They fall into two classes, box C/D snoRNAs and box H/ACA snoRNAs, which are clearly distinguished by conserved sequence motifs and the type of chemical modification that they govern. Similarly to microRNAs, snoRNAs appear in distinct families of homologs that affect homologous targets. In animals, snoRNAs and their evolution have been studied in much detail. In plants, however, their evolution has attracted comparably little attention. In order to chart the phylogenetic distribution of individual snoRNA families in plants, we applied a sophisticated approach for identifying homologs of known plant snoRNAs across the plant kingdom. In response to the relatively fast evolution of snoRNAs, information on conserved sequence boxes, target sequences, and secondary structure is combined to identify additional snoRNAs. We identified 296 families of snoRNAs in 24 species and traced their evolution throughout the plant kingdom. Many of the plant snoRNA families comprise paralogs. We also found that targets are well-conserved for most snoRNA families. The sequence conservation of snoRNAs is sufficient to establish homologies between phyla. The degree of this conservation tapers off, however, between land plants and algae. Plant snoRNAs are frequently organized in highly conserved spatial clusters. As a resource for further investigations we provide carefully curated and annotated alignments for each snoRNA family under investigation.

  7. Rapid Identification and Differentiation of Trichophyton Species, Based on Sequence Polymorphisms of the Ribosomal Internal Transcribed Spacer Regions, by Rolling-Circle Amplification

    PubMed Central

    Kong, Fanrong; Tong, Zhongsheng; Chen, Xiaoyou; Sorrell, Tania; Wang, Bin; Wu, Qixuan; Ellis, David; Chen, Sharon

    2008-01-01

    DNA sequencing analyses have demonstrated relatively limited polymorphisms within the fungal internal transcribed spacer (ITS) regions among Trichophyton spp. We sequenced the ITS region (ITS1, 5.8S, and ITS2) for 42 dermatophytes belonging to seven species (Trichophyton rubrum, T. mentagrophytes, T. soudanense, T. tonsurans, Epidermophyton floccosum, Microsporum canis, and M. gypseum) and developed a novel padlock probe and rolling-circle amplification (RCA)-based method for identification of single nucleotide polymorphisms (SNPs) that could be exploited to differentiate between Trichophyton spp. Sequencing results demonstrated intraspecies genetic variation for T. tonsurans, T. mentagrophytes, and T. soudanense but not T. rubrum. Signature sets of SNPs between T. rubrum and T. soudanense (4-bp difference) and T. violaceum and T. soudanense (3-bp difference) were identified. The RCA assay correctly identified five Trichophyton species. Although the use of two “group-specific” probes targeting both the ITS1 and the ITS2 regions was required to identify T. soudanense, the other species were identified by single ITS1- or ITS2-targeted species-specific probes. There was good agreement between ITS sequencing and the RCA assay. Despite limited genetic variation between Trichophyton spp., the sensitive, specific RCA-based SNP detection assay showed potential as a simple, reproducible method for the rapid (2-h) identification of Trichophyton spp. PMID:18234865

  8. Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome.

    PubMed

    Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J

    2016-11-01

    Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11), 1119-1127.

  9. Identification of Distant Drug Off-Targets by Direct Superposition of Binding Pocket Surfaces

    PubMed Central

    Schumann, Marcel; Armen, Roger S.

    2013-01-01

    Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target (“distant off-targets”). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target (“distant off-target”). PMID:24391782

  10. Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

    DTIC Science & Technology

    2015-09-01

    assessed the specificity of mutation in Drosophila S2R+ cells. We generated a quantitative mutation reporter vector in which an sgRNA target sequence ...phosphatases (563 genes) in the Drosophila genome (Figure 4). 65 samples that displayed synthetic lethality (15 genes) or synthetic increases in viability...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . . Identified three hits (mRNA-Cap, Pitslre and CycT) that scored as

  11. Comprehensive Molecular Characterization of Urothelial Bladder Carcinoma

    PubMed Central

    2014-01-01

    Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. To date, no molecularly targeted agents have been approved for the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell cycle regulation, chromatin regulation, and kinase signaling pathways, as well as 9 genes not previously reported as significantly mutated in any cancer. RNA sequencing revealed four expression subtypes, two of which (papillary-like and basal/squamous-like) were also evident in miRNA sequencing and protein data. Whole-genome and RNA sequencing identified recurrent in-frame activating FGFR3-TACC3 fusions and expression or integration of several viruses (including HPV16) that are associated with gene inactivation. Our analyses identified potential therapeutic targets in 69% of the tumours, including 42% with targets in the PI3K/AKT/mTOR pathway and 45% with targets (including ERBB2) in the RTK/MAPK pathway. Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any common cancer studied to date, suggesting the future possibility of targeted therapy for chromatin abnormalities. PMID:24476821

  12. Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma

    PubMed Central

    Wrzeszczynski, Kazimierz O.; Frank, Mayu O.; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A.; Moore Vogel, Julia L.; Bruce, Jeffrey N.; Lassman, Andrew B.; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V.; Zody, Michael C.; Jobanputra, Vaidehi; Royyuru, Ajay K.

    2017-01-01

    Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. Results: More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. Conclusions: The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. ClinicalTrials.gov identifier: NCT02725684. PMID:28740869

  13. Quantitative phenotyping via deep barcode sequencing.

    PubMed

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
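
    The idea behind barcode sequencing (counting each strain's barcode in a control and a treated pool and comparing the two) can be sketched in a few lines. This is an illustrative sketch, not the published Bar-seq analysis routine: the function names, the pseudocount of 1, the absence of any depth normalization and the toy barcodes are all assumptions.

```python
import math
from collections import Counter


def count_barcodes(reads, known_barcodes):
    """Count reads that exactly match a known strain barcode."""
    return Counter(r for r in reads if r in known_barcodes)


def log2_ratios(control: Counter, treated: Counter, pseudocount: int = 1) -> dict:
    """Log2 ratio of treated to control counts for every observed barcode.

    A pseudocount avoids division by zero for barcodes that drop out.
    Negative values suggest the strain is depleted under treatment.
    """
    barcodes = set(control) | set(treated)
    return {
        bc: math.log2((treated[bc] + pseudocount) / (control[bc] + pseudocount))
        for bc in barcodes
    }


# Hypothetical barcodes and reads (illustrative only).
toy_barcodes = {"ACGTACGT", "TTGGAACC"}
toy_control_reads = ["ACGTACGT"] * 3 + ["TTGGAACC"] * 3 + ["NNNNNNNN"]
toy_treated_reads = ["ACGTACGT"] * 3

toy_ratios = log2_ratios(
    count_barcodes(toy_control_reads, toy_barcodes),
    count_barcodes(toy_treated_reads, toy_barcodes),
)
print(toy_ratios)
```

    A real analysis would normalize for sequencing depth between pools and tolerate sequencing errors in the barcode, both of which are omitted here for brevity.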

  14. Targeting the atypical chemokine receptor ACKR3/CXCR7 for the treatment of cancer and other diseases

    NASA Astrophysics Data System (ADS)

    Vestal, Richard D., Jr.

    One of the greatest challenges in fighting cancer is cell targeting and biomarker selection. The Atypical Chemokine Receptor ACKR3/CXCR7 is expressed on many cancer cell types, including breast cancer and glioblastoma, and binds the endogenous ligands SDF1/CXCL12 and ITAC/CXCL11. A 20 amino acid region of the ACKR3/CXCR7 N-terminus was synthesized and targeted with the NEB PhD-7 Phage Display Peptide Library. Twenty-nine phages were isolated and heptapeptide inserts sequenced; of these, 23 sequences were unique. A 3D molecular model was created for the ACKR3/CXCR7 N-terminus by mutating the corresponding region of the crystal structure of CXCR4 with bound SDF1/CXCL12. A ClustalW alignment was performed on each peptide sequence using the entire SDF1/CXCL12 sequence as the template. The 23 peptide sequences showed similarity to three distinct regions of the SDF1/CXCL12 molecule. A 3D molecular model was made for each of the phage peptide inserts to visually identify potential areas of steric interference of peptides that simulated CXCL12 regions not in contact with the receptor's N-terminus. An ELISA analysis of the relative binding affinity between the peptides identified 9 peptides with statistically significant results. The candidate pool of 9 peptides was further reduced to 3 peptides based on their affinity for the targeted N-terminus region peptide versus no target peptide present or a scrambled negative control peptide. The results clearly show the Phage Display protocol can be used to target a synthesized region of the ACKR3/CXCR7 N-terminus. The 3 peptides chosen, P20, P3, and P9, showed no effect on the viability or proliferation of MCF-7 and U87-MG cells upon exposure. Membrane binding, colocalization, and cellular uptake were confirmed by whole-cell ELISA and confocal microscopy. The recovered peptides did not activate the receptor as confirmed by a Beta-Arrestin recruitment assay. The data show that the peptide sequences recovered from the phage display protocol are viable candidates for targeting cancer cells and delivering material to them.

  15. Integrated DNA/RNA targeted genomic profiling of diffuse large B-cell lymphoma using a clinical assay.

    PubMed

    Intlekofer, Andrew M; Joffe, Erel; Batlevi, Connie L; Hilden, Patrick; He, Jie; Seshan, Venkatraman E; Zelenetz, Andrew D; Palomba, M Lia; Moskowitz, Craig H; Portlock, Carol; Straus, David J; Noy, Ariela; Horwitz, Steven M; Gerecitano, John F; Moskowitz, Alison; Hamlin, Paul; Matasar, Matthew J; Kumar, Anita; van den Brink, Marcel R; Knapp, Kristina M; Pichardo, Janine D; Nahas, Michelle K; Trabucco, Sally E; Mughal, Tariq; Copeland, Amanda R; Papaemmanuil, Elli; Moarii, Mathai; Levine, Ross L; Dogan, Ahmet; Miller, Vincent A; Younes, Anas

    2018-06-12

    We sought to define the genomic landscape of diffuse large B-cell lymphoma (DLBCL) by using formalin-fixed paraffin-embedded (FFPE) biopsy specimens. We used targeted sequencing of genes altered in hematologic malignancies, including DNA coding sequence for 405 genes, noncoding sequence for 31 genes, and RNA coding sequence for 265 genes (FoundationOne-Heme). Short variants, rearrangements, and copy number alterations were determined. We studied 198 samples (114 de novo, 58 previously treated, and 26 large-cell transformation from follicular lymphoma). Median number of genomic alterations (GAs) per case was 6, with 97% of patients harboring at least one alteration. Recurrent GAs were detected in genes with established roles in DLBCL pathogenesis (e.g. MYD88, CREBBP, CD79B, EZH2), as well as notable differences compared to prior studies such as inactivating mutations in TET2 (5%). Less common GAs identified potential targets for approved or investigational therapies, including BRAF, CD274 (PD-L1), IDH2, and JAK1/2. TP53 mutations were more frequently observed in relapsed/refractory DLBCL, and predicted for lack of response to first-line chemotherapy, identifying a subset of patients that could be prioritized for novel therapies. Overall, 90% (n = 169) of the patients harbored a GA which could be explored for therapeutic intervention, with 54% (n = 107) harboring more than one putative target.

  16. A transcriptome-wide, organ-specific regulatory map of Dendrobium officinale, an important traditional Chinese orchid herb

    PubMed Central

    Meng, Yijun; Yu, Dongliang; Xue, Jie; Lu, Jiangjie; Feng, Shangguo; Shen, Chenjia; Wang, Huizhong

    2016-01-01

    Dendrobium officinale is an important traditional Chinese herb. Here, we did a transcriptome-wide, organ-specific study on this valuable plant by combining RNA, small RNA (sRNA) and degradome sequencing. RNA sequencing of four organs (flower, root, leaf and stem) of Dendrobium officinale enabled us to obtain 536,558 assembled transcripts, from which 2,645, 256, 42 and 54 were identified to be highly expressed in the four organs respectively. Based on sRNA sequencing, 2,038, 2, 21 and 24 sRNAs were identified to be specifically accumulated in the four organs respectively. A total of 1,047 mature microRNA (miRNA) candidates were detected. Based on secondary structure predictions and sequencing, tens of potential miRNA precursors were identified from the assembled transcripts. Interestingly, phase-distributed sRNAs with degradome-based processing evidence were discovered on the long-stem structures of two precursors. Target identification was performed for the 1,047 miRNA candidates, resulting in the discovery of 1,257 miRNA-target pairs. Finally, some biologically meaningful subnetworks involving hormone signaling, development, secondary metabolism and Argonaute 1-related regulation were established. All of the sequencing data sets are available at NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/sra/). In summary, our study provides a valuable resource for in-depth molecular and functional studies on this important Chinese orchid herb. PMID:26732614

  17. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L. [Penfield, NY]; Strohsahl, Christopher M. [Saugerties, NY]

    2009-10-06

    Method of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  18. Method of identifying hairpin DNA probes by partial fold analysis

    DOEpatents

    Miller, Benjamin L.; Strohsahl, Christopher M.

    2008-10-28

    Methods of identifying molecular beacons in which a secondary structure prediction algorithm is employed to identify oligonucleotide sequences within a target gene having the requisite hairpin structure. Isolated oligonucleotides, molecular beacons prepared from those oligonucleotides, and their use are also disclosed.

  19. GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

    PubMed Central

    Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

    2014-01-01

    CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks. Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782

  20. Genome and transcriptome sequencing identifies breeding targets in the orphan crop tef (Eragrostis tef).

    PubMed

    Cannarozzi, Gina; Plaza-Wüthrich, Sonia; Esfeld, Korinna; Larti, Stéphanie; Wilson, Yi Song; Girma, Dejene; de Castro, Edouard; Chanyalew, Solomon; Blösch, Regula; Farinelli, Laurent; Lyons, Eric; Schneider, Michel; Falquet, Laurent; Kuhlemeier, Cris; Assefa, Kebebew; Tadele, Zerihun

    2014-07-09

    Tef (Eragrostis tef), an indigenous cereal critical to food security in the Horn of Africa, is rich in minerals and protein, resistant to many biotic and abiotic stresses and safe for diabetics as well as sufferers of immune reactions to wheat gluten. We present the genome of tef, the first species in the grass subfamily Chloridoideae and the first allotetraploid assembled de novo. We sequenced the tef genome for marker-assisted breeding, to shed light on the molecular mechanisms conferring tef's desirable nutritional and agronomic properties, and to make its genome publicly available as a community resource. The draft genome contains 672 Mbp representing 87% of the genome size estimated from flow cytometry. We also sequenced two transcriptomes, one from a normalized RNA library and another from unnormalized RNASeq data. The normalized RNA library revealed around 38000 transcripts that were then annotated by the SwissProt group. The CoGe comparative genomics platform was used to compare the tef genome to other genomes, notably sorghum. Scaffolds comprising approximately half of the genome size were ordered by syntenic alignment to sorghum producing tef pseudo-chromosomes, which were sorted into A and B genomes as well as compared to the genetic map of tef. The draft genome was used to identify novel SSR markers, investigate target genes for abiotic stress resistance studies, and understand the evolution of the prolamin family of proteins that are responsible for the immune response to gluten. It is highly plausible that breeding targets previously identified in other cereal crops will also be valuable breeding targets in tef. The draft genome and transcriptome will be of great use for identifying these targets for genetic improvement of this orphan crop that is vital for feeding 50 million people in the Horn of Africa.

  1. Designing highly active siRNAs for therapeutic applications.

    PubMed

    Walton, S Patrick; Wu, Ming; Gredell, Joseph A; Chan, Christina

    2010-12-01

    The discovery of RNA interference (RNAi) generated considerable interest in developing short interfering RNAs (siRNAs) for understanding basic biology and as the active agents in a new variety of therapeutics. Early studies showed that selecting an active siRNA was not as straightforward as simply picking a sequence on the target mRNA and synthesizing the siRNA complementary to that sequence. As interest in applying RNAi has increased, the methods for identifying active siRNA sequences have evolved from focusing on the simplicity of synthesis and purification, to identifying preferred target sequences and secondary structures, to predicting the thermodynamic stability of the siRNA. As more specific details of the RNAi mechanism have been defined, these have been incorporated into more complex siRNA selection algorithms, increasing the reliability of selecting active siRNAs against a single target. Ultimately, design of the best siRNA therapeutics will require design of the siRNA itself, in addition to design of the vehicle and other components necessary for it to function in vivo. In this minireview, we summarize the evolution of siRNA selection techniques with a particular focus on one issue of current importance to the field, how best to identify those siRNA sequences likely to have high activity. Approaches to designing active siRNAs through chemical and structural modifications will also be highlighted. As the understanding of how to control the activity and specificity of siRNAs improves, the potential utility of siRNAs as human therapeutics will concomitantly grow.

  2. Implementing Genome-Driven Oncology

    PubMed Central

    Hyman, David M.; Taylor, Barry S.; Baselga, José

    2017-01-01

    Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282

  3. Next-generation sequencing using a pre-designed gene panel for the molecular diagnosis of congenital disorders in pediatric patients.

    PubMed

    Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo

    2015-12-14

    Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this Haloplex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. 
    However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome sequencing for patients whose features are suggestive of a genetic etiology involving one of the genes in the panel.

  4. High-throughput sequencing of small RNAs and analysis of differentially expressed microRNAs associated with pistil development in Japanese apricot

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) are a class of endogenous, small, non-coding RNAs that regulate gene expression by mediating gene silencing at transcriptional and post-transcriptional levels in higher plants. However, the diversity of miRNAs and their roles in floral development in Japanese apricot (Prunus mume Sieb. et Zucc) remain largely unexplored. Imperfect flowers with pistil abortion seriously decrease production yields. To understand the role of miRNAs in pistil development, pistil development-related miRNAs were identified by Solexa sequencing in Japanese apricot. Results: Solexa sequencing was used to identify and quantitatively profile small RNAs from perfect and imperfect flower buds of Japanese apricot. A total of 22,561,972 and 24,952,690 reads were sequenced from two small RNA libraries constructed from perfect and imperfect flower buds, respectively. Sixty-one known miRNAs, belonging to 24 families, were identified. Comparative profiling revealed that seven known miRNAs exhibited significant differential expression between perfect and imperfect flower buds. A total of 61 potentially novel miRNAs/new members of known miRNA families were also identified by the presence of mature miRNAs and corresponding miRNA*s in the sRNA libraries. Comparative analysis showed that six potentially novel miRNAs were differentially expressed between perfect and imperfect flower buds. Target predictions of the 13 differentially expressed miRNAs resulted in 212 target genes. Gene ontology (GO) annotation revealed that high-ranking miRNA target genes are those implicated in the developmental process, the regulation of transcription and response to stress. Conclusions: This study represents the first comparative identification of miRNAomes between perfect and imperfect Japanese apricot flowers. Seven known miRNAs and six potentially novel miRNAs associated with pistil development were identified, using high-throughput sequencing of small RNAs. The findings, both computational and experimental, provide valuable information for further functional characterisation of miRNAs associated with pistil development in plants. PMID:22863067

  5. Bioinformatic Identification of Potential MicroRNAs and Their Targets in the Lingzhi or Reishi Medicinal Mushroom Ganoderma lucidum (Higher Basidiomycetes).

    PubMed

    Mu, Da-Shuai; Li, Chenyang; Shi, Liang; Zhang, Xuchen; Ren, Ang; Zhao, Ming-Wen

    2015-01-01

    MicroRNAs (miRNAs) are a class of small, endogenous, noncoding RNA molecules that negatively regulate gene expression at the transcriptional or the post-transcriptional level. Although a large number of miRNAs have been identified in many species, especially model plants and animals, miRNAs in fungi remain largely unknown. In this study, based on a database of expressed sequence tags in Ganoderma lucidum, 89 potential miRNAs were identified using computational methods. Real-time polymerase chain reaction analysis of miRNA-like samples prepared from G. lucidum at different development stages revealed that miRNA-like RNAs were differentially expressed in different stages. Furthermore, a total of 28 potential targets were found based on near-perfect or perfect complementarity between the randomly selected 9 miRNA-like RNAs and the target sequences, and potential targets for G. lucidum miRNA-like RNAs were predicted. Finally, we studied the expression pattern of 4 target genes in 3 different development stages of G. lucidum to further understand the mechanism of interaction between miRNA-like RNAs and their target genes. Our analysis paves the way toward identifying fungal miRNA-like RNAs that might be involved in various physiological and cellular differentiation processes.

  6. A comparative analysis of exome capture.

    PubMed

    Parla, Jennifer S; Iossifov, Ivan; Grabill, Ian; Spector, Mona S; Kramer, Melissa; McCombie, W Richard

    2011-09-29

    Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data. Each exome kit performed well at capturing the targets they were designed to capture, which mainly corresponds to the consensus coding sequences (CCDS) annotations of the human genome. In addition, based on their respective targets, each capture kit coupled with high coverage Illumina sequencing produced highly accurate nucleotide calls. However, other databases, such as the Reference Sequence collection (RefSeq), define the exome more broadly, and so not surprisingly, the exome kits did not capture these additional regions. Commercial exome capture kits provide a very efficient way to sequence select areas of the genome at very high accuracy. Here we provide the data to help guide critical analyses of sequencing data derived from these products.

  7. An analysis of possible off target effects following CAS9/CRISPR targeted deletions of neuropeptide gene enhancers from the mouse genome.

    PubMed

    Hay, Elizabeth Anne; Khalaf, Abdulla Razak; Marini, Pietro; Brown, Andrew; Heath, Karyn; Sheppard, Darrin; MacKenzie, Alasdair

    2017-08-01

    We have successfully used comparative genomics to identify putative regulatory elements within the human genome that contribute to the tissue specific expression of neuropeptides such as galanin and receptors such as CB1. However, a previous inability to rapidly delete these elements from the mouse genome has prevented optimal assessment of their function in-vivo. This has been solved using CAS9/CRISPR genome editing technology which uses a bacterial endonuclease called CAS9 that, in combination with specifically designed guide RNA (gRNA) molecules, cuts specific regions of the mouse genome. However, reports of "off target" effects, whereby the CAS9 endonuclease is able to cut sites other than those targeted, limit the appeal of this technology. We used cytoplasmic microinjection of gRNA and CAS9 mRNA into 1-cell mouse embryos to rapidly generate enhancer knockout mouse lines. The current study describes our analysis of the genomes of these enhancer knockout lines to detect possible off-target effects. Bioinformatic analysis was used to identify the most likely putative off-target sites and to design PCR primers that would amplify these sequences from genomic DNA of founder enhancer deletion mouse lines. Amplified DNA was then sequenced and blasted against the mouse genome sequence to detect off-target effects. Using this approach we were unable to detect any evidence of off-target effects in the genomes of three founder lines using any of the four gRNAs used in the analysis. This study suggests that the problem of off-target effects in transgenic mice has been exaggerated and that CAS9/CRISPR represents a highly effective and accurate method of deleting putative neuropeptide gene enhancer sequences from the mouse genome.

  8. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    PubMed Central

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547

  9. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    PubMed

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH-affected patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon 20 of the USH2A gene in the F1 family. Two compound heterozygous mutations, IVS47 + 1G > A and c.13156A > T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A > T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  10. Identification of microRNAs and their targets in Finger millet by high throughput sequencing.

    PubMed

    Usha, S; Jyothi, M N; Sharadamma, N; Dixit, Rekha; Devaraj, V R; Nagesh Babu, R

    2015-12-15

    MicroRNAs are short non-coding RNAs which play an important role in regulating gene expression by mRNA cleavage or by translational repression. The majority of identified miRNAs were evolutionarily conserved; however, others are expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information is available on its microRNAs to date. In this study, we have identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high throughput sequencing. For the identified conserved and novel miRNAs a total of 507 targets were predicted. Eleven miRNAs were validated, and tissue specificity was determined by stem-loop RT-qPCR and Northern blot. GO analyses revealed that miRNA targets were involved in a wide range of regulatory functions. This study indicates that a large number of known and novel miRNAs are found in Finger millet and may play an important role in growth and development.

  11. BLAST and FASTA similarity searching for multiple sequence alignment.

    PubMed

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry (homology). The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.

  12. Exome Sequencing Identifies Potentially Druggable Mutations in Nasopharyngeal Carcinoma.

    PubMed

    Chow, Yock Ping; Tan, Lu Ping; Chai, San Jiun; Abdul Aziz, Norazlin; Choo, Siew Woh; Lim, Paul Vey Hong; Pathmanathan, Rajadurai; Mohd Kornain, Noor Kaslina; Lum, Chee Lun; Pua, Kin Choo; Yap, Yoke Yeow; Tan, Tee Yong; Teo, Soo Hwang; Khoo, Alan Soo-Beng; Patel, Vyomesh

    2017-03-03

    In this study, we first performed whole exome sequencing of DNA from 10 untreated and clinically annotated fresh frozen nasopharyngeal carcinoma (NPC) biopsies and matched bloods to identify somatically mutated genes that may be amenable to targeted therapeutic strategies. We identified a total of 323 mutations which were either non-synonymous (n = 238) or synonymous (n = 85). Furthermore, our analysis revealed genes in key cancer pathways (DNA repair, cell cycle regulation, apoptosis, immune response, lipid signaling) were mutated, of which those in the lipid-signaling pathway were the most enriched. We next extended our analysis on a prioritized sub-set of 37 mutated genes plus top 5 mutated cancer genes listed in COSMIC using a custom designed HaloPlex target enrichment panel with an additional 88 NPC samples. Our analysis identified 160 additional non-synonymous mutations in 37/42 genes in 66/88 samples. Of these, 99/160 mutations within potentially druggable pathways were further selected for validation. Sanger sequencing revealed that 77/99 variants were true positives, giving an accuracy of 78%. Taken together, our study indicated that ~72% (n = 71/98) of NPC samples harbored mutations in one of the four cancer pathways (EGFR-PI3K-Akt-mTOR, NOTCH, NF-κB, DNA repair) which may be potentially useful as predictive biomarkers of response to matched targeted therapies.

  13. Exome Sequencing Identifies Potentially Druggable Mutations in Nasopharyngeal Carcinoma

    PubMed Central

    Chow, Yock Ping; Tan, Lu Ping; Chai, San Jiun; Abdul Aziz, Norazlin; Choo, Siew Woh; Lim, Paul Vey Hong; Pathmanathan, Rajadurai; Mohd Kornain, Noor Kaslina; Lum, Chee Lun; Pua, Kin Choo; Yap, Yoke Yeow; Tan, Tee Yong; Teo, Soo Hwang; Khoo, Alan Soo-Beng; Patel, Vyomesh

    2017-01-01

    In this study, we first performed whole exome sequencing of DNA from 10 untreated and clinically annotated fresh frozen nasopharyngeal carcinoma (NPC) biopsies and matched bloods to identify somatically mutated genes that may be amenable to targeted therapeutic strategies. We identified a total of 323 mutations which were either non-synonymous (n = 238) or synonymous (n = 85). Furthermore, our analysis revealed genes in key cancer pathways (DNA repair, cell cycle regulation, apoptosis, immune response, lipid signaling) were mutated, of which those in the lipid-signaling pathway were the most enriched. We next extended our analysis on a prioritized sub-set of 37 mutated genes plus top 5 mutated cancer genes listed in COSMIC using a custom designed HaloPlex target enrichment panel with an additional 88 NPC samples. Our analysis identified 160 additional non-synonymous mutations in 37/42 genes in 66/88 samples. Of these, 99/160 mutations within potentially druggable pathways were further selected for validation. Sanger sequencing revealed that 77/99 variants were true positives, giving an accuracy of 78%. Taken together, our study indicated that ~72% (n = 71/98) of NPC samples harbored mutations in one of the four cancer pathways (EGFR-PI3K-Akt-mTOR, NOTCH, NF-κB, DNA repair) which may be potentially useful as predictive biomarkers of response to matched targeted therapies. PMID:28256603

  14. Actionable mutations in canine hemangiosarcoma

    PubMed Central

    Wang, Guannan; Wu, Ming; Maloneyhuss, Martha A.; Wojcik, John; Durham, Amy C.; Mason, Nicola J.

    2017-01-01

    Background Angiosarcomas (AS) are rare in humans, but they are a deadly subtype of soft tissue sarcoma. Discovery sequencing in AS, especially the visceral form, is hampered by the rarity of cases. Most diagnostic material exists as archival formalin fixed, paraffin embedded tissue which serves as a poor source of high quality DNA for genome-wide sequencing. We approached this problem through comparative genomics. We hypothesized that exome sequencing a histologically similar tumor, hemangiosarcoma (HSA), that occurs in approximately 50,000 dogs per year, may lead to the identification of potential oncogenic drivers and druggable targets that could also occur in angiosarcoma. Methods Splenic hemangiosarcomas are common in dogs, which allowed us to collect a cohort of archived matched tumor and normal tissue samples suitable for whole exome sequencing. Mapping of the reads to the latest canine reference genome (Canfam3) demonstrated that >99% of the targeted exomal regions were covered, with >80% at 20X coverage and >90% at 10X coverage. Results and conclusions Sequence analysis of 20 samples identified somatic mutations in PIK3CA, TP53, PTEN, and PLCG1, all of which correspond to well-known tumor drivers in human cancer, in more than half of the cases. In one case, we identified a mutation in PLCG1 identical to a mutation observed previously in this gene in human visceral AS. Activating PIK3CA mutations present novel therapeutic targets, and clinical trials of targeted inhibitors are underway in human cancers. Our results lay a foundation for similar clinical trials in canine HSA, enabling a precision medicine approach to this disease. PMID:29190660

  15. Actionable mutations in canine hemangiosarcoma.

    PubMed

    Wang, Guannan; Wu, Ming; Maloneyhuss, Martha A; Wojcik, John; Durham, Amy C; Mason, Nicola J; Roth, David B

    2017-01-01

    Angiosarcomas (AS) are rare in humans, but they are a deadly subtype of soft tissue sarcoma. Discovery sequencing in AS, especially the visceral form, is hampered by the rarity of cases. Most diagnostic material exists as archival formalin fixed, paraffin embedded tissue which serves as a poor source of high quality DNA for genome-wide sequencing. We approached this problem through comparative genomics. We hypothesized that exome sequencing a histologically similar tumor, hemangiosarcoma (HSA), that occurs in approximately 50,000 dogs per year, may lead to the identification of potential oncogenic drivers and druggable targets that could also occur in angiosarcoma. Splenic hemangiosarcomas are common in dogs, which allowed us to collect a cohort of archived matched tumor and normal tissue samples suitable for whole exome sequencing. Mapping of the reads to the latest canine reference genome (Canfam3) demonstrated that >99% of the targeted exomal regions were covered, with >80% at 20X coverage and >90% at 10X coverage. Sequence analysis of 20 samples identified somatic mutations in PIK3CA, TP53, PTEN, and PLCG1, all of which correspond to well-known tumor drivers in human cancer, in more than half of the cases. In one case, we identified a mutation in PLCG1 identical to a mutation observed previously in this gene in human visceral AS. Activating PIK3CA mutations present novel therapeutic targets, and clinical trials of targeted inhibitors are underway in human cancers. Our results lay a foundation for similar clinical trials in canine HSA, enabling a precision medicine approach to this disease.

  16. Exome-wide DNA capture and next generation sequencing in domestic and wild species.

    PubMed

    Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

    2011-07-05

    Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.

  17. Parallel analysis of RNA ends enhances global investigation of microRNAs and target RNAs of Brachypodium distachyon

    PubMed Central

    2013-01-01

    Background The wild grass Brachypodium distachyon has emerged as a model system for temperate grasses and biofuel plants. However, the global analysis of miRNAs, molecules known to be key for eukaryotic gene regulation, has been limited in B. distachyon to studies examining a few samples or that rely on computational predictions. Similarly an in-depth global analysis of miRNA-mediated target cleavage using parallel analysis of RNA ends (PARE) data is lacking in B. distachyon. Results B. distachyon small RNAs were cloned and deeply sequenced from 17 libraries that represent different tissues and stresses. Using a computational pipeline, we identified 116 miRNAs including not only conserved miRNAs that have not been reported in B. distachyon, but also non-conserved miRNAs that were not found in other plants. To investigate miRNA-mediated cleavage function, four PARE libraries were constructed from key tissues and sequenced to a total depth of approximately 70 million sequences. The roughly 5 million distinct genome-matched sequences that resulted represent an extensive dataset for analyzing small RNA-guided cleavage events. Analysis of the PARE and miRNA data provided experimental evidence for miRNA-mediated cleavage of 264 sites in predicted miRNA targets. In addition, PARE analysis revealed that differentially expressed miRNAs in the same family guide specific target RNA cleavage in a correspondingly tissue-preferential manner. Conclusions B. distachyon miRNAs and target RNAs were experimentally identified and analyzed. Knowledge gained from this study should provide insights into the roles of miRNAs and the regulation of their targets in B. distachyon and related plants. PMID:24367943

  18. An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1

    PubMed Central

    2010-01-01

    Background HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student t-test and Bonferroni Dunn correction test were used to assess statistical significance of our findings. Results Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading to RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related with higher silencing efficiency were hardly accomplished in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
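
    The general idea of scanning a multiple sequence alignment for the dominant variant in each region, and measuring what fraction of sequences that variant matches, can be sketched as follows. This is not the authors' Bioperl algorithm, whose details the abstract does not give; it is a minimal Python illustration in which the function name, the window size, and the toy alignment are all assumptions, and it ignores free energy and silencing-efficiency scoring.

```python
from collections import Counter


def dominant_variants(alignment, window):
    """For each window start, return (start, most common variant, coverage).

    Coverage is the fraction of aligned sequences that carry the most
    common variant exactly within that window.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    results = []
    for start in range(length - window + 1):
        counts = Counter(seq[start:start + window] for seq in alignment)
        variant, n = counts.most_common(1)[0]
        results.append((start, variant, n / len(alignment)))
    return results


# Toy alignment (hypothetical data, not HIV-1 sequences).
toy_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGTTCGT",
    "ACGTACGT",
]

for start, variant, coverage in dominant_variants(toy_alignment, window=4):
    print(start, variant, f"{coverage:.2f}")
```

    In this toy case the first window is fully conserved, while later windows are matched by only a fraction of the sequences, which is the kind of ceiling the abstract reports (a maximum of 85% of viral sequences for the most frequent shRNA per region).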

  19. Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.

    PubMed

    Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B

    2017-08-01

    To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.

  20. Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong

    2010-02-19

    Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.

  1. Identification of microRNAs in Caragana intermedia by high-throughput sequencing and expression analysis of 12 microRNAs and their targets under salt stress.

    PubMed

    Zhu, Jianfeng; Li, Wanfeng; Yang, Wenhua; Qi, Liwang; Han, Suying

    2013-09-01

    142 miRNAs were identified and 38 miRNA targets were predicted, 4 of which were validated, in C. intermedia. The expression of 12 miRNAs in salt-stressed leaves was assessed by qRT-PCR. MicroRNAs (miRNAs) are endogenous small RNAs that play important roles in various biological and metabolic processes in plants. Caragana intermedia is an important ecological and economic tree species prominent in the desert environment of west and northwest China. To date, no investigation into C. intermedia miRNAs has been reported. In this study, high-throughput sequencing of small RNAs and analysis of transcriptome data were performed to identify both conserved and novel miRNAs, and also their target mRNA genes in C. intermedia. Based on sequence similarity and hairpin structure prediction, 132 putative conserved miRNAs (12 of which were confirmed to form hairpin precursors) belonging to 31 known miRNA families were identified. Ten novel miRNAs (including the miRNA* sequences of three novel miRNAs) were also discovered. Furthermore, 36 potential target genes of 17 known miRNA families and 2 potential target genes of 1 novel miRNA were predicted; 4 of these were validated by 5' RACE. The expression of 12 miRNAs was validated in different tissues, and these and five target mRNAs were assessed by qRT-PCR after salt treatment. The expression levels of seven miRNAs (cin-miR157a, cin-miR159a, cin-miR165a, cin-miR167b, cin-miR172b, cin-miR390a and cin-miR396a) were upregulated, while cin-miR398a expression was downregulated after salt treatment. The targets of cin-miR157a, cin-miR165a, cin-miR172b and cin-miR396a were downregulated and showed an approximately negative correlation with their corresponding miRNAs under salt treatment. These results would help further understanding of miRNA regulation in response to abiotic stress in C. intermedia.

  2. A resource for characterizing genome-wide binding and putative target genes of transcription factors expressed during secondary growth and wood formation in Populus

    Treesearch

    Lijun Liu; Trevor Ramsay; Matthew S. Zinkgraf; David Sundell; Nathaniel Robert Street; Vladimir Filkov; Andrew Groover

    2015-01-01

    Identifying transcription factor target genes is essential for modeling the transcriptional networks underlying developmental processes. Here we report a chromatin immunoprecipitation sequencing (ChIP-seq) resource consisting of genome-wide binding regions and associated putative target genes for four Populus homeodomain transcription factors...

  3. Mapping of RNA accessible sites by extension of random oligonucleotide libraries with reverse transcriptase.

    PubMed Central

    Allawi, H T; Dong, F; Ip, H S; Neri, B P; Lyamichev, V I

    2001-01-01

    A rapid and simple method for determining accessible sites in RNA that is independent of the length of target RNA and does not require RNA labeling is described. In this method, target RNA is allowed to hybridize with sequence-randomized libraries of DNA oligonucleotides linked to a common tag sequence at their 5'-end. Annealed oligonucleotides are extended with reverse transcriptase and the extended products are then amplified by using PCR with a primer corresponding to the tag sequence and a second primer specific to the target RNA sequence. We used the combination of both the lengths of the RT-PCR products and the location of the binding site of the RNA-specific primer to determine which regions of the RNA molecules were RNA extendible sites, that is, sites available for oligonucleotide binding and extension. We then employed this reverse transcription with the random oligonucleotide libraries (RT-ROL) method to determine the accessible sites on four mRNA targets, human activated ras (ha-ras), human intercellular adhesion molecule-1 (ICAM-1), rabbit beta-globin, and human interferon-gamma (IFN-gamma). Our results were concordant with those of other researchers who had used RNase H cleavage or hybridization with arrays of oligonucleotides to identify accessible sites on some of these targets. Further, we found good correlation between sites when we compared the location of extendible sites identified by RT-ROL with hybridization sites of effective antisense oligonucleotides on ICAM-1 mRNA in antisense inhibition studies. Finally, we discuss the relationship between RNA extendible sites and RNA accessibility. PMID:11233988

  4. Insilico profiling of microRNAs in Korean ginseng (Panax ginseng Meyer)

    PubMed Central

    Mathiyalagan, Ramya; Subramaniyam, Sathiyamoorthy; Natarajan, Sathishkumar; Kim, Yeon Ju; Sun, Myung Suk; Kim, Se Young; Kim, Yu-Jin; Yang, Deok Chun

    2013-01-01

    MicroRNAs (miRNAs) are a class of recently discovered non-coding small RNA molecules, on average approximately 21 nucleotides in length, which underlie numerous important biological roles in gene regulation in various organisms. The miRNA database (release 18) has 18,226 miRNAs, which have been deposited from different species. Although miRNAs have been identified and validated in many plant species, no studies have been reported on discovering miRNAs in Panax ginseng Meyer, which is a traditionally known medicinal plant in oriental medicine, also known as Korean ginseng. It has triterpene ginseng saponins called ginsenosides, which are responsible for its various pharmacological activities. Predicting conserved miRNAs by homology-based analysis with available expressed sequence tag (EST) sequences can be powerful, if the species lacks whole genome sequence information. In this study, by using the EST-based computational approach, 69 conserved miRNAs belonging to 44 miRNA families were identified in Korean ginseng. The digital gene expression patterns of predicted conserved miRNAs were analyzed by deep sequencing using small RNA sequences of flower buds, leaves, and lateral roots. We have found that many of the identified miRNAs showed tissue-specific expression. Using the in silico method, 346 potential targets were identified for the predicted 69 conserved miRNAs by searching the ginseng EST database, and the predicted targets were mainly involved in secondary metabolic processes, responses to biotic and abiotic stress, and transcription regulator activities, as well as a variety of other metabolic processes. PMID:23717176

  5. Novel myosin mutations for hereditary hearing loss revealed by targeted genomic capture and massively parallel sequencing

    PubMed Central

    Brownstein, Zippora; Abu-Rayyan, Amal; Karfunkel-Doron, Daphne; Sirigu, Serena; Davidov, Bella; Shohat, Mordechai; Frydman, Moshe; Houdusse, Anne; Kanaan, Moien; Avraham, Karen B

    2014-01-01

    Hereditary hearing loss is genetically heterogeneous, with a large number of genes and mutations contributing to this sensory, often monogenic, disease. This number, as well as large size, precludes comprehensive genetic diagnosis of all known deafness genes. A combination of targeted genomic capture and massively parallel sequencing (MPS), also referred to as next-generation sequencing, was applied to determine the deafness-causing genes in hearing-impaired individuals from Israeli Jewish and Palestinian Arab families. Among the mutations detected, we identified nine novel mutations in the genes encoding myosin VI, myosin VIIA and myosin XVA, doubling the number of myosin mutations in the Middle East. Myosin VI mutations were identified in this population for the first time. Modeling of the mutations provided predicted mechanisms for the damage they inflict in the molecular motors, leading to impaired function and thus deafness. The myosin mutations span all regions of these molecular motors, leading to a wide range of hearing phenotypes, reinforcing the key role of this family of proteins in auditory function. This study demonstrates that multiple mutations responsible for hearing loss can be identified in a relatively straightforward manner by targeted-gene MPS technology and concludes that this is the optimal genetic diagnostic approach for identification of mutations responsible for hearing loss. PMID:24105371

  6. PCR Primers for Metazoan Nuclear 18S and 28S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Knowlton, Nancy

    2012-01-01

    Background Metagenetic analyses, which amplify and sequence target marker DNA regions from environmental samples, are increasingly employed to assess the biodiversity of communities of small organisms. Using this approach, our understanding of microbial diversity has expanded greatly. In contrast, only a few studies using this approach to characterize metazoan diversity have been reported, despite the fact that many metazoan species are small and difficult to identify or are undescribed. One of the reasons for this discrepancy is the availability of universal primers for the target taxa. In microbial studies, analysis of the 16S ribosomal DNA is standard. In contrast, the best gene for metazoan metagenetics is less clear. In the present study, we have designed primers that amplify the nuclear 18S and 28S ribosomal DNA sequences of most metazoan species with the goal of providing effective approaches for metagenetic analyses of metazoan diversity in environmental samples, with a particular emphasis on marine biodiversity. Methodology/Principal Findings Conserved regions suitable for designing PCR primers were identified using 14,503 and 1,072 metazoan sequences of the nuclear 18S and 28S rDNA regions, respectively. The sequence similarity of both these newly designed and the previously reported primers to the target regions of these primers were compared for each phylum to determine the expected amplification efficacy. The nucleotide diversity of the flanking regions of the primers was also estimated for genera or higher taxonomic groups of 11 phyla to determine the variable regions within the genes. Conclusions/Significance The identified nuclear ribosomal DNA primers (five primer pairs for 18S and eleven for 28S) and the results of the nucleotide diversity analyses provide options for primer combinations for metazoan metagenetic analyses. Additionally, advantages and disadvantages of not only the 18S and 28S ribosomal DNA, but also other marker regions as targets for metazoan metagenetic analyses, are discussed. PMID:23049971
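
    Nucleotide diversity, as mentioned in this abstract, is commonly defined as the average proportion of sites that differ between pairs of aligned sequences. The abstract does not say which estimator or software the authors used, so the following is only a minimal Python sketch of that textbook definition; the function name and the toy sequences are assumptions, and gaps or ambiguous bases are not handled.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise proportion of differing sites over aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy flanking-region alignment (hypothetical data).
toy_region = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_region)
print(f"{pi:.4f}")
```

    A higher value for a flanking region would indicate a more variable stretch, which is the property the abstract uses to locate variable regions within the genes.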

  7. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    PubMed

    Hsu, Ju-Chun; Chien, Ting-Ying; Hu, Chia-Cheng; Chen, Mei-Ju May; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to characterize putative polypeptide translational products and associate them with specific genes and protein functions.

  8. The molecular genetic makeup of acute lymphoblastic leukemia.

    PubMed

    Mullighan, Charles G

    2012-01-01

    Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention. Mutations in genes regulating lymphoid development are a hallmark of ALL, and alterations of the lymphoid transcription factor gene IKZF1 (IKAROS) are associated with a high risk of treatment failure in B-ALL. Approximately 20% of B-ALL cases harbor genetic alterations that activate kinase signaling that may be amenable to treatment with tyrosine kinase inhibitors, including rearrangements of the cytokine receptor gene CRLF2; rearrangements of ABL1, JAK2, and PDGFRB; and mutations of JAK1 and JAK2. Whole-genome sequencing has also identified novel targets of mutation in aggressive T-lineage ALL, including hematopoietic regulators (ETV6 and RUNX1), tyrosine kinases, and epigenetic regulators. Challenges for the future are to comprehensively identify and experimentally validate all genetic alterations driving leukemogenesis and treatment failure in childhood and adult ALL and to implement genomic profiling into the clinical setting to guide risk stratification and targeted therapy.

  9. Prescreening of microbial populations for the assessment of sequencing potential.

    PubMed

    Hanning, Irene B; Ricke, Steven C

    2011-01-01

    Next-generation sequencing (NGS) is a powerful tool that can be utilized to profile and compare microbial populations. By amplifying a target gene present in all bacteria and subsequently sequencing amplicons, the bacterial genera present in the populations can be identified and compared. In some scenarios, little to no difference may exist among microbial populations being compared, in which case a prescreening method would be practical to determine which microbial populations would be suitable for further analysis by NGS. Denaturing gradient gel electrophoresis (DGGE) is cheaper than NGS and the data comparing microbial populations are ready to be viewed immediately after electrophoresis. DGGE follows essentially the same initial methodology as NGS by targeting and amplifying the 16S rRNA gene. However, as opposed to sequencing amplicons, DGGE amplicons are analyzed by electrophoresis. By prescreening microbial populations with DGGE, more efficient use of NGS methods can be accomplished. In this chapter, we outline the protocol for DGGE targeting the same gene (16S rRNA) that would be targeted for NGS to compare and determine differences in microbial populations from a wide range of ecosystems.

  10. Fluorescence self-quenching assay for the detection of target collagen sequences using a short probe peptide.

    PubMed

    Nian, Linge; Hu, Yue; Fu, Caihong; Song, Chen; Wang, Jie; Xiao, Jianxi

    2018-01-01

    The development of novel assays to detect collagen fragments is of utmost importance for diagnostic, prognostic and therapeutic decisions in various collagen-related diseases, and one essential question is to discover probe peptides that can specifically recognize target collagen sequences. Herein we have developed the fluorescence self-quenching assay as a convenient tool to screen the capability of a series of fluorescent probe peptides of variable lengths to bind with target collagen peptides. We have revealed that the targeting ability of probe peptides is length-dependent, and have discovered a relatively short probe peptide, FAM-G(POG)8, capable of identifying the target peptide. We have further demonstrated that the fluorescence self-quenching assay together with this short probe peptide can be applied to specifically detect the desired collagen fragment in complex biological media. The fluorescence self-quenching assay provides a powerful new tool to discover effective peptides for the recognition of collagen biomarkers, and it may have great potential to identify probe peptides for various protein biomarkers involved in pathological conditions.

  11. Applying Unique Molecular Identifiers in Next Generation Sequencing Reveals a Constrained Viral Quasispecies Evolution under Cross-Reactive Antibody Pressure Targeting Long Alpha Helix of Hemagglutinin

    PubMed Central

    Hauck, Nastasja C.; Kirpach, Josiane; Kiefer, Christina; Farinelle, Sophie; Morris, Stephen A.; Muller, Claude P.; Lu, I-Na

    2018-01-01

    To overcome yearly efforts and costs for the production of seasonal influenza vaccines, new approaches for the induction of broadly protective and long-lasting immune responses have been developed in the past decade. To warrant safety and efficacy of the emerging crossreactive vaccine candidates, it is critical to understand the evolution of influenza viruses in response to these new immune pressures. Here we applied unique molecular identifiers in next generation sequencing to analyze the evolution of influenza quasispecies under in vivo antibody pressure targeting the hemagglutinin (HA) long alpha helix (LAH). Our vaccine targeting LAH of hemagglutinin elicited significant seroconversion and protection against homologous and heterologous influenza virus strains in mice. The vaccine not only significantly reduced lung viral titers, but also induced a well-known bottleneck effect by decreasing virus diversity. In contrast to the classical bottleneck effect, here we showed a significant increase in the frequency of viruses with amino acid sequences identical to that of vaccine targeting LAH domain. No escape mutant emerged after vaccination. These results not only support the potential of a universal influenza vaccine targeting the conserved LAH domains, but also clearly demonstrate that the well-established bottleneck effect on viral quasispecies evolution does not necessarily generate escape mutants. PMID:29587397
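
    Unique molecular identifiers (UMIs) are typically used to group reads that derive from the same original molecule so that a consensus can be called per group, which suppresses PCR and sequencing errors. The abstract does not describe the authors' pipeline, so the following is only a minimal Python sketch of that general idea; the function name, the `min_reads` threshold, and the toy reads are assumptions.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=2):
    """Group (umi, sequence) pairs by UMI and call a majority-vote consensus.

    UMI families with fewer than `min_reads` reads, or with reads of
    mixed length, are skipped in this simplified sketch.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to call a consensus
        if len({len(s) for s in seqs}) != 1:
            continue  # no alignment step here, so require equal lengths
        # Majority base per position; ties go to the base seen first.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy reads (hypothetical data): one family contains a single-base error.
toy_reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACCT"),
    ("GGC", "TTTT"),
    ("CCA", "ACGA"),
    ("CCA", "ACGA"),
]
print(umi_consensus(toy_reads))
```

    Counting distinct consensus sequences per UMI family, rather than raw reads, is what allows variant frequencies in a viral quasispecies to be estimated without inflation from amplification duplicates.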

  12. Short Interspersed Nuclear Element (SINE) Sequences in the Genome of the Human Pathogenic Fungus Aspergillus fumigatus Af293

    PubMed Central

    Kanhayuwa, Lakkhana; Coutts, Robert H. A.

    2016-01-01

    Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4–14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140–493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3’-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50–65% and 60–75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259–343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity. PMID:27736869

  13. Short Interspersed Nuclear Element (SINE) Sequences in the Genome of the Human Pathogenic Fungus Aspergillus fumigatus Af293.

    PubMed

    Kanhayuwa, Lakkhana; Coutts, Robert H A

    2016-01-01

    Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4-14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140-493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3'-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50-65% and 60-75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259-343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity.

  14. Methods, microfluidic devices, and systems for detection of an active enzymatic agent

    DOEpatents

    Sommer, Gregory J; Hatch, Anson V; Singh, Anup K; Wang, Ying-Chih

    2014-10-28

    Embodiments of the present invention provide methods, microfluidic devices, and systems for the detection of an active target agent in a fluid sample. A substrate molecule is used that contains a sequence which may cleave in the presence of an active target agent. A SNAP25 sequence is described, for example, that may be cleaved in the presence of Botulinum Neurotoxin. The substrate molecule includes a reporter moiety. The substrate molecule is exposed to the sample, and resulting reaction products separated using electrophoretic separation. The elution time of the reporter moiety may be utilized to identify the presence or absence of the active target agent.

  15. Uncultivated Microbial Eukaryotic Diversity: A Method to Link ssu rRNA Gene Sequences with Morphology

    PubMed Central

    Hirst, Marissa B.; Kita, Kelley N.; Dawson, Scott C.

    2011-01-01

    Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA “phylotypes” from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages, identified in diverse environments. PMID:22174774

  16. Mutation Analysis of SLC26A4 for Pendred Syndrome and Nonsyndromic Hearing Loss by High-Resolution Melting

    PubMed Central

    Chen, Neng; Tranebjærg, Lisbeth; Rendtorff, Nanna Dahl; Schrijver, Iris

    2011-01-01

    Pendred syndrome and DFNB4 (autosomal recessive nonsyndromic congenital deafness, locus 4) are associated with autosomal recessive congenital sensorineural hearing loss and mutations in the SLC26A4 gene. Extensive allelic heterogeneity, however, necessitates analysis of all exons and splice sites to identify mutations for individual patients. Although Sanger sequencing is the gold standard for mutation detection, screening methods supplemented with targeted sequencing can provide a cost-effective alternative. One such method, denaturing high-performance liquid chromatography, was developed for clinical mutation detection in SLC26A4. However, this method inherently cannot distinguish homozygous changes from wild-type sequences. High-resolution melting (HRM), on the other hand, can detect heterozygous and homozygous changes cost-effectively, without any post-PCR modifications. We developed a closed-tube HRM mutation detection method specific for SLC26A4 that can be used in the clinical diagnostic setting. Twenty-eight primer pairs were designed to cover all 21 SLC26A4 exons and splice junction sequences. Using the resulting amplicons, initial HRM analysis detected all 45 variants previously identified by sequencing. Subsequently, a 384-well plate format was designed for up to three patient samples per run. Blinded HRM testing on these plates of patient samples collected over 1 year in a clinical diagnostic laboratory accurately detected all variants identified by sequencing. In conclusion, HRM with targeted sequencing is a reliable, simple, and cost-effective method for SLC26A4 mutation screening and detection. PMID:21704276

  17. Use of Genome Sequence Information for Meat Quality Trait QTL Mining for Causal Genes and Mutations on Pig Chromosome 17

    PubMed Central

    Hu, Zhi-Liang; Ramos, Antonio M.; Humphray, Sean J.; Rogers, Jane; Reecy, James M.; Rothschild, Max F.

    2011-01-01

    The newly available pig genome sequence has provided new information to fine map quantitative trait loci (QTL) in order to eventually identify causal variants. With targeted genomic sequencing efforts, we were able to obtain high quality BAC sequences that cover a region on pig chromosome 17 where a number of meat quality QTL have been previously discovered. Sequences from 70 BAC clones were assembled to form an 8-Mbp contig. Subsequently, we successfully mapped five previously identified QTL, three for meat color and two for lactate related traits, to the contig. With an additional 25 genetic markers that were identified by sequence comparison, we were able to carry out further linkage disequilibrium analysis to narrow down the genomic locations of these QTL, which allowed identification of the chromosomal regions that likely contain the causative variants. This research has provided one practical approach to combine genetic and molecular information for QTL mining. PMID:22303339

  18. Quantitative phenotyping via deep barcode sequencing

    PubMed Central

    Smith, Andrew M.; Heisler, Lawrence E.; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J.; Chee, Mark; Roth, Frederick P.; Giaever, Guri; Nislow, Corey

    2009-01-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or “Bar-seq,” outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that ∼20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene–environment interactions on a genome-wide scale. PMID:19622793

  19. Identifying MicroRNAs and Transcript Targets in Jatropha Seeds

    PubMed Central

    Galli, Vanessa; Guzman, Frank; de Oliveira, Luiz F. V.; Loss-Morais, Guilherme; Körbes, Ana P.; Silva, Sérgio D. A.; Margis-Pinheiro, Márcia M. A. N.; Margis, Rogério

    2014-01-01

    MicroRNAs, or miRNAs, are endogenously encoded small RNAs that play a key role in diverse plant biological processes. Jatropha curcas L. has received significant attention as a potential oilseed crop for the production of renewable oil. Here, a sRNA library of mature seeds and three mRNA libraries from three different seed development stages were generated by deep sequencing to identify and characterize the miRNAs and pre-miRNAs of J. curcas. Computational analysis was used for the identification of 180 conserved miRNAs and 41 precursors (pre-miRNAs) as well as 16 novel pre-miRNAs. The predicted miRNA target genes are involved in a broad range of physiological functions, including cellular structure, nuclear function, translation, transport, hormone synthesis, defense, and lipid metabolism. Some pre-miRNA and miRNA targets vary in abundance between the three stages of seed development. A search for sequences that produce siRNA was performed, and the results indicated that J. curcas siRNAs play a role in nuclear functions, transport, catalytic processes and disease resistance. This study presents the first large scale identification of J. curcas miRNAs and their targets in mature seeds based on deep sequencing, and it contributes to a functional understanding of these miRNAs. PMID:24551031

  20. Novel kinase fusion transcripts found in endometrial cancer

    PubMed Central

    Tamura, Ryo; Yoshihara, Kosuke; Yamawaki, Kaoru; Suda, Kazuaki; Ishiguro, Tatsuya; Adachi, Sosuke; Okuda, Shujiro; Inoue, Ituro; Verhaak, Roel G. W.; Enomoto, Takayuki

    2015-01-01

    Recent advances in RNA-sequencing technology have enabled the discovery of gene fusion transcripts in the transcriptome of cancer cells. However, it remains difficult to differentiate the therapeutically targetable fusions from passenger events. We have analyzed RNA-sequencing data and DNA copy number data from 25 endometrial cancer cell lines to identify potential therapeutically targetable fusion transcripts, and have identified 124 high-confidence fusion transcripts, of which 69% are associated with gene amplifications. As targetable fusion candidates, we focused on three in-frame kinase fusion transcripts that retain a kinase domain (CPQ-PRKDC, CAPZA2-MET, and VGLL4-PRKG1). We detected only CPQ-PRKDC fusion transcript in three of 122 primary endometrial cancer tissues. Cell proliferation of the fusion-positive cell line was inhibited by knocking down the expression of wild-type PRKDC but not by blocking the CPQ-PRKDC fusion transcript expression. Quantitative real-time RT-PCR demonstrated that the expression of the CPQ-PRKDC fusion transcript was significantly lower than that of wild-type PRKDC, corresponding to a low transcript allele fraction of this fusion, based on RNA-sequencing read counts. In endometrial cancers, the CPQ-PRKDC fusion transcript may be a passenger aberration related to gene amplification. Our findings suggest that transcript allele fraction is a useful predictor to find bona-fide therapeutic-targetable fusion transcripts. PMID:26689674

  1. Novel kinase fusion transcripts found in endometrial cancer.

    PubMed

    Tamura, Ryo; Yoshihara, Kosuke; Yamawaki, Kaoru; Suda, Kazuaki; Ishiguro, Tatsuya; Adachi, Sosuke; Okuda, Shujiro; Inoue, Ituro; Verhaak, Roel G W; Enomoto, Takayuki

    2015-12-22

    Recent advances in RNA-sequencing technology have enabled the discovery of gene fusion transcripts in the transcriptome of cancer cells. However, it remains difficult to differentiate the therapeutically targetable fusions from passenger events. We have analyzed RNA-sequencing data and DNA copy number data from 25 endometrial cancer cell lines to identify potential therapeutically targetable fusion transcripts, and have identified 124 high-confidence fusion transcripts, of which 69% are associated with gene amplifications. As targetable fusion candidates, we focused on three in-frame kinase fusion transcripts that retain a kinase domain (CPQ-PRKDC, CAPZA2-MET, and VGLL4-PRKG1). We detected only CPQ-PRKDC fusion transcript in three of 122 primary endometrial cancer tissues. Cell proliferation of the fusion-positive cell line was inhibited by knocking down the expression of wild-type PRKDC but not by blocking the CPQ-PRKDC fusion transcript expression. Quantitative real-time RT-PCR demonstrated that the expression of the CPQ-PRKDC fusion transcript was significantly lower than that of wild-type PRKDC, corresponding to a low transcript allele fraction of this fusion, based on RNA-sequencing read counts. In endometrial cancers, the CPQ-PRKDC fusion transcript may be a passenger aberration related to gene amplification. Our findings suggest that transcript allele fraction is a useful predictor to find bona-fide therapeutic-targetable fusion transcripts.

  2. In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

    NASA Astrophysics Data System (ADS)

    Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

    2016-03-01

    Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity.

  3. In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

    PubMed Central

    Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

    2016-01-01

    Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity. PMID:26965911

  4. AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.

    PubMed

    Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R

    2015-04-01

    Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
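    The grouping step described in this abstract can be illustrated with a minimal sketch. It is not AmpliVar's actual code: the flank table `FLANKS`, the function `group_reads`, and the `min_count` threshold are hypothetical names chosen for illustration, and exact prefix/suffix matching is an assumed simplification. Reads are counted in a key-value hash, low-abundance sequences are dropped, and each remaining unique sequence is assigned to an amplicon by its flanking sequences before any alignment.

```python
from collections import Counter, defaultdict

# Hypothetical amplicon table: name -> (5' flank, 3' flank).
# These sequences are illustrative only, not taken from AmpliVar.
FLANKS = {
    "ampA": ("ACGT", "TTGA"),
    "ampB": ("GGCC", "AATT"),
}


def group_reads(reads, flanks=FLANKS, min_count=2):
    """Group reads into amplicons without aligning them.

    1. Collapse identical reads into a hash: sequence -> read count.
    2. Drop sequences observed fewer than min_count times.
    3. Assign each remaining sequence to the first amplicon whose
       flanks match its start and end exactly (assumed simplification).
    """
    counts = Counter(reads)
    groups = defaultdict(dict)
    for seq, n in counts.items():
        if n < min_count:
            continue  # low-abundance read, treated as noise
        for name, (left, right) in flanks.items():
            if seq.startswith(left) and seq.endswith(right):
                groups[name][seq] = n
                break
    return dict(groups)
```

    In this sketch each surviving group could then be passed to an aligner as a single unit, which is the property the abstract credits for allowing more sensitive alignment tools.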

  5. Development of Genetic Markers in Eucalyptus Species by Target Enrichment and Exome Sequencing

    PubMed Central

    Dasgupta, Modhumita Ghosh; Dharanishanthi, Veeramuthu; Agarwal, Ishangi; Krutovsky, Konstantin V.

    2015-01-01

    The advent of next-generation sequencing has facilitated large-scale discovery, validation and assessment of genetic markers for high density genotyping. The present study was undertaken to identify markers in genes supposedly related to wood property traits in three Eucalyptus species. Ninety four genes involved in xylogenesis were selected for hybridization probe based nuclear genomic DNA target enrichment and exome sequencing. Genomic DNA was isolated from the leaf tissues and used for on-array probe hybridization followed by Illumina sequencing. The raw sequence reads were trimmed, high-quality reads were mapped to the E. grandis reference sequence, and single nucleotide variants (SNVs) and insertions/deletions (InDels) were identified across the three species. The average read coverage was 216X and a total of 2294 SNVs and 479 InDels were discovered in E. camaldulensis, 2383 SNVs and 518 InDels in E. tereticornis, and 1228 SNVs and 409 InDels in E. grandis. Additionally, SNV calling and InDel detection were conducted in pair-wise comparisons of E. tereticornis vs. E. grandis, E. camaldulensis vs. E. tereticornis and E. camaldulensis vs. E. grandis. This study presents an efficient, high-throughput method for developing genetic markers for family-based QTL and association analysis in Eucalyptus. PMID:25602379

  6. Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

    PubMed

    Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

    2016-01-01

    Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.

  7. Targeted next-generation sequencing helps to decipher the genetic and phenotypic heterogeneity of hypertrophic cardiomyopathy

    PubMed Central

    Cecconi, Massimiliano; Parodi, Maria I.; Formisano, Francesco; Spirito, Paolo; Autore, Camillo; Musumeci, Maria B.; Favale, Stefano; Forleo, Cinzia; Rapezzi, Claudio; Biagini, Elena; Davì, Sabrina; Canepa, Elisabetta; Pennese, Loredana; Castagnetta, Mauro; Degiorgio, Dario; Coviello, Domenico A.

    2016-01-01

    Hypertrophic cardiomyopathy (HCM) is mainly associated with myosin, heavy chain 7 (MYH7) and myosin binding protein C, cardiac (MYBPC3) mutations. In order to better explain the clinical and genetic heterogeneity in HCM patients, in this study, we implemented a target-next generation sequencing (NGS) assay. An Ion AmpliSeq™ Custom Panel was created for the enrichment of 19 genes, 9 of which did not encode thick/intermediate and thin myofilament (TTm) proteins, including 3 responsible for HCM phenocopy. Ninety-two DNA samples were analyzed by the Ion Personal Genome Machine: 73 DNA samples (training set), previously genotyped in some of the genes by Sanger sequencing, were used to optimize the NGS strategy, whereas 19 DNA samples (discovery set) allowed the evaluation of NGS performance. In the training set, we identified 72 out of 73 expected mutations and 15 additional mutations: the molecular diagnosis was achieved in one patient with a previously wild-type status and the pre-excitation syndrome was explained in another. In the discovery set, we identified 20 mutations, 5 of which were in genes encoding non-TTm proteins, increasing the diagnostic yield by approximately 20%: a single mutation in genes encoding non-TTm proteins was identified in 2 out of 3 borderline HCM patients, whereas co-occurring mutations in genes encoding TTm and galactosidase alpha (GLA) altered proteins were characterized in a male with HCM and multiorgan dysfunction. Our combined targeted NGS-Sanger sequencing-based strategy allowed the molecular diagnosis of HCM with greater efficiency than using the conventional (Sanger) sequencing alone. Mutant alleles encoding non-TTm proteins may aid in the complete understanding of the genetic and phenotypic heterogeneity of HCM: co-occurring mutations of genes encoding TTm and non-TTm proteins could explain the wide variability of the HCM phenotype, whereas mutations in genes encoding only the non-TTm proteins are identifiable in patients with a milder HCM status. PMID:27600940

  8. Genomes2Drugs: Identifies Target Proteins and Lead Drugs from Proteome Data

    PubMed Central

    Toomey, David; Hoppe, Heinrich C.; Brennan, Marian P.; Nolan, Kevin B.; Chubb, Anthony J.

    2009-01-01

    Background: Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography is impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. Methodology/Principal Findings: To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. Conclusions/Significance: Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under ‘change-of-application’ patents. PMID:19593435
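    The three ranking criteria named in this abstract can be sketched as a simple filter-then-sort. This is an illustrative assumption, not the Genomes2Drugs implementation: the `Hit` record, its identity fields, the `rank_candidates` function, and the `human_cutoff` value are all hypothetical, and the abstract does not state which scores or thresholds the real resource uses.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical per-protein summary of best homology scores (0 to 1)."""
    protein: str
    pdb_identity: float          # best match to a crystallized protein
    drug_target_identity: float  # best match to a known drug target
    human_identity: float        # best match to any human protein


def rank_candidates(hits, human_cutoff=0.4):
    """Rank proteins for drug discovery under the assumed scoring.

    Proteins at or above human_cutoff identity to a human protein are
    removed (criterion iii); the rest are sorted by their stronger match
    to a crystallized protein or a known drug target (criteria i and ii).
    """
    kept = [h for h in hits if h.human_identity < human_cutoff]
    return sorted(
        kept,
        key=lambda h: max(h.pdb_identity, h.drug_target_identity),
        reverse=True,
    )
```

    With such a scheme, a pathogen protein that closely resembles a crystallized protein but also a human protein is excluded rather than ranked low, which mirrors the off-target toxicity concern raised in the abstract.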

  9. Targeted sequencing-based analyses of candidate gene variants in ulcerative colitis-associated colorectal neoplasia.

    PubMed

    Chakrabarty, Sanjiban; Varghese, Vinay Koshy; Sahu, Pranoy; Jayaram, Pradyumna; Shivakumar, Bhadravathi M; Pai, Cannanore Ganesh; Satyamoorthy, Kapaettu

    2017-06-27

    Long-standing ulcerative colitis (UC) leading to colorectal cancer (CRC) is one of the most serious and life-threatening consequences acknowledged globally. Ulcerative colitis-associated colorectal carcinogenesis showed distinct molecular alterations when compared with sporadic colorectal carcinoma. Targeted sequencing of 409 genes in tissue samples of 18 long-standing UC subjects at high risk of colorectal carcinoma (UCHR) was performed to identify somatic driver mutations, which may be involved in the molecular changes during the transformation of non-dysplastic mucosa to high-grade dysplasia. Findings from the study are also compared with previously published genome wide and exome sequencing data in inflammatory bowel disease-associated and sporadic colorectal carcinoma. Next-generation sequencing analysis identified 1107 mutations in 275 genes in UCHR subjects. In addition to TP53 (17%) and KRAS (22%) mutations, recurrent mutations in APC (33%), ACVR2A (61%), ARID1A (44%), RAF1 (39%) and MTOR (61%) were observed in UCHR subjects. In addition, APC, FGFR3, FGFR2 and PIK3CA driver mutations were identified in UCHR subjects. Recurrent mutations in ARID1A (44%), SMARCA4 (17%), MLL2 (44%), MLL3 (67%), SETD2 (17%) and TET2 (50%) genes involved in histone modification and chromatin remodelling were identified in UCHR subjects. Our study identifies new oncogenic driver mutations which may be involved in the transition of non-dysplastic cells to dysplastic phenotype in the subjects with long-standing UC with high risk of progression into colorectal neoplasia.

  10. Mutation Scanning in Wheat by Exon Capture and Next-Generation Sequencing.

    PubMed

    King, Robert; Bird, Nicholas; Ramirez-Gonzalez, Ricardo; Coghill, Jane A; Patil, Archana; Hassani-Pak, Keywan; Uauy, Cristobal; Phillips, Andrew L

    2015-01-01

    Targeted Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach to identify novel sequence variation in genomes, with the aims of investigating gene function and/or developing useful alleles for breeding. Despite recent advances in wheat genomics, most current TILLING methods are low to medium in throughput, being based on PCR amplification of the target genes. We performed a pilot-scale evaluation of TILLING in wheat by next-generation sequencing through exon capture. An oligonucleotide-based enrichment array covering ~2 Mbp of wheat coding sequence was used to carry out exon capture and sequencing on three mutagenised lines of wheat containing previously-identified mutations in the TaGA20ox1 homoeologous genes. After testing different mapping algorithms and settings, candidate SNPs were identified by mapping to the IWGSC wheat Chromosome Survey Sequences. Where sequence data for all three homoeologues were found in the reference, mutant calls were unambiguous; however, where the reference lacked one or two of the homoeologues, captured reads from these genes were mis-mapped to other homoeologues, resulting either in dilution of the variant allele frequency or assignment of mutations to the wrong homoeologue. Competitive PCR assays were used to validate the putative SNPs and estimate cut-off levels for SNP filtering. At least 464 high-confidence SNPs were detected across the three mutagenized lines, including the three known alleles in TaGA20ox1, indicating a mutation rate of ~35 SNPs per Mb, similar to that estimated by PCR-based TILLING. This demonstrates the feasibility of using exon capture for genome re-sequencing as a method of mutation detection in polyploid wheat, but accurate mutation calling will require an improved genomic reference with more comprehensive coverage of homoeologues.

  11. A 5.8S nuclear ribosomal RNA gene sequence database: applications to ecology and evolution

    NASA Technical Reports Server (NTRS)

    Cullings, K. W.; Vogler, D. R.

    1998-01-01

    We compiled a 5.8S nuclear ribosomal gene sequence database for animals, plants, and fungi using both newly generated and GenBank sequences. We demonstrate the utility of this database as an internal check to determine whether the target organism and not a contaminant has been sequenced, as a diagnostic tool for ecologists and evolutionary biologists to determine the placement of asexual fungi within larger taxonomic groups, and as a tool to help identify fungi that form ectomycorrhizae.

  12. Development of an oligonucleotide probe for Aureobasidium pullulans based on the small-subunit rRNA gene.

    PubMed Central

    Li, S; Cullen, D; Hjort, M; Spear, R; Andrews, J H

    1996-01-01

    Aureobasidium pullulans, a cosmopolitan yeast-like fungus, colonizes leaf surfaces and has potential as a biocontrol agent of pathogens. To assess the feasibility of rRNA as a target for A. pullulans-specific oligonucleotide probes, we compared the nucleotide sequences of the small-subunit rRNA (18S) genes of 12 geographically diverse A. pullulans strains. Extreme sequence conservation was observed. The consensus A. pullulans sequence was compared with other fungal sequences to identify potential probes. A 21-mer probe which hybridized to the 12 A. pullulans strains but not to 98 other fungi, including 82 isolates from the phylloplane, was identified. A 17-mer highly specific for Cladosporium herbarum was also identified. These probes have potential in monitoring and quantifying fungi in leaf surface and other microbial communities. PMID:8633850

  13. Targeted next-generation sequencing identification of mutations in disease resistance gene analogs (RGAs) in wild and cultivated beets

    USDA-ARS's Scientific Manuscript database

    Resistance gene analogs (RGAs) were searched bioinformatically in the sugar beet (Beta vulgaris L.) genome as potential candidates for improving resistance against different diseases. In the present study, Ion Torrent sequencing technology was used to identify mutations in 21 RGAs. The DNA samples o...

  14. Exome sequencing of a multigenerational human pedigree.

    PubMed

    Hedges, Dale J; Hedges, Dale; Burges, Dan; Powell, Eric; Almonte, Cherylyn; Huang, Jia; Young, Stuart; Boese, Benjamin; Schmidt, Mike; Pericak-Vance, Margaret A; Martin, Eden; Zhang, Xinmin; Harkins, Timothy T; Züchner, Stephan

    2009-12-14

    Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon the effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or approximately 180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1 M BeadChips genotypes was 95.2% at ≥10x sequence. Further, we propose an advantageous genotype calling strategy for low covered targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.
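
    The coverage figures quoted in this abstract (fraction of targeted bases at or above a given read depth) are a simple summary of per-base depth. The following is a minimal sketch of that summary, assuming the per-base depths are already available as a list of integers; the function name `fraction_covered` and the toy depths are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, Sequence


def fraction_covered(depths: Sequence[int],
                     thresholds: Iterable[int] = (3, 10, 20)) -> Dict[int, float]:
    """Return, for each threshold, the fraction of target bases whose
    read depth is greater than or equal to that threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    return {t: sum(d >= t for d in depths) / n for t in thresholds}


# Toy per-base depths for ten target positions (illustrative only).
toy_depths = [0, 2, 3, 5, 10, 12, 20, 25, 30, 40]
print(fraction_covered(toy_depths))
```

    In practice the depth values would come from an alignment-based depth report restricted to the capture targets; the thresholds 3, 10, and 20 simply mirror those reported above.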

  15. Identification of a novel LMF1 nonsense mutation responsible for severe hypertriglyceridemia by targeted next-generation sequencing.

    PubMed

    Cefalù, Angelo B; Spina, Rossella; Noto, Davide; Ingrassia, Valeria; Valenti, Vincenza; Giammanco, Antonina; Fayer, Francesca; Misiano, Gabriella; Cocorullo, Gianfranco; Scrimali, Chiara; Palesano, Ornella; Altieri, Grazia I; Ganci, Antonina; Barbagallo, Carlo M; Averna, Maurizio R

    Severe hypertriglyceridemia (HTG) may result from mutations in genes affecting the intravascular lipolysis of triglyceride (TG)-rich lipoproteins. The aim of this study was to develop a targeted next-generation sequencing panel for the molecular diagnosis of disorders characterized by severe HTG. We developed a customized targeted panel for next-generation sequencing on the Ion Torrent Personal Genome Machine to capture the coding exons and intron/exon boundaries of 18 genes affecting the main pathways of TG synthesis and metabolism. We sequenced 11 samples from patients with severe HTG (TG >885 mg/dL, i.e., 10 mmol/L): 4 positive controls in whom pathogenic mutations had previously been identified by Sanger sequencing and 7 patients in whom the molecular defect was still unknown. The customized panel was accurate, and it confirmed the genetic variants previously identified in all positive controls with primary severe HTG. Only 1 of the 7 patients with HTG was found to carry a homozygous pathogenic mutation, the third novel mutation of the LMF1 gene (c.1380C>G, p.Y460X). Clinical and molecular familial cascade screening allowed the identification of 2 additional affected siblings and 7 heterozygous carriers of the mutation. We showed that our targeted resequencing approach for genetic diagnosis of severe HTG appears to be accurate, less time consuming, and more economical compared with traditional Sanger resequencing. The identification of pathogenic mutations in candidate genes remains challenging, and clinical resequencing should mainly be intended for patients with strong clinical criteria for monogenic severe HTG. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.

  16. Genome wide identification of microRNAs involved in fatty acid and lipid metabolism of Brassica napus by small RNA and degradome sequencing.

    PubMed

    Wang, Zhiwei; Qiao, Yan; Zhang, Jingjing; Shi, Wenhui; Zhang, Jinwen

    2017-07-01

    Rapeseed (Brassica napus) is an important cash crop, considered the third largest oil crop worldwide. Rapeseed oil contains various saturated and unsaturated fatty acids; these fatty acids, which can be incorporated into TAG to form lipids stored in seeds, play various roles in metabolic activity. The different fatty acids in B. napus seeds determine oil quality and define whether the oil is edible or must be used as industrial material. miRNAs are a kind of non-coding sRNA that can regulate gene expression through post-transcriptional modification of their target transcripts, playing important roles in plant metabolic activities. We employed high-throughput sequencing to identify the miRNAs and their target transcripts involved in fatty acid and lipid metabolism at different developmental stages of B. napus seeds. As a result, we identified 826 miRNA sequences, including 523 conserved and 303 novel miRNAs. From the degradome sequencing, we found that 589 mRNAs could be targeted by 236 miRNAs, comprising 49 novel and 187 conserved miRNAs. The miRNA-target pairs suggest that bna-5p-163957_18, bna-5p-396192_7, miR9563a-p3, miR9563b-p5, miR838-p3, miR156e-p3, miR159c and miR1134 could target PDP, LACS9, MFPA, ADSL1, ACO32, C0401, GDL73, PlCD6, OLEO3 and WSD1. These target transcripts are involved in acetyl-CoA generation and carbon chain desaturation, and in regulating the levels of very long chain fatty acids, β-oxidation, and lipid transport and metabolism. At the same time, we employed qPCR to validate the expression of miRNAs and their target transcripts involved in fatty acid and lipid metabolism; the results suggested that the expression of the miRNAs and their target transcripts is negatively correlated, consistent with the relationship between a miRNA and its target transcript. The study findings suggest that the identified miRNAs may play an important role in fatty acid and lipid metabolism in seeds of B. napus. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  17. Individual microRNAs (miRNAs) display distinct mRNA targeting "rules".

    PubMed

    Wang, Wang-Xia; Wilfred, Bernard R; Xie, Kevin; Jennings, Mary H; Hu, Yanling Hu; Stromberg, Arnold J; Nelson, Peter T

    2010-01-01

    MicroRNAs (miRNAs) guide Argonaute (AGO)-containing microribonucleoprotein (miRNP) complexes to target mRNAs. It has been assumed that miRNAs behave similarly to each other with regard to mRNA target recognition. The usual assumptions, which are based on prior studies, are that miRNAs target preferentially sequences in the 3'UTR of mRNAs, guided by the 5' "seed" portion of the miRNAs. Here we isolated AGO- and miRNA-containing miRNPs from human H4 tumor cells by co-immunoprecipitation (co-IP) with anti-AGO antibody. Cells were transfected with miR-107, miR-124, miR-128, miR-320, or a negative control miRNA. Co-IPed RNAs were subjected to downstream high-density Affymetrix Human Gene 1.0 ST microarray analyses using an assay we validated previously, a "RIP-Chip" experimental design. RIP-Chip data provided a list of mRNAs recruited into the AGO-miRNP in correlation to each miRNA. These experimentally identified miRNA targets were analyzed for complementary six nucleotide "seed" sequences within the transfected miRNAs. We found that miR-124 targets tended to have sequences in the 3'UTR that would be recognized by the 5' seed of miR-124, as described in previous studies. By contrast, miR-107 targets tended to have 'seed' sequences in the mRNA open reading frame, but not the 3' UTR. Further, mRNA targets of miR-128 and miR-320 are less enriched for 6-mer seed sequences in comparison to miR-107 and miR-124. In sum, our data support the importance of the 5' seed in determining binding characteristics for some miRNAs; however, the "binding rules" are complex, and individual miRNAs can have distinct sequence determinants that lead to mRNA targeting.
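
    The search for complementary six-nucleotide "seed" sequences mentioned in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' analysis code: it assumes the seed is taken as miRNA positions 2 to 7 (a common convention, not stated in the abstract), and the function names and the toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 2, length: int = 6) -> str:
    """Return the mRNA site (5'->3', RNA alphabet) that is perfectly
    complementary to the miRNA seed beginning at 1-based position `start`."""
    rna = mirna.upper().replace("T", "U")
    seed = rna[start - 1:start - 1 + length]
    if len(seed) != length:
        raise ValueError("miRNA is too short for the requested seed")
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_seed_sites(mirna: str, region: str) -> int:
    """Count occurrences (overlaps allowed) of the seed-complementary
    6-mer in an mRNA region such as a 3'UTR or an open reading frame."""
    site = seed_site(mirna)
    rna = region.upper().replace("T", "U")
    return sum(rna[i:i + len(site)] == site
               for i in range(len(rna) - len(site) + 1))


# Toy inputs for illustration only.
toy_mirna = "UAAGGCACGCGGUGAAUGCC"
toy_utr = "AAAUGCCUUAAAUGCCUU"
print(seed_site(toy_mirna), count_seed_sites(toy_mirna, toy_utr))
```

    Running the same count separately on the 3'UTR and the open reading frame of each recovered mRNA is one way to compare where seed matches fall for different miRNAs.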

  18. Capturing Attention When Attention "Blinks"

    ERIC Educational Resources Information Center

    Wee, Serena; Chua, Fook K.

    2004-01-01

    Four experiments addressed the question of whether attention may be captured when the visual system is in the midst of an attentional blink (AB). Participants identified 2 target letters embedded among distractor letters in a rapid serial visual presentation sequence. In some trials, a square frame was inserted between the targets; as the only…

  19. Evaluation of Phage Display Discovered Peptides as Ligands for Prostate-Specific Membrane Antigen (PSMA)

    PubMed Central

    Edwards, W. Barry

    2013-01-01

    The aim of this study was to identify potential ligands of PSMA suitable for further development as novel PSMA-targeted peptides using phage display technology. The human PSMA protein was immobilized as a target followed by incubation with a 15-mer phage display random peptide library. After one round of prescreening and two rounds of screening, high-stringency screening at the third round of panning was performed to identify the highest affinity binders. Phages which had a specific binding activity to PSMA in human prostate cancer cells were isolated and the DNA corresponding to the 15-mers were sequenced to provide three consensus sequences: GDHSPFT, SHFSVGS and EVPRLSLLAVFL as well as other sequences that did not display consensus. Two of the peptide sequences deduced from DNA sequencing of binding phages, SHSFSVGSGDHSPFT and GRFLTGGTGRLLRIS were labeled with 5-carboxyfluorescein and shown to bind and co-internalize with PSMA on human prostate cancer cells by fluorescence microscopy. The high stringency requirements yielded peptides with affinities KD∼1 µM or greater which are suitable starting points for affinity maturation. While these values were less than anticipated, the high stringency did yield peptide sequences that apparently bound to different surfaces on PSMA. These peptide sequences could be the basis for further development of peptides for prostate cancer tumor imaging and therapy. PMID:23935860

  20. Characterizing protein domain associations by Small-molecule ligand binding

    PubMed Central

    Li, Qingliang; Cheng, Tiejun; Wang, Yanli; Bryant, Stephen H.

    2012-01-01

    Background Protein domains are evolutionarily conserved building blocks for protein structure and function, which are conventionally identified based on protein sequence or structure similarity. Small molecule binding domains are of great importance for the recognition of small molecules in biological systems and drug development. Many small molecules, including drugs, have been increasingly identified to bind to multiple targets, leading to promiscuous interactions with protein domains. Thus, a large scale characterization of the protein domains and their associations with respect to small-molecule binding is of particular interest to system biology research, drug target identification, as well as drug repurposing. Methods We compiled a collection of 13,822 physical interactions of small molecules and protein domains derived from the Protein Data Bank (PDB) structures. Based on the chemical similarity of these small molecules, we characterized pairwise associations of the protein domains and further investigated their global associations from a network point of view. Results We found that protein domains, despite lack of similarity in sequence and structure, were comprehensively associated through binding the same or similar small-molecule ligands. Moreover, we identified modules in the domain network that consisted of closely related protein domains by sharing similar biochemical mechanisms, being involved in relevant biological pathways, or being regulated by the same cognate cofactors. Conclusions A novel protein domain relationship was identified in the context of small-molecule binding, which is complementary to those identified by traditional sequence-based or structure-based approaches. The protein domain network constructed in the present study provides a novel perspective for chemogenomic study and network pharmacology, as well as target identification for drug repurposing. PMID:23745168

  1. In silico identification and characterization of conserved miRNAs and their target genes in sweet potato (Ipomoea batatas L.) Expressed Sequence Tags (ESTs)

    PubMed Central

    Dehury, Budheswar; Panda, Debashis; Sahu, Jagajjit; Sahu, Mousumi; Sarma, Kishore; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra Kumar

    2013-01-01

    The endogenous small non-coding micro RNAs (miRNAs), which are typically ~21–24 nucleotides (nt) long, play a crucial role in regulating the intrinsic normal growth of cells and development of the plants as well as in maintaining the integrity of genomes. These small non-coding RNAs function as the universal specificity factors in post-transcriptional gene silencing. Discovering miRNAs, identifying their targets, and further inferring miRNA functions is a routine process to understand normal biological processes of miRNAs and their roles in the development of plants. Comparative genomics-based approaches using expressed sequence tags (EST) and genome survey sequences (GSS) offer a cost-effective platform for identification and characterization of miRNAs and their target genes in plants. Despite the fact that sweet potato (Ipomoea batatas L.) is an important staple food source for poor small farmers throughout the world, the role of miRNA in various developmental processes remains largely unknown. In this paper, we report the computational identification of miRNAs and their target genes in sweet potato from their ESTs. Using a comparative genomics-based approach, 8 potential miRNA candidates belonging to miR168, miR2911, and miR156 families were identified from 23 406 ESTs in sweet potato. A total of 42 target genes were predicted and their probable functions were illustrated. Most of the newly identified miRNAs target transcription factors as well as genes involved in plant growth and development, signal transduction, metabolism, defense, and stress response. The identification of miRNAs and their targets is expected to accelerate the pace of miRNA discovery, leading to an improved understanding of the role of miRNA in development and physiology of sweet potato, as well as stress response. PMID:24067297

  2. Tyrosine kinome sequencing of pediatric acute lymphoblastic leukemia: a report from the Children's Oncology Group TARGET Project | Office of Cancer Genomics

    Cancer.gov

    TARGET researchers sequenced the tyrosine kinome and downstream signaling genes in 45 high-risk pediatric ALL cases with activated kinase signaling, including Ph-like ALL, to establish the incidence of tyrosine kinase mutations in this cohort. The study confirmed previously identified somatic mutations in JAK and FLT3, but did not find novel alterations in any additional tyrosine kinases or downstream genes. The mechanism of kinase signaling activation in this high-risk subgroup of pediatric ALL remains largely unknown.

  3. Diagnostic Yield of Next-Generation Sequencing in Very Early-Onset Inflammatory Bowel Diseases: A Multicenter Study.

    PubMed

    Charbit-Henrion, Fabienne; Parlato, Marianna; Hanein, Sylvain; Duclaux-Loras, Rémi; Nowak, Jan; Begue, Bernadette; Rakotobe, Sabine; Bruneau, Julie; Fourrage, Cécile; Alibeu, Olivier; Rieux-Laucat, Frédéric; Lévy, Eva; Stolzenberg, Marie-Claude; Mazerolles, Fabienne; Latour, Sylvain; Lenoir, Christelle; Fischer, Alain; Picard, Capucine; Aloi, Marina; Amil Dias, Jorge; Ben Hariz, Mongi; Bourrier, Anne; Breuer, Christian; Breton, Anne; Bronski, Jiri; Buderus, Stephan; Cananzi, Mara; Coopman, Stéphanie; Crémilleux, Clara; Dabadie, Alain; Dumant-Forest, Clémentine; Egritas Gurkan, Odul; Fabre, Alexandre; Fischer, Aude; German Diaz, Marta; Gonzalez-Lama, Yago; Goulet, Olivier; Guariso, Graziella; Gurcan, Neslihan; Homan, Matjaz; Hugot, Jean-Pierre; Jeziorski, Eric; Karanika, Evi; Lachaux, Alain; Lewindon, Peter; Lima, Rosa; Magro, Fernando; Major, Janos; Malamut, Georgia; Mas, Emmanuel; Mattyus, Istvan; Mearin, Luisa M; Melek, Jan; Navas-Lopez, Victor Manuel; Paerregaard, Anders; Pelatan, Cecile; Pigneur, Bénédicte; Pinto Pais, Isabel; Rebeuh, Julie; Romano, Claudio; Siala, Nadia; Strisciuglio, Caterina; Tempia-Caliera, Michela; Tounian, Patrick; Turner, Dan; Urbonas, Vaidotas; Willot, Stéphanie; Ruemmele, Frank M; Cerf-Bensussan, Nadine

    2018-05-18

    An expanding number of monogenic defects have been identified as causative of severe forms of very early-onset inflammatory bowel diseases (VEO-IBD). The present study aimed at defining how next-generation sequencing (NGS) methods can be used to improve identification of known molecular diagnosis and adapt treatment. 207 children were recruited in 45 Paediatric centres through an international collaborative network (ESPGHAN GENIUS working group) with a clinical presentation of severe VEO-IBD (n=185) or an anamnesis suggestive of a monogenic disorder (n=22). Patients were divided at inclusion into three phenotypic subsets: predominantly small bowel inflammation, colitis with perianal lesions, and colitis only. Methods to obtain molecular diagnosis included functional tests followed by specific Sanger sequencing, custom-made targeted NGS, and in selected cases whole exome sequencing (WES) of parents-child trios. Genetic findings were validated clinically and/or functionally. Molecular diagnosis was achieved in 66/207 children (32%): 61% with small bowel inflammation, 39% with colitis and perianal lesions and 18% with colitis only. Targeted NGS pinpointed gene mutations causative of atypical presentations and identified large exonic copy number variations previously missed by WES. Our results lead us to propose an optimised diagnostic strategy to identify known monogenic causes of severe IBD.

  4. Novel mutations in LRP6 highlight the role of WNT signaling in tooth agenesis

    PubMed Central

    Ludwig, Kerstin U.; Sullivan, Robert; van Rooij, Iris A.L.M.; Thonissen, Michelle; Swinnen, Steven; Phan, Milien; Conte, Federica; Ishorst, Nina; Gilissen, Christian; RoaFuentes, Laury; van de Vorst, Maartje; Henkes, Arjen; Steehouwer, Marloes; van Beusekom, Ellen; Bloemen, Marjon; Vankeirsbilck, Bruno; Bergé, Stefaan; Hens, Greet; Schoenaers, Joseph; Poorten, Vincent Vander; Roosenboom, Jasmien; Verdonck, An; Devriendt, Koen; Roeleveldt, Nel; Jhangiani, Shalini N.; Vissers, Lisenka E.L.M.; Lupski, James R.; de Ligt, Joep; Von den Hoff, Johannes W.; Pfundt, Rolph; Brunner, Han G.; Zhou, Huiqing; Dixon, Jill; Mangold, Elisabeth; van Bokhoven, Hans; Dixon, Michael J.; Kleefstra, Tjitske

    2016-01-01

    Purpose Here we aimed to identify a novel genetic cause of tooth agenesis (TA) and/or orofacial clefting (OFC) by combining whole exome sequencing (WES) and targeted re-sequencing in a large cohort of TA and OFC patients. Methods WES was performed in two unrelated patients, one with severe TA and OFC and another with severe TA only. After identifying deleterious mutations in a gene encoding the low density lipoprotein receptor-related protein 6 (LRP6), all its exons were re-sequenced with molecular inversion probes, in 67 patients with TA, 1,072 patients with OFC and in 706 controls. Results We identified a frameshift (c.4594delG, p.Cys1532fs) and a canonical splice site mutation (c.3398-2A>C, p.?) in LRP6 respectively in the patient with TA and OFC, and in the patient with severe TA only. The targeted re-sequencing showed significant enrichment of unique LRP6 variants in TA patients, but not in nonsyndromic OFC. From the 5 variants in patients with TA, 2 affect the canonical splice site and 3 were missense variants; all variants segregated with the dominant phenotype and in 1 case the missense mutation occurred de novo. Conclusion Mutations in LRP6 cause tooth agenesis in man. PMID:26963285

  5. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    PubMed

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  6. Genetic analysis of a Chinese family with members affected with Usher syndrome type II and Waardenburg syndrome type IV.

    PubMed

    Wang, Xueling; Lin, Xiao-Jiang; Tang, Xiangrong; Chai, Yong-Chuan; Yu, De-Hong; Chen, Dong-Ye; Wu, Hao

    2017-11-01

    The purpose of this study was to identify the genetic causes in a family presenting with multiple symptoms overlapping Usher syndrome type II (USH2) and Waardenburg syndrome type IV (WS4). Targeted next-generation sequencing including the exon and flanking intron sequences of 79 deafness genes was performed on the proband. Co-segregation of the disease phenotype and the detected variants was confirmed in all family members by PCR amplification and Sanger sequencing. The affected members of this family had two different recessive disorders, USH2 and WS4. By targeted next-generation sequencing, we identified that USH2 was caused by a novel missense mutation, p.V4907D in GPR98, whereas WS4 was due to p.V185M in EDNRB. This is the first report of a homozygous p.V185M mutation in EDNRB in a patient with WS4. This study reported a Chinese family with multiple independent and overlapping phenotypes. In such cases, molecular-level analysis was efficient in identifying the causative variants p.V4907D in GPR98 and p.V185M in EDNRB, and was also helpful in confirming the clinical diagnoses of USH2 and WS4. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. DNA-based stable isotope probing coupled with cultivation methods implicates Methylophaga in hydrocarbon degradation

    PubMed Central

    Mishamandani, Sara; Gutierrez, Tony; Aitken, Michael D.

    2014-01-01

    Marine hydrocarbon-degrading bacteria perform a fundamental role in the oxidation and ultimate removal of crude oil and its petrochemical derivatives in coastal and open ocean environments. Those with an almost exclusive ability to utilize hydrocarbons as a sole carbon and energy source have been found confined to just a few genera. Here we used stable isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate hydrocarbon-degrading bacteria in coastal North Carolina sea water (Beaufort Inlet, USA) with uniformly labeled [13C]n-hexadecane. The dominant sequences in clone libraries constructed from 13C-enriched bacterial DNA (from n-hexadecane enrichments) were identified to belong to the genus Alcanivorax, with ≤98% sequence identity to the closest type strain—thus representing a putative novel phylogenetic taxon within this genus. Unexpectedly, we also identified 13C-enriched sequences in heavy DNA fractions that were affiliated to the genus Methylophaga. This is a contentious group since, though some of its members have been proposed to degrade hydrocarbons, substantive evidence has not previously confirmed this. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Alcanivorax and Methylophaga to determine their abundance in incubations amended with unlabeled n-hexadecane. Both showed substantial increases in gene copy number during the experiments. Subsequently, we isolated a strain representing the SIP-identified Methylophaga sequences (99.9% 16S rRNA gene sequence identity) and used it to show, for the first time, direct evidence of hydrocarbon degradation by a cultured Methylophaga sp. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of hydrocarbon-degrading bacteria in the marine environment. PMID:24578702

  8. Bias-Corrected Targeted Next-Generation Sequencing for Rapid, Multiplexed Detection of Actionable Alterations in Cell-Free DNA from Advanced Lung Cancer Patients.

    PubMed

    Paweletz, Cloud P; Sacher, Adrian G; Raymond, Chris K; Alden, Ryan S; O'Connell, Allison; Mach, Stacy L; Kuang, Yanan; Gandhi, Leena; Kirschmeier, Paul; English, Jessie M; Lim, Lee P; Jänne, Pasi A; Oxnard, Geoffrey R

    2016-02-15

    Tumor genotyping is a powerful tool for guiding non-small cell lung cancer (NSCLC) care; however, comprehensive tumor genotyping can be logistically cumbersome. To facilitate genotyping, we developed a next-generation sequencing (NGS) assay using a desktop sequencer to detect actionable mutations and rearrangements in cell-free plasma DNA (cfDNA). An NGS panel was developed targeting 11 driver oncogenes found in NSCLC. Targeted NGS was performed using a novel methodology that maximizes on-target reads, and minimizes artifact, and was validated on DNA dilutions derived from cell lines. Plasma NGS was then blindly performed on 48 patients with advanced, progressive NSCLC and a known tumor genotype, and explored in two patients with incomplete tumor genotyping. NGS could identify mutations present in DNA dilutions at ≥ 0.4% allelic frequency with 100% sensitivity/specificity. Plasma NGS detected a broad range of driver and resistance mutations, including ALK, ROS1, and RET rearrangements, HER2 insertions, and MET amplification, with 100% specificity. Sensitivity was 77% across 62 known driver and resistance mutations from the 48 cases; in 29 cases with common EGFR and KRAS mutations, sensitivity was similar to droplet digital PCR. In two cases with incomplete tumor genotyping, plasma NGS rapidly identified a novel EGFR exon 19 deletion and a missed case of MET amplification. Blinded to tumor genotype, this plasma NGS approach detected a broad range of targetable genomic alterations in NSCLC with no false positives including complex mutations like rearrangements and unexpected resistance mutations such as EGFR C797S. Through use of widely available vacutainers and a desktop sequencing platform, this assay has the potential to be implemented broadly for patient care and translational research. ©2015 American Association for Cancer Research.
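
    The sensitivity and specificity figures reported in this abstract follow the standard definitions based on true/false positives and negatives. The sketch below shows those definitions only; the counts used in the example are toy values chosen for illustration and are not the study's data, and the function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known-positive alterations that the assay detected."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of known-negative cases that the assay called negative."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative cases to evaluate")
    return true_neg / total


# Toy counts (illustrative only): 3 of 4 known mutations detected,
# and no false-positive calls among 5 negative cases.
print(sensitivity(true_pos=3, false_neg=1), specificity(true_neg=5, false_pos=0))
```

    An assay with no false positives has a specificity of 1.0 regardless of how many true alterations it misses, which is why the abstract reports the two measures separately.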

  9. Bias-corrected targeted next-generation sequencing for rapid, multiplexed detection of actionable alterations in cell-free DNA from advanced lung cancer patients

    PubMed Central

    Paweletz, Cloud P.; Sacher, Adrian G.; Raymond, Chris K.; Alden, Ryan S.; O'Connell, Allison; Mach, Stacy L.; Kuang, Yanan; Gandhi, Leena; Kirschmeier, Paul; English, Jessie M.; Lim, Lee P.; Jänne, Pasi A.; Oxnard, Geoffrey R.

    2015-01-01

    Purpose Tumor genotyping is a powerful tool for guiding non-small cell lung cancer (NSCLC) care, however comprehensive tumor genotyping can be logistically cumbersome. To facilitate genotyping, we developed a next-generation sequencing (NGS) assay using a desktop sequencer to detect actionable mutations and rearrangements in cell-free plasma DNA (cfDNA). Experimental Design An NGS panel was developed targeting 11 driver oncogenes found in NSCLC. Targeted NGS was performed using a novel methodology that maximizes on-target reads, and minimizes artifact, and was validated on DNA dilutions derived from cell lines. Plasma NGS was then blindly performed on 48 patients with advanced, progressive NSCLC and a known tumor genotype, and explored in two patients with incomplete tumor genotyping. Results NGS could identify mutations present in DNA dilutions at ≥0.4% allelic frequency with 100% sensitivity/specificity. Plasma NGS detected a broad range of driver and resistance mutations, including ALK, ROS1, and RET rearrangements, HER2 insertions, and MET amplification, with 100% specificity. Sensitivity was 77% across 62 known driver and resistance mutations from the 48 cases; in 29 cases with common EGFR and KRAS mutations, sensitivity was similar to droplet digital PCR. In two cases with incomplete tumor genotyping, plasma NGS rapidly identified a novel EGFR exon 19 deletion and a missed case of MET amplification. Conclusion Blinded to tumor genotype, this plasma NGS approach detected a broad range of targetable genomic alterations in NSCLC with no false positives including complex mutations like rearrangements and unexpected resistance mutations such as EGFR C797S. Through use of widely available vacutainers and a desktop sequencing platform, this assay has the potential to be implemented broadly for patient care and translational research. PMID:26459174

  10. Sequence Alignment to Predict Across Species Susceptibility ...

    EPA Pesticide Factsheets

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev

  11. Mechanism of foreign DNA selection in a bacterial adaptive immune system

    PubMed Central

    Sashital, Dipali G.; Wiedenheft, Blake; Doudna, Jennifer A.

    2012-01-01

    Summary In bacterial and archaeal CRISPR immune pathways, DNA sequences from invading bacteriophage or plasmids are integrated into CRISPR loci within the host genome, conferring immunity against subsequent infections. The ribonucleoprotein complex Cascade utilizes RNAs generated from these loci to target complementary “non-self” DNA sequences for destruction, while avoiding binding to “self” sequences within the CRISPR locus. Here we show that CasA, the largest protein subunit of Cascade, is required for non-self target recognition and binding. Combining a 2.3 Å crystal structure of CasA with cryo-EM structures of Cascade, we have identified a loop that is required for viral defense. This loop contacts a conserved 3-base pair motif that is required for non-self target selection. Our data suggest a model in which the CasA loop scans DNA for this short motif prior to target destabilization and binding, maximizing the efficiency of DNA surveillance by Cascade. PMID:22521690

  12. Genome scale enzyme–metabolite and drug–target interaction predictions using the signature molecular descriptor

    DOE PAGES

    Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...

    2007-11-23

    Motivation: Identifying protein enzymatic or pharmacological activities are important areas of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.

  13. In silico study of breast cancer associated gene 3 using LION Target Engine and other tools.

    PubMed

    León, Darryl A; Cànaves, Jaume M

    2003-12-01

    Sequence analysis of individual targets is an important step in annotation and validation. As a test case, we investigated human breast cancer associated gene 3 (BCA3) with LION Target Engine and with other bioinformatics tools. LION Target Engine confirmed that the BCA3 gene is located on 11p15.4 and that the two most likely splice variants (lacking exon 3 and exons 3 and 5, respectively) exist. Based on our manual curation of sequence data, it is proposed that an additional variant (missing only exon 5) published in a public sequence repository, is a prediction artifact. A significant number of new orthologs were also identified, and these were the basis for a high-quality protein secondary structure prediction. Moreover, our research confirmed several distinct functional domains as described in earlier reports. Sequence conservation from multiple sequence alignments, splice variant identification, secondary structure predictions, and predicted phosphorylation sites suggest that the removal of interaction sites through alternative splicing might play a modulatory role in BCA3. This in silico approach shows the depth and relevance of an analysis that can be accomplished by including a variety of publicly available tools with an integrated and customizable life science informatics platform.

  14. Next generation sequencing to identify novel genetic variants causative of autosomal dominant familial hypercholesterolemia associated with increased risk of coronary heart disease.

    PubMed

    Al-Allaf, Faisal A; Athar, Mohammad; Abduljaleel, Zainularifeen; Taher, Mohiuddin M; Khan, Wajahatullah; Ba-Hammam, Faisal A; Abalkhail, Hala; Alashwal, Abdullah

    2015-07-01

    Familial hypercholesterolemia (FH) is an autosomal dominant inherited disease characterized by elevated plasma low-density lipoprotein cholesterol (LDL-C). It is caused by variants in Ldlr, ApoB or Pcsk9, which result in high levels of LDL-C leading to early coronary heart disease. Whole-genome sequencing is not suitable for screening FH variants because of its high cost. Hence, in this study we performed targeted customized sequencing of 12 FH genes (Ldlr, ApoB, Pcsk9, Abca1, Apoa2, Apoc3, Apon2, Arh, Ldlrap1, Apoc2, ApoE, and Lpl) that have been implicated in the homozygous phenotype of a proband pedigree to identify candidate variants by NGS on the Ion Torrent PGM. Only three genes (Ldlr, ApoB, and Pcsk9) were found to be highly associated with FH based on the variant rate. The results showed that seven deleterious variants in the Ldlr, ApoB, and Pcsk9 genes were pathological and clinically significant based on SIFT and PolyPhen predictions. Targeted customized sequencing is an efficient technique for screening variants among targeted FH genes. Final validation of the seven deleterious variants by capillary sequencing yielded only one novel variant, located in exon 14 of the Ldlr gene (c.2026delG, p. Gly676fs). This was a novel heterozygous variant derived from a male in the proband. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Directed targeting of chromatin to the nuclear lamina is mediated by chromatin state and A-type lamins.

    PubMed

    Harr, Jennifer C; Luperchio, Teresa Romeo; Wong, Xianrong; Cohen, Erez; Wheelan, Sarah J; Reddy, Karen L

    2015-01-05

    Nuclear organization has been implicated in regulating gene activity. Recently, large developmentally regulated regions of the genome dynamically associated with the nuclear lamina have been identified. However, little is known about how these lamina-associated domains (LADs) are directed to the nuclear lamina. We use our tagged chromosomal insertion site system to identify small sequences from borders of fibroblast-specific variable LADs that are sufficient to target these ectopic sites to the nuclear periphery. We identify YY1 (Ying-Yang1) binding sites as enriched in relocating sequences. Knockdown of YY1 or lamin A/C, but not lamin A, led to a loss of lamina association. In addition, targeted recruitment of YY1 proteins facilitated ectopic LAD formation dependent on histone H3 lysine 27 trimethylation and histone H3 lysine 9 di- and trimethylation. Our results also reveal that endogenous loci appear to be dependent on lamin A/C, YY1, H3K27me3, and H3K9me2/3 for maintenance of lamina-proximal positioning. © 2015 Harr et al.

  16. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    PubMed Central

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580
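
    The ranking and false-discovery-rate step described in this record can be illustrated with a minimal sketch. The abstract does not say which FDR procedure the server uses; the Benjamini-Hochberg adjustment below is an assumption, and the function name and pair identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (microRNA id, mRNA id); identifiers are illustrative


def rank_with_fdr(p_values: Dict[Pair, float]) -> List[Tuple[Pair, float, float]]:
    """Rank pairs by P-value and attach Benjamini-Hochberg adjusted values.

    Assumption: BH is used only as a common FDR procedure; it is not
    confirmed to be the method behind the web server.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for i in range(m - 1, -1, -1):
        p = ranked[i][1]
        running_min = min(running_min, p * m / (i + 1))
        adjusted[i] = running_min
    return [(pair, p, q) for (pair, p), q in zip(ranked, adjusted)]


if __name__ == "__main__":
    demo = {("miR-a", "mRNA-1"): 0.01, ("miR-b", "mRNA-2"): 0.04,
            ("miR-c", "mRNA-3"): 0.03, ("miR-d", "mRNA-4"): 0.5}
    for pair, p, q in rank_with_fdr(demo):
        print(pair, p, round(q, 4))
```

    Pairs whose adjusted value falls below a chosen threshold (for example 0.05) would be reported as significant; the threshold itself is a user choice, not something stated in the record.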

  17. Many si/shRNAs can kill cancer cells by targeting multiple survival genes through an off-target mechanism

    PubMed Central

    van Dongen, Stijn; Haluck-Kangas, Ashley; Sarshad, Aishe A; Bartom, Elizabeth T; Kim, Kwang-Youn A; Scholtens, Denise M; Hafner, Markus; Zhao, Jonathan C; Murmann, Andrea E

    2017-01-01

    Over 80% of multiple-tested siRNAs and shRNAs targeting CD95 or CD95 ligand (CD95L) induce a form of cell death characterized by simultaneous activation of multiple cell death pathways preferentially killing transformed and cancer stem cells. We now show these si/shRNAs kill cancer cells through canonical RNAi by targeting the 3’UTR of critical survival genes in a unique form of off-target effect we call DISE (death induced by survival gene elimination). Drosha and Dicer-deficient cells, devoid of most miRNAs, are hypersensitive to DISE, suggesting cellular miRNAs protect cells from this form of cell death. By testing 4666 shRNAs derived from the CD95 and CD95L mRNA sequences and an unrelated control gene, Venus, we have identified many toxic sequences - most of them located in the open reading frame of CD95L. We propose that specific toxic RNAi-active sequences present in the genome can kill cancer cells. PMID:29063830

  18. The small RNA profile in latex from Hevea brasiliensis trees is affected by tapping panel dryness.

    PubMed

    Gébelin, Virginie; Leclercq, Julie; Kuswanhadi; Argout, Xavier; Chaidamsari, Tetty; Hu, Songnian; Tang, Chaorong; Sarah, Gautier; Yang, Meng; Montoro, Pascal

    2013-10-01

    Natural rubber is harvested by tapping Hevea brasiliensis (Willd. ex A. Juss.) Müll. Arg. Harvesting stress can lead to tapping panel dryness (TPD). MicroRNAs (miRNAs) are induced by abiotic stress and regulate gene expression by targeting the cleavage or translational inhibition of target messenger RNAs. This study set out to sequence miRNAs expressed in latex cells and to identify TPD-related putative targets. Deep sequencing of small RNAs was carried out on latex from trees affected by TPD using Solexa technology. The most abundant small RNA class size was 21 nucleotides for TPD trees compared with 24 nucleotides in healthy trees. By combining the LeARN pipeline, data from the Plant MicroRNA database and Hevea EST sequences, we identified 19 additional conserved and four putative species-specific miRNA families not found in previous studies on rubber. The relative transcript abundance of the Hbpre-MIR159b gene increased with TPD. This study revealed a small RNA-specific signature of TPD-affected trees. Both RNA degradation and a shift in miRNA biogenesis are suggested to explain the general decline in small RNAs and, particularly, in miRNAs.

  19. Bioinformatics prediction of siRNAs as potential antiviral agents against dengue viruses

    PubMed Central

    Villegas-Rosales, Paula M; Méndez-Tenorio, Alfonso; Ortega-Soto, Elizabeth; Barrón, Blanca L

    2012-01-01

    Dengue virus (DENV 1-4) represents the major emerging arthropod-borne viral infection in the world. Currently, there is neither an available vaccine nor a specific treatment. Hence, there is a need for antiviral drugs against these viral infections. We describe the prediction of short interfering RNAs (siRNAs) as potential therapeutic agents against the four DENV serotypes. Our strategy was to carry out a series of multiple alignments using the ClustalX program to find conserved sequences among the four DENV serotype genomes and obtain a consensus sequence for siRNA design. A highly conserved sequence among the four DENV serotypes, located in the coding sequence for the NS4B and NS5 proteins, was found. A total of 2,893 complete DENV genomes were downloaded from the NCBI, and after a clean-up procedure to identify identical sequences, 220 complete DENV genomes were left. They were edited to select the NS4B and NS5 sequences, which were aligned to obtain a consensus sequence. Three different servers were used for siRNA design, and the resulting siRNAs were aligned to identify the most prevalent sequences. Three siRNAs were chosen: one targeted the genome region that codes for the NS4B protein, and the other two targeted the region for the NS5 protein. The predicted secondary structure of DENV genomes was used to demonstrate that the siRNAs were able to target the viral genome, forming the double-stranded structures necessary to activate the RNA silencing machinery. PMID:22829722
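
    The consensus step mentioned in this record (aligning sequences and deriving a consensus) can be sketched as a column-wise majority vote. This is an illustration only: the study used ClustalX for alignment, and how its consensus was derived is not stated, so the majority rule, the function name, and the toy sequences are assumptions.

```python
from collections import Counter
from typing import List


def consensus(aligned: List[str], gap: str = "-") -> str:
    """Return the most common non-gap symbol in each alignment column.

    Assumes the input rows are already aligned (equal length), for example
    exported from a multiple-alignment program.
    """
    if not aligned:
        return ""
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != gap)
        # Columns made only of gaps stay as a gap in the consensus.
        out.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(out)


if __name__ == "__main__":
    print(consensus(["ACGT-", "ACGA-", "ACCT-"]))
```

    A real pipeline would also need a rule for tied columns and a conservation cutoff before choosing siRNA target windows; neither is specified in the record.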

  20. A screen of cell-surface molecules identifies leucine-rich repeat proteins as key mediators of synaptic target selection in the Drosophila neuromuscular system

    PubMed Central

    Kurusu, Mitsuhiko; Cording, Amy; Taniguchi, Misako; Menon, Kaushiki; Suzuki, Emiko; Zinn, Kai

    2008-01-01

    Summary In Drosophila embryos and larvae, a small number of identified motor neurons innervate body wall muscles in a highly stereotyped pattern. Although genetic screens have identified many proteins that are required for axon guidance and synaptogenesis in this system, little is known about the mechanisms by which muscle fibers are defined as targets for specific motor axons. To identify potential target labels, we screened 410 genes encoding cell-surface and secreted proteins, searching for those whose overexpression on all muscle fibers causes motor axons to make targeting errors. Thirty such genes were identified, and a number of these were members of a large gene family encoding proteins whose extracellular domains contain leucine-rich repeat (LRR) sequences, which are protein interaction modules. By manipulating gene expression in muscle 12, we showed that four LRR proteins participate in the selection of this muscle as the appropriate synaptic target for the RP5 motor neuron. PMID:18817735

  1. Deep sequencing of the LRRK2 gene in 14,002 individuals reveals evidence of purifying selection and independent origin of the p.Arg1628Pro mutation in Europe

    PubMed Central

    Rubio, Justin P.; Topp, Simon; Warren, Liling; St Jean, Pamela L.; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R.; Ehm, Margaret G.; Chissoe, Stephanie L.; Whittaker, John C.; Nelson, Matthew R.; Mooser, Vincent E.

    2012-01-01

    Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single nucleotide variants (SNVs), 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Amongst Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. PMID:22415848

  2. Deep sequencing of the LRRK2 gene in 14,002 individuals reveals evidence of purifying selection and independent origin of the p.Arg1628Pro mutation in Europe.

    PubMed

    Rubio, Justin P; Topp, Simon; Warren, Liling; St Jean, Pamela L; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R; Ehm, Margaret G; Chissoe, Stephanie L; Whittaker, John C; Nelson, Matthew R; Mooser, Vincent E

    2012-07-01

    Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single-nucleotide variants, 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Among Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. © 2012 Wiley Periodicals, Inc.

  3. Apple miRNAs and tasiRNAs with novel regulatory networks

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) and their regulatory functions have been extensively characterized in model species but whether apple has evolved similar or unique regulatory features remains unknown. Results We performed deep small RNA-seq and identified 23 conserved, 10 less-conserved and 42 apple-specific miRNAs or families with distinct expression patterns. The identified miRNAs target 118 genes representing a wide range of enzymatic and regulatory activities. Apple also conserves two TAS gene families with similar but unique trans-acting small interfering RNA (tasiRNA) biogenesis profiles and target specificities. Importantly, we found that miR159, miR828 and miR858 can collectively target up to 81 MYB genes potentially involved in diverse aspects of plant growth and development. These miRNA target sites are differentially conserved among MYBs, which is largely influenced by the location and conservation of the encoded amino acid residues in MYB factors. Finally, we found that 10 of the 19 miR828-targeted MYBs undergo small interfering RNA (siRNA) biogenesis at the 3' cleaved, highly divergent transcript regions, generating over 100 sequence-distinct siRNAs that potentially target over 70 diverse genes as confirmed by degradome analysis. Conclusions Our work identified and characterized apple miRNAs, their expression patterns, targets and regulatory functions. We also discovered that three miRNAs and the ensuing siRNAs exploit both conserved and divergent sequence features of MYB genes to initiate distinct regulatory networks targeting a multitude of genes inside and outside the MYB family. PMID:22704043

  4. Whole-exome sequencing and targeted gene sequencing provide insights into the role of PALB2 as a male breast cancer susceptibility gene.

    PubMed

    Silvestri, Valentina; Zelli, Veronica; Valentini, Virginia; Rizzolo, Piera; Navazio, Anna Sara; Coppa, Anna; Agata, Simona; Oliani, Cristina; Barana, Daniela; Castrignanò, Tiziana; Viel, Alessandra; Russo, Antonio; Tibiletti, Maria Grazia; Zanna, Ines; Masala, Giovanna; Cortesi, Laura; Manoukian, Siranoush; Azzollini, Jacopo; Peissel, Bernard; Bonanni, Bernardo; Peterlongo, Paolo; Radice, Paolo; Palli, Domenico; Giannini, Giuseppe; Chillemi, Giovanni; Montagna, Marco; Ottini, Laura

    2017-01-01

    Male breast cancer (MBC) is a rare disease whose etiology appears to be largely associated with genetic factors. BRCA1 and BRCA2 mutations account for about 10% of all MBC cases. Thus, a fraction of MBC cases are expected to be due to genetic factors not yet identified. To further explain the genetic susceptibility for MBC, whole-exome sequencing (WES) and targeted gene sequencing were applied to high-risk, BRCA1/2 mutation-negative MBC cases. Germ-line DNA of 1 male and 2 female BRCA1/2 mutation-negative breast cancer (BC) cases from a pedigree showing a first-degree family history of MBC was analyzed with WES. Targeted gene sequencing for the validation of WES results was performed for 48 high-risk, BRCA1/2 mutation-negative MBC cases from an Italian multicenter study of MBC. A case-control series of 433 BRCA1/2 mutation-negative MBC and female breast cancer (FBC) cases and 849 male and female controls was included in the study. WES in the family identified the partner and localizer of BRCA2 (PALB2) c.419delA truncating mutation carried by the proband, her father, and her paternal uncle (all affected with BC) and the N-acetyltransferase 1 (NAT1) c.97C>T nonsense mutation carried by the proband's maternal aunt. Targeted PALB2 sequencing detected the c.1984A>T nonsense mutation in 1 of the 48 BRCA1/2 mutation-negative MBC cases. NAT1 c.97C>T was not found in the case-control series. These results add strength to the evidence showing that PALB2 is involved in BC risk for both sexes and indicate that consideration should be given to clinical testing of PALB2 for BRCA1/2 mutation-negative families with multiple MBC and FBC cases. Cancer 2017;123:210-218. © 2016 American Cancer Society.

  5. Evaluating the Detection of Hydrocarbon-Degrading Bacteria in 16S rRNA Gene Sequencing Surveys

    PubMed Central

    Berry, David; Gutierrez, Tony

    2017-01-01

    Hydrocarbonoclastic bacteria (HCB) play a key role in the biodegradation of oil hydrocarbons in marine and other environments. A small number of taxa have been identified as obligate HCB, notably the Gammaproteobacterial genera Alcanivorax, Cycloclasticus, Marinobacter, Neptunomonas, Oleiphilus, Oleispira, and Thalassolituus, as well as the Alphaproteobacterial genus Thalassospira. Detection of HCB in amplicon-based sequencing surveys relies on high coverage by PCR primers and accurate taxonomic classification. In this study, we performed a phylogenetic analysis to identify 16S rRNA gene sequence regions that represent the breadth of sequence diversity within these taxa. Using validated sequences, we evaluated 449 universal 16S rRNA gene-targeted bacterial PCR primer pairs for their coverage of these taxa. The results of this analysis provide a practical framework for selection of suitable primer sets for optimal detection of HCB in sequencing surveys. PMID:28567035
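
    Primer coverage, as evaluated in this record, can be illustrated by counting how many reference sequences contain a site matching a (possibly degenerate) primer. The sketch below is not the authors' method: it checks a single primer on one strand, assumes uppercase DNA input, and uses made-up sequences; a real evaluation of a primer pair would also test the reverse primer against the reverse complement.

```python
from typing import Iterable

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_hits(primer: str, target: str, max_mismatches: int = 0) -> bool:
    """True if the primer matches some window of the target sequence."""
    k = len(primer)
    for start in range(len(target) - k + 1):
        window = target[start:start + k]
        mismatches = sum(base not in IUPAC[code]
                         for code, base in zip(primer, window))
        if mismatches <= max_mismatches:
            return True
    return False


def coverage(primer: str, targets: Iterable[str], max_mismatches: int = 0) -> float:
    """Fraction of target sequences matched by the primer."""
    targets = list(targets)
    if not targets:
        return 0.0
    hits = sum(primer_hits(primer, t, max_mismatches) for t in targets)
    return hits / len(targets)


if __name__ == "__main__":
    toy_targets = ["TTACGGATT", "TTACGAATT", "TTTTTTTTT"]  # illustrative only
    print(coverage("ACGRA", toy_targets))
```

    Allowing one or two mismatches (the `max_mismatches` parameter, a hypothetical knob) is a common way to approximate tolerant annealing, but the appropriate setting depends on the primer and PCR conditions.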

  6. Evaluating the Detection of Hydrocarbon-Degrading Bacteria in 16S rRNA Gene Sequencing Surveys.

    PubMed

    Berry, David; Gutierrez, Tony

    2017-01-01

    Hydrocarbonoclastic bacteria (HCB) play a key role in the biodegradation of oil hydrocarbons in marine and other environments. A small number of taxa have been identified as obligate HCB, notably the Gammaproteobacterial genera Alcanivorax, Cycloclasticus, Marinobacter, Neptunomonas, Oleiphilus, Oleispira, and Thalassolituus, as well as the Alphaproteobacterial genus Thalassospira. Detection of HCB in amplicon-based sequencing surveys relies on high coverage by PCR primers and accurate taxonomic classification. In this study, we performed a phylogenetic analysis to identify 16S rRNA gene sequence regions that represent the breadth of sequence diversity within these taxa. Using validated sequences, we evaluated 449 universal 16S rRNA gene-targeted bacterial PCR primer pairs for their coverage of these taxa. The results of this analysis provide a practical framework for selection of suitable primer sets for optimal detection of HCB in sequencing surveys.

  7. Heterogeneous breakpoints in patients with acute lymphoblastic leukemia and the dic(9;20)(p11~13;q11) show recurrent involvement of genes at 20q11.21

    PubMed Central

    An, Qian; Wright, Sarah L.; Moorman, Anthony V.; Parker, Helen; Griffiths, Mike; Ross, Fiona M.; Davies, Teresa; Harrison, Christine J.; Strefford, Jon C.

    2009-01-01

    The dic(9;20)(p11~13;q11) is a recurrent chromosomal abnormality in patients with acute lymphoblastic leukemia. Although it results in loss of material from 9p and 20q, the molecular targets on both chromosomes have not been fully elucidated. From an initial cohort of 58 acute lymphoblastic leukemia patients with this translocation, breakpoint mapping with fluorescence in situ hybridization on 26 of them revealed breakpoint heterogeneity of both chromosomes. PAX5 has been proposed to be the target gene on 9p, while for 20q, FISH analysis implicated the involvement of the ASXL1 gene, either by a breakpoint within (n=4) or centromeric (deletion, n=12) of the gene. Molecular copy-number counting, long-distance inverse PCR and direct sequence analysis identified six dic(9;20) breakpoint sequences. In addition to the three previously reported (PAX5-ASXL1, PAX5-C20ORF112 and PAX5-KIF3B), we identified three new ones in this study: sequences 3’ of PAX5 disrupting ASXL1, and ZCCHC7 disrupted by sequences 3’ of FRG1B and LOC1499503. This study provides insight into the breakpoint complexity underlying dicentric chromosomal formation in acute lymphoblastic leukemia and highlights putative target gene loci. PMID:19586940

  8. Identification of Five Novel Variants in Chinese Oculocutaneous Albinism by Targeted Next-Generation Sequencing.

    PubMed

    Qiu, Biyuan; Ma, Tao; Peng, Chunyan; Zheng, Xiaoqin; Yang, Jiyun

    2018-04-01

    The diagnosis of oculocutaneous albinism (OCA) is established using clinical signs and symptoms. OCA is, however, a highly genetically heterogeneous disease with mutations identified in at least nineteen unique genes, many of which produce overlapping phenotypic traits. Thus, differentiating genetic OCA subtypes for diagnoses and genetic counseling is challenging, based on clinical presentation alone, and would benefit from a comprehensive molecular diagnostic. We aimed to develop and validate a more comprehensive, targeted, next-generation-sequencing-based diagnostic for the identification of OCA-causing variants. The genomic DNA samples from 28 OCA probands were analyzed by targeted next-generation sequencing (NGS), and the candidate variants were confirmed through Sanger sequencing. We observed mutations in the TYR, OCA2, and SLC45A2 genes in 25/28 (89%) patients with OCA. We identified 38 pathogenic variants among these three genes, including 5 novel variants: c.1970G>T (p.Gly657Val), c.1669A>C (p.Thr557Pro), c.2339-2A>C, and c.1349C>G (p.Thr450Arg) in OCA2; c.459_470delTTTTGCTGCCGA (p.Ala155_Phe158del) in SLC45A2. Our findings expand the mutational spectrum of OCA in the Chinese population, and the assay we developed should be broadly useful as a molecular diagnostic, and as an aid for genetic counseling for OCA patients.

  9. Heterogeneous breakpoints in patients with acute lymphoblastic leukemia and the dic(9;20)(p11-13;q11) show recurrent involvement of genes at 20q11.21.

    PubMed

    An, Qian; Wright, Sarah L; Moorman, Anthony V; Parker, Helen; Griffiths, Mike; Ross, Fiona M; Davies, Teresa; Harrison, Christine J; Strefford, Jon C

    2009-08-01

    The dic(9;20)(p11-13;q11) is a recurrent chromosomal abnormality in patients with acute lymphoblastic leukemia. Although it results in loss of material from 9p and 20q, the molecular targets on both chromosomes have not been fully elucidated. From an initial cohort of 58 acute lymphoblastic leukemia patients with this translocation, breakpoint mapping with fluorescence in situ hybridization on 26 of them revealed breakpoint heterogeneity of both chromosomes. PAX5 has been proposed to be the target gene on 9p, while for 20q, FISH analysis implicated the involvement of the ASXL1 gene, either by a breakpoint within (n=4) or centromeric (deletion, n=12) of the gene. Molecular copy-number counting, long-distance inverse PCR and direct sequence analysis identified six dic(9;20) breakpoint sequences. In addition to the three previously reported (PAX5-ASXL1, PAX5-C20ORF112 and PAX5-KIF3B), we identified three new ones in this study: sequences 3' of PAX5 disrupting ASXL1, and ZCCHC7 disrupted by sequences 3' of FRG1B and LOC1499503. This study provides insight into the breakpoint complexity underlying dicentric chromosomal formation in acute lymphoblastic leukemia and highlights putative target gene loci.

  10. StarScan: a web server for scanning small RNA targets from degradome sequencing data.

    PubMed

    Liu, Shun; Li, Jun-Hao; Wu, Jie; Zhou, Ke-Ren; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2015-07-01

    Endogenous small non-coding RNAs (sRNAs), including microRNAs, PIWI-interacting RNAs and small interfering RNAs, play important gene regulatory roles in animals and plants by pairing to the protein-coding and non-coding transcripts. However, computationally assigning these various sRNAs to their regulatory target genes remains technically challenging. Recently, a high-throughput degradome sequencing method was applied to identify biologically relevant sRNA cleavage sites. In this study, an integrated web-based tool, StarScan (sRNA target Scan), was developed for scanning sRNA targets using degradome sequencing data from 20 species. Given a sRNA sequence from plants or animals, our web server performs an ultrafast and exhaustive search for potential sRNA-target interactions in annotated and unannotated genomic regions. The interactions between small RNAs and target transcripts were further evaluated using a novel tool, alignScore. A novel tool, degradomeBinomTest, was developed to quantify the abundance of degradome fragments located at the 9-11th nucleotide from the sRNA 5' end. This is the first web server for discovering potential sRNA-mediated RNA cleavage events in plants and animals, which affords mechanistic insights into the regulatory roles of sRNAs. The StarScan web server is available at http://mirlab.sysu.edu.cn/starscan/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
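
    The idea of testing whether degradome reads pile up at the expected cleavage window (nucleotides 9-11 from the sRNA 5' end) can be sketched with a one-sided binomial test. This is not the degradomeBinomTest implementation, whose details are not given in the record; the uniform null model, the default site length of 21 nt, and both function names are assumptions made for illustration.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def cleavage_enrichment(reads_in_window: int, total_reads: int,
                        window: int = 3, site_length: int = 21) -> float:
    """P-value for the window read count under a uniform null.

    Assumed null: each read 5' end is equally likely to fall on any of
    `site_length` positions, so the window probability is window/site_length.
    """
    return binom_upper_tail(reads_in_window, total_reads, window / site_length)


if __name__ == "__main__":
    # Hypothetical counts: 18 of 20 read ends fall within positions 9-11.
    print(cleavage_enrichment(18, 20))
```

    A small P-value would suggest that fragment ends cluster at the window more than the uniform model predicts; a production tool would likely use a background estimated from the data rather than this simple null.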

  11. Aquaporin 2 of Rhipicephalus (Boophilus) microplus as a potential target to control ticks and tick-borne parasites

    USDA-ARS's Scientific Manuscript database

    In a collaboration with Washington State University and ARS-Pullman, WA researchers, we identified and sequenced a 1,059 base pair Rhipicephalus microplus transcript that contained the coding region for a water channel protein, Aquaporin 2 (RmAQP2). The clone sequencing resulted in the production of...

  12. Transcriptome-wide identification of Rauvolfia serpentina microRNAs and prediction of their potential targets.

    PubMed

    Prakash, Pravin; Rajakani, Raja; Gupta, Vikrant

    2016-04-01

    MicroRNAs (miRNAs) are small non-coding RNAs of ~19-24 nucleotides (nt) in length and are considered potent regulators of gene expression at transcriptional and post-transcriptional levels. Here we report the identification and characterization of 15 conserved miRNAs belonging to 13 families from Rauvolfia serpentina through in silico analysis of the available nucleotide dataset. The identified mature R. serpentina miRNAs (rse-miRNAs) ranged between 20 and 22 nt in length, and the average minimal folding free energy index (MFEI) value of rse-miRNA precursor sequences was found to be -0.815 kcal/mol. Using the identified rse-miRNAs as query, their potential targets were predicted in R. serpentina and other plant species. Gene Ontology (GO) annotation showed that predicted targets of rse-miRNAs include transcription factors as well as genes involved in diverse biological processes such as primary and secondary metabolism, stress response, disease resistance, growth, and development. A few rse-miRNAs were predicted to target genes of pharmaceutically important secondary metabolic pathways such as alkaloid and anthocyanin biosynthesis. Phylogenetic analysis showed the evolutionary relationship of rse-miRNAs and their precursor sequences to homologous pre-miRNA sequences from other plant species. The findings of the present study, besides giving first-hand information about R. serpentina miRNAs and their targets, also contribute towards a better understanding of miRNA-mediated gene regulatory processes in plants. Copyright © 2015 Elsevier Ltd. All rights reserved.
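    The abstract reports an average MFEI but does not state the formula used. A commonly used definition, assumed here, first normalizes the minimal folding free energy (MFE) to sequence length (the adjusted MFE, AMFE = MFE / length x 100) and then divides by the (G+C) percentage. The sketch below follows that definition; the function names and the example sequence are hypothetical, and the MFE value would have to come from an RNA folding program.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mfei(seq: str, mfe_kcal_per_mol: float) -> float:
    """Minimal folding free energy index under the assumed definition.

    AMFE = MFE / length * 100, MFEI = AMFE / (G+C)%.
    The sign of the MFE is kept, so a stable precursor gives a negative index.
    """
    gc = gc_percent(seq)
    if gc == 0:
        raise ValueError("MFEI is undefined for a sequence with no G or C")
    amfe = mfe_kcal_per_mol / len(seq) * 100.0
    return amfe / gc


if __name__ == "__main__":
    # Made-up 100-nt precursor with 50% GC and an MFE of -40 kcal/mol.
    precursor = "GC" * 25 + "AU" * 25
    print(round(mfei(precursor, -40.0), 3))
```

    Under this definition the index is a ratio, so values from different studies are only comparable if they keep the same sign convention for the MFE.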

  13. Development of a polymerase chain reaction assay for the specific identification of Burkholderia mallei and differentiation from Burkholderia pseudomallei and other closely related Burkholderiaceae.

    PubMed

    Ulrich, Ricky L; Ulrich, Melanie P; Schell, Mark A; Kim, H Stanley; DeShazer, David

    2006-05-01

    Burkholderia mallei and Burkholderia pseudomallei, the etiologic agents responsible for glanders and melioidosis, respectively, are genetically and phenotypically similar and are category B biothreat agents. We used an in silico approach to compare the B. mallei ATCC 23344 and B. pseudomallei K96243 genomes to identify nucleotide sequences unique to B. mallei. Five distinct B. mallei DNA sequences and/or genes were identified and evaluated for polymerase chain reaction (PCR) assay development. Genomic DNAs from a collection of 31 B. mallei and 34 B. pseudomallei isolates, obtained from various geographic, clinical, and environmental sources over a 70-year period, were tested with PCR primers targeted for each of the B. mallei ATCC 23344-specific nucleotide sequences. Of the 5 chromosomal targets analyzed, only PCR primers designed to bimA(Bm) were specific for B. mallei. These primers were used to develop a rapid PCR assay for the definitive identification of B. mallei and differentiation from all other bacteria.

  14. Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study.

    PubMed

    Dewey, Frederick E; Murray, Michael F; Overton, John D; Habegger, Lukas; Leader, Joseph B; Fetterolf, Samantha N; O'Dushlaine, Colm; Van Hout, Cristopher V; Staples, Jeffrey; Gonzaga-Jauregui, Claudia; Metpally, Raghu; Pendergrass, Sarah A; Giovanni, Monica A; Kirchner, H Lester; Balasubramanian, Suganthi; Abul-Husn, Noura S; Hartzel, Dustin N; Lavage, Daniel R; Kost, Korey A; Packer, Jonathan S; Lopez, Alexander E; Penn, John; Mukherjee, Semanti; Gosalia, Nehal; Kanagaraj, Manoj; Li, Alexander H; Mitnaul, Lyndon J; Adams, Lance J; Person, Thomas N; Praveen, Kavita; Marcketta, Anthony; Lebo, Matthew S; Austin-Tse, Christina A; Mason-Suares, Heather M; Bruse, Shannon; Mellis, Scott; Phillips, Robert; Stahl, Neil; Murphy, Andrew; Economides, Aris; Skelding, Kimberly A; Still, Christopher D; Elmore, James R; Borecki, Ingrid B; Yancopoulos, George D; Davis, F Daniel; Faucett, William A; Gottesman, Omri; Ritchie, Marylyn D; Shuldiner, Alan R; Reid, Jeffrey G; Ledbetter, David H; Baras, Aris; Carey, David J

    2016-12-23

    The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery. Copyright © 2016, American Association for the Advancement of Science.

  15. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute

    PubMed Central

    Islam, Md. Tariqul; Ferdous, Ahlan Sabah; Najnin, Rifat Ara; Sarker, Suprovath Kumar; Khan, Haseena

    2015-01-01

    MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families, and 11 targets were predicted for 4 of the 17 identified novel microRNAs. To better understand the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for 20 randomly selected known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute, which will ultimately pave the way for understanding their role in this crop and other crops. PMID:25861616

  16. Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

    PubMed

    Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

    2016-04-01

    To identify drought stress-responsive conserved microRNAs (miRNAs) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples treated with and without drought stress. A total of 505 conserved miRNAs corresponding to 215 families were identified. Of these, 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting that the sequencing results were accurate. Potential target genes of the drought-responsive miRNAs were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were identified for the first time as being involved in the drought stress response in plants. These results present a comprehensive view of the conserved miRNAs and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basic studies.

  17. Probable Diagnosis of a Patient with Niemann-Pick Disease Type C: Managing Pitfalls of Exome Sequencing.

    PubMed

    Zeiger, William A; Jamal, Nasheed I; Scheuner, Maren T; Pittman, Patricia; Raymond, Kimiyo M; Morra, Massimo; Mishra, Shri K

    2018-02-17

    Here, we present a case of a 31-year-old man with progressive cognitive decline, ataxia, and dystonia. Extensive laboratory, radiographic, and targeted genetic studies over the course of several years failed to yield a diagnosis. Initial whole exome sequencing through a commercial laboratory identified several variants of uncertain significance; however, follow-up clinical examination and testing ruled each of these out. Eventually, repeat whole exome sequencing identified a known pathogenic intronic variant in the NPC1 gene (NM_000271.4, c.1554-1009G>A) and an additional heterozygous exonic variant of uncertain significance in the NPC1 gene (NM_000271.4, c.2524T>C). Follow-up biochemical testing was consistent with a diagnosis of probable Niemann-Pick disease Type C (NP-C). This case illustrates the potential of whole exome sequencing for diagnosing rare complex neurologic diseases. It also identifies several potential common pitfalls that must be navigated by clinicians when interpreting commercial whole exome sequencing results.

  18. Geoseq: a tool for dissecting deep-sequencing datasets.

    PubMed

    Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

    2010-10-12

    Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), the Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to a) identify differential isoform expression in mRNA-seq datasets, b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
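    The tile-based analysis described above can be sketched in a few lines. This is only an illustration of the idea of cutting a reference into tiles and counting how often each tile occurs in a read library; it swaps the suffix arrays mentioned in the abstract for a plain in-memory substring counter, and the tile length, step, and function names are assumptions rather than Geoseq's actual parameters.

```python
from collections import Counter


def tiles(reference: str, tile_len: int, step: int = 1):
    """Return (start, tile) pairs of fixed-length tiles along the reference."""
    return [(i, reference[i:i + tile_len])
            for i in range(0, len(reference) - tile_len + 1, step)]


def tile_coverage(reference: str, reads, tile_len: int, step: int = 1):
    """Count exact occurrences of each reference tile across all reads.

    A dictionary of read substrings stands in for the suffix array here;
    it is simpler but would not scale to a real sequencing library.
    """
    substring_counts = Counter()
    for read in reads:
        for j in range(len(read) - tile_len + 1):
            substring_counts[read[j:j + tile_len]] += 1
    return [(start, substring_counts[tile])
            for start, tile in tiles(reference, tile_len, step)]


if __name__ == "__main__":
    # Toy reference and reads, made up for the example.
    for start, count in tile_coverage("ACGTACGT", ["ACGTA", "GTACG", "TTTTT"], 4):
        print(start, count)
```

    Plotting the counts against the tile start positions would give a coverage profile along the reference, which is the kind of output the abstract describes.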

  19. Circular RNA expression in basal cell carcinoma.

    PubMed

    Sand, Michael; Bechara, Falk G; Sand, Daniel; Gambichler, Thilo; Hahn, Stephan A; Bromba, Michael; Stockfleth, Eggert; Hessam, Schapoor

    2016-05-01

    Circular RNAs (circRNAs) are non-protein-coding RNAs consisting of a circular loop with multiple miRNA binding sites, called miRNA response elements (MREs), that function as miRNA sponges. This study was performed to identify differentially expressed circRNAs and their MREs in basal cell carcinoma (BCC). Microarray circRNA expression profiles were acquired from BCC and control samples, followed by qRT-PCR validation. Bioinformatic target prediction revealed multiple MREs. Sequence analysis was performed concerning MRE interaction potential with the BCC miRNome. We identified 23 upregulated and 48 downregulated circRNAs with 354 miRNA response elements capable of sequestering miRNA target sequences of the BCC miRNome. The present study describes a variety of circRNAs that are potentially involved in the molecular pathogenesis of BCC.

  20. GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.

    PubMed

    Zhu, Lihua Julie; Lawrence, Michael; Gupta, Ankit; Pagès, Hervé; Kucukural, Alper; Garber, Manuel; Wolfe, Scot A

    2017-05-15

    Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity. The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html.

  1. Targeted next generation sequencing of the entire vitamin D receptor gene reveals polymorphisms correlated with vitamin D deficiency among older Filipino women with and without fragility fracture.

    PubMed

    Zumaraga, Mark Pretzel; Medina, Paul Julius; Recto, Juan Miguel; Abrahan, Lauro; Azurin, Edelyn; Tanchoco, Celeste C; Jimeno, Cecilia A; Palmes-Saloma, Cynthia

    2017-03-01

    This study aimed to discover genetic variants in the entire 101-kb vitamin D receptor (VDR) gene associated with vitamin D deficiency in a group of postmenopausal Filipino women using a targeted next generation sequencing (TNGS) approach in a case-control study design. A total of 50 women with and without osteoporotic fracture seen at the Philippine Orthopedic Center were included. Blood samples were collected for determination of serum vitamin D, calcium, phosphorus, glucose, blood urea nitrogen, creatinine, aspartate aminotransferase, alanine aminotransferase and as the primary source for targeted VDR gene sequencing using the Ion Torrent Personal Genome Machine. The variant calling was based on the GATK best practice workflow and annotated using the ANNOVAR tool. A total of 1496 unique variants in the whole 101-kb VDR gene were identified. Novel sequence variations not registered in the dbSNP database were found among cases and controls at a rate of 23.1% and 16.6% of total discovered variants, respectively. One disease-associated enhancer showed statistically significant association with low serum 25-hydroxy vitamin D levels (Pearson chi-square P-value=0.009). The transcription factor binding site prediction program PROMO predicted the disruption of three transcription factor binding sites in this enhancer region. These findings show the power of TNGS in identifying sequence variations in a very large gene, and the surprising results obtained in this study greatly expand the catalog of known VDR sequence variants that may represent an important clue in the emergence of vitamin D deficiency. Such information will also provide the additional guidance necessary toward personalized nutritional advice to reach sufficient vitamin D status. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Recent research on the high-probability instructional sequence: A brief review.

    PubMed

    Lipschultz, Joshua; Wilder, David A

    2017-04-01

    The high-probability (high-p) instructional sequence consists of the delivery of a series of high-probability instructions immediately before delivery of a low-probability or target instruction. It is commonly used to increase compliance in a variety of populations. Recent research has described variations of the high-p instructional sequence and examined the conditions under which the sequence is most effective. This manuscript reviews the most recent research on the sequence and identifies directions for future research. Recommendations for practitioners regarding the use of the high-p instructional sequence are also provided. © 2017 Society for the Experimental Analysis of Behavior.

  3. A dual selection based, targeted gene replacement tool for Magnaporthe grisea and Fusarium oxysporum.

    PubMed

    Khang, Chang Hyun; Park, Sook-Young; Lee, Yong-Hwan; Kang, Seogchan

    2005-06-01

    Rapid progress in fungal genome sequencing presents many new opportunities for functional genomic analysis of fungal biology through the systematic mutagenesis of the genes identified through sequencing. However, the lack of efficient tools for targeted gene replacement is a limiting factor for fungal functional genomics, as it often necessitates the screening of a large number of transformants to identify the desired mutant. We developed an efficient method of gene replacement and evaluated factors affecting the efficiency of this method using two plant pathogenic fungi, Magnaporthe grisea and Fusarium oxysporum. This method is based on Agrobacterium tumefaciens-mediated transformation with a mutant allele of the target gene flanked by the herpes simplex virus thymidine kinase (HSVtk) gene as a conditional negative selection marker against ectopic transformants. The HSVtk gene product converts 5-fluoro-2'-deoxyuridine to a compound toxic to diverse fungi. Because ectopic transformants express HSVtk, while gene replacement mutants lack HSVtk, growing transformants on a medium amended with 5-fluoro-2'-deoxyuridine facilitates the identification of targeted mutants by counter-selecting against ectopic transformants. In addition to M. grisea and F. oxysporum, the method and associated vectors are likely to be applicable to manipulating genes in a broad spectrum of fungi, thus potentially serving as an efficient, universal functional genomic tool for harnessing the growing body of fungal genome sequence data to study fungal biology.

  4. Discovery and characterization of 3000+ main-sequence binaries from APOGEE spectra

    NASA Astrophysics Data System (ADS)

    El-Badry, Kareem; Ting, Yuan-Sen; Rix, Hans-Walter; Quataert, Eliot; Weisz, Daniel R.; Cargile, Phillip; Conroy, Charlie; Hogg, David W.; Bergemann, Maria; Liu, Chao

    2018-05-01

    We develop a data-driven spectral model for identifying and characterizing spatially unresolved multiple-star systems and apply it to APOGEE DR13 spectra of main-sequence stars. Binaries and triples are identified as targets whose spectra can be significantly better fit by a superposition of two or three model spectra, drawn from the same isochrone, than any single-star model. From an initial sample of ~20 000 main-sequence targets, we identify ~2500 binaries in which both the primary and secondary stars contribute detectably to the spectrum, simultaneously fitting for the velocities and stellar parameters of both components. We additionally identify and fit ~200 triple systems, as well as ~700 velocity-variable systems in which the secondary does not contribute detectably to the spectrum. Our model simplifies the process of simultaneously fitting single- or multi-epoch spectra with composite models and does not depend on a velocity offset between the two components of a binary, making it sensitive to traditionally undetectable systems with periods of hundreds or thousands of years. In agreement with conventional expectations, almost all the spectrally identified binaries with measured parallaxes fall above the main sequence in the colour-magnitude diagram. We find excellent agreement between spectrally and dynamically inferred mass ratios for the ~600 binaries in which a dynamical mass ratio can be measured from multi-epoch radial velocities. We obtain full orbital solutions for 64 systems, including 14 close binaries within hierarchical triples. We make available catalogues of stellar parameters, abundances, mass ratios, and orbital parameters.

  5. Quantitative molecular diagnostic assays of grain washes for Claviceps purpurea are correlated with visual determinations of ergot contamination.

    PubMed

    Comte, Alexia; Gräfenhan, Tom; Links, Matthew G; Hemmingsen, Sean M; Dumonceaux, Tim J

    2017-01-01

    We examined the epiphytic microbiome of cereal grain using the universal barcode chaperonin-60 (cpn60). Microbial community profiling of seed washes containing DNA extracts prepared from field-grown cereal grain detected sequences from a fungus identified only to Class Sordariomycetes. To identify the fungal sequence and to improve the reference database, we determined cpn60 sequences from field-collected and reference strains of the ergot fungus, Claviceps purpurea. These data allowed us to identify this fungal sequence as deriving from C. purpurea, and suggested that C. purpurea DNA is readily detectable on agricultural commodities, including those for which ergot was not identified as a grading factor. To get a sense of the prevalence and level of C. purpurea DNA in cereal grains, we developed a quantitative PCR assay based on the fungal internal transcribed spacer (ITS) and applied it to 137 samples from the 2014 crop year. The amount of Claviceps DNA quantified correlated strongly with the proportion of ergot sclerotia identified in each grain lot, although there was evidence that non-target organisms were responsible for some false positives with the ITS-based assay. We therefore developed a cpn60-targeted loop-mediated isothermal amplification assay and applied it to the same grain wash samples. The time to positive displayed a significant, inverse correlation to ergot levels determined by visual ratings. These results indicate that both laboratory-based and field-adaptable molecular diagnostic assays can be used to detect and quantify pathogen load in bulk commodities using cereal grain washes.

  6. Quantitative molecular diagnostic assays of grain washes for Claviceps purpurea are correlated with visual determinations of ergot contamination

    PubMed Central

    Comte, Alexia; Gräfenhan, Tom; Links, Matthew G.; Hemmingsen, Sean M.

    2017-01-01

    We examined the epiphytic microbiome of cereal grain using the universal barcode chaperonin-60 (cpn60). Microbial community profiling of seed washes containing DNA extracts prepared from field-grown cereal grain detected sequences from a fungus identified only to Class Sordariomycetes. To identify the fungal sequence and to improve the reference database, we determined cpn60 sequences from field-collected and reference strains of the ergot fungus, Claviceps purpurea. These data allowed us to identify this fungal sequence as deriving from C. purpurea, and suggested that C. purpurea DNA is readily detectable on agricultural commodities, including those for which ergot was not identified as a grading factor. To get a sense of the prevalence and level of C. purpurea DNA in cereal grains, we developed a quantitative PCR assay based on the fungal internal transcribed spacer (ITS) and applied it to 137 samples from the 2014 crop year. The amount of Claviceps DNA quantified correlated strongly with the proportion of ergot sclerotia identified in each grain lot, although there was evidence that non-target organisms were responsible for some false positives with the ITS-based assay. We therefore developed a cpn60-targeted loop-mediated isothermal amplification assay and applied it to the same grain wash samples. The time to positive displayed a significant, inverse correlation to ergot levels determined by visual ratings. These results indicate that both laboratory-based and field-adaptable molecular diagnostic assays can be used to detect and quantify pathogen load in bulk commodities using cereal grain washes. PMID:28257512

  7. Pyrosequencing analysis of the gyrB gene to differentiate bacteria responsible for diarrheal diseases.

    PubMed

    Hou, X-L; Cao, Q-Y; Jia, H-Y; Chen, Z

    2008-07-01

    Pathogens causing acute diarrhea include a large variety of species from Enterobacteriaceae and Vibrionaceae. A method based on pyrosequencing was used here to differentiate bacteria commonly associated with diarrhea in China; the method is targeted to a partial amplicon of the gyrB gene, which encodes the B subunit of DNA gyrase. Twenty-eight specific polymorphic positions were identified from sequence alignment of a large sequence dataset and targeted using 17 sequencing primers. Of 95 isolates tested, belonging to 13 species within 7 genera, most could be identified to the species level; O157 type could be differentiated from other E. coli types; Salmonella enterica subsp. enterica could be identified at the serotype level; the genus Shigella, except for S. boydii and S. dysenteriae, could also be identified. All these isolates were also subjected to conventional sequencing of a relatively long (approximately 1.2 kb) region of gyrB DNA; these results confirmed those with pyrosequencing. Twenty-two fecal samples were surveyed, the results of which were concordant with culture-based bacterial identification, and the pathogen detection limit with simulated stool specimens was 10^4 CFU/ml. DNA from different pathogens was also mixed to simulate a case of multibacterial infection, and the generated signals correlated well with the mix ratio. In summary, the gyrB-based pyrosequencing approach proved to have significant reliability and discriminatory power for enteropathogenic bacterial identification and provided a fast and effective method for clinical diagnosis.

  8. Uncovering leaf rust responsive miRNAs in wheat (Triticum aestivum L.) using high-throughput sequencing and prediction of their targets through degradome analysis.

    PubMed

    Kumar, Dhananjay; Dutta, Summi; Singh, Dharmendra; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2017-01-01

    Deep sequencing identified 497 conserved and 559 novel miRNAs in wheat, while degradome analysis revealed 701 target genes. qRT-PCR demonstrated differential expression of miRNAs during stages of leaf rust progression. Bread wheat (Triticum aestivum L.) is an important cereal food crop feeding 30% of the world population. A major threat to wheat production is rust epidemics. This study was targeted towards identification and functional characterization of micro(mi)RNAs and their target genes in wheat in response to leaf rust ingression. High-throughput sequencing was used for transcriptome-wide identification of miRNAs and their expression profiling in response to leaf rust using mock and pathogen-inoculated resistant and susceptible near-isogenic wheat plants. A total of 1056 mature miRNAs were identified, of which 497 miRNAs were conserved and 559 miRNAs were novel. The pathogen-inoculated resistant plants manifested more miRNAs compared with the pathogen-infected susceptible plants. The miRNA counts increased in the susceptible isoline due to leaf rust; conversely, the counts decreased in the resistant isoline in response to pathogenesis, illustrating precise spatial tuning of miRNAs during compatible and incompatible interaction. Stem-loop quantitative real-time PCR was used to profile 10 highly differentially expressed miRNAs obtained from high-throughput sequencing data. The spatio-temporal profiling validated the differential expression of miRNAs between the isolines as well as in response to pathogen infection. Degradome analysis provided 701 predicted target genes associated with defense response, signal transduction, development, metabolism, and transcriptional regulation. The obtained results indicate that wheat isolines employ diverse arrays of miRNAs that modulate their target genes during compatible and incompatible interaction. Our findings contribute to increased knowledge of the roles of microRNAs in wheat-leaf rust interactions and could help in rust resistance breeding programs.

  9. High-Throughput Sequencing of Arabidopsis microRNAs: Evidence for Frequent Birth and Death of MIRNA Genes

    PubMed Central

    Fahlgren, Noah; Howell, Miya D.; Kasschau, Kristin D.; Chapman, Elisabeth J.; Sullivan, Christopher M.; Cumbie, Jason S.; Givan, Scott A.; Law, Theresa F.; Grant, Sarah R.; Dangl, Jeffery L.; Carrington, James C.

    2007-01-01

    In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks. PMID:17299599

  10. Genetic Variants Identified from Epilepsy of Unknown Etiology in Chinese Children by Targeted Exome Sequencing

    PubMed Central

    Wang, Yimin; Du, Xiaonan; Bin, Rao; Yu, Shanshan; Xia, Zhezhi; Zheng, Guo; Zhong, Jianmin; Zhang, Yunjian; Jiang, Yong-hui; Wang, Yi

    2017-01-01

    Genetic factors play a major role in the etiology of epilepsy disorders. Recent genomics studies using next generation sequencing (NGS) techniques have identified a large number of genetic variants, including copy number variants (CNVs) and single nucleotide variants (SNVs), in a small set of genes from individuals with epilepsy. These discoveries have contributed significantly to evaluating the etiology of epilepsy in the clinic and lay the foundation for developing molecularly specific treatments. However, the molecular basis for a majority of epilepsy patients remains elusive, and furthermore, most of these studies have been conducted in Caucasian children. Here we conducted targeted exome sequencing of 63 trios of Chinese epilepsy families using a custom-designed NGS panel that covers 412 known and candidate genes for epilepsy. We identified pathogenic and likely pathogenic variants in 15 of 63 (23.8%) families in known epilepsy genes including SCN1A, CDKL5, STXBP1, CHD2, SCN3A, SCN9A, TSC2, MBD5, POLG and EFHC1. More importantly, we identified likely pathologic variants in several novel candidate genes such as GABRE, MYH1, and CLCN6. Our results provide evidence supporting the application of custom-designed NGS panels in the clinic and indicate a conserved genetic susceptibility for epilepsy between Chinese and Caucasian children. PMID:28074849

  11. Identification of evolutionarily conserved Momordica charantia microRNAs using computational approach and its utility in phylogeny analysis.

    PubMed

    Thirugnanasambantham, Krishnaraj; Saravanan, Subramanian; Karikalan, Kulandaivelu; Bharanidharan, Rajaraman; Lalitha, Perumal; Ilango, S; HairulIslam, Villianur Ibrahim

    2015-10-01

    Momordica charantia (bitter gourd, bitter melon) is a monoecious Cucurbitaceae with anti-oxidant, anti-microbial, anti-viral and anti-diabetic potential. Molecular studies on this economically valuable plant are essential to understand its phylogeny and evolution. MicroRNAs (miRNAs) are conserved, small, non-coding RNAs that regulate gene expression by binding the 3' UTR of target mRNAs, and they have evolved at different rates in different plant species. In this study we used a homology-based computational approach and identified 27 mature miRNAs for the first time in this biomedically important plant. The phylogenetic tree built from binary presence/absence data for the identified miRNAs was found to be uncertain and biased. Most of the identified miRNAs were highly conserved among plant species, and sequence-based phylogenetic analysis of the miRNAs resolved these difficulties. Predicted gene targets of the identified miRNAs revealed their importance in the regulation of plant developmental processes. The reported miRNAs showed sequence conservation in their mature forms, and detailed phylogenetic analysis of pre-miRNA sequences revealed genus-specific segregation of clusters. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Identification, Expression Analysis, and Target Prediction of Flax Genotroph MicroRNAs Under Normal and Nutrient Stress Conditions

    PubMed Central

    Melnikova, Nataliya V.; Dmitriev, Alexey A.; Belenikin, Maxim S.; Koroban, Nadezhda V.; Speranskaya, Anna S.; Krinitsina, Anastasia A.; Krasnov, George S.; Lakunina, Valentina A.; Snezhkina, Anastasiya V.; Sadritdinova, Asiya F.; Kishlyan, Natalya V.; Rozhmina, Tatiana A.; Klimina, Kseniya M.; Amosova, Alexandra V.; Zelenin, Alexander V.; Muravenko, Olga V.; Bolsheva, Nadezhda L.; Kudryavtseva, Anna V.

    2016-01-01

    Cultivated flax (Linum usitatissimum L.) is an important plant valuable for industry. Some flax lines can undergo heritable phenotypic and genotypic changes (LIS-1 insertion being the most common) in response to nutrient stress and are called plastic lines. Offspring of plastic lines, which stably inherit the changes, are called genotrophs. MicroRNAs (miRNAs) are involved in a crucial regulatory mechanism of gene expression. They have previously been assumed to take part in nutrient stress response and can, therefore, participate in genotroph formation. In the present study, we performed high-throughput sequencing of small RNAs (sRNAs) extracted from flax plants grown under normal, phosphate deficient and nutrient excess conditions to identify miRNAs and evaluate their expression. Our analysis revealed expression of 96 conserved miRNAs from 21 families in flax. Moreover, 475 novel potential miRNAs were identified for the first time, and their targets were predicted. However, none of the identified miRNAs were transcribed from LIS-1. Expression of seven miRNAs (miR168, miR169, miR395, miR398, miR399, miR408, and lus-miR-N1) with up- or down-regulation under nutrient stress (on the basis of high-throughput sequencing data) was evaluated on extended sampling using qPCR. Reference gene search identified ETIF3H and ETIF3E genes as most suitable for this purpose. Down-regulation of novel potential lus-miR-N1 and up-regulation of conserved miR399 were revealed under the phosphate deficient conditions. In addition, the negative correlation of expression of lus-miR-N1 and its predicted target, ubiquitin-activating enzyme E1 gene, as well as, miR399 and its predicted target, ubiquitin-conjugating enzyme E2 gene, was observed. Thus, in our study, miRNAs expressed in flax plastic lines and genotrophs were identified and their expression and expression of their targets was evaluated using high-throughput sequencing and qPCR for the first time. 
    These data provide new insights into nutrient stress response regulation in plastic flax cultivars. PMID:27092149

  13. Performance Comparison of Bench-Top Next Generation Sequencers Using Microdroplet PCR-Based Enrichment for Targeted Sequencing in Patients with Autism Spectrum Disorder

    PubMed Central

    Okamoto, Nobuhiko; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Miyake, Noriko; Saitsu, Hirotomo; Matsumoto, Naomichi

    2013-01-01

    Next-generation sequencing (NGS) combined with enrichment of target genes enables highly efficient and low-cost sequencing of multiple genes for genetic diseases. The aim of this study was to validate the accuracy and sensitivity of our method for comprehensive mutation detection in autism spectrum disorder (ASD). We assessed the performance of the bench-top Ion Torrent PGM and Illumina MiSeq platforms as optimized solutions for mutation detection, using microdroplet PCR-based enrichment of 62 ASD associated genes. Ten patients with known mutations were sequenced using NGS to validate the sensitivity of our method. The overall read quality was better with MiSeq, largely because of the increased indel-related error associated with PGM. The sensitivity of SNV detection was similar between the two platforms, suggesting they are both suitable for SNV detection in the human genome. Next, we used these methods to analyze 28 patients with ASD, and identified 22 novel variants in genes associated with ASD, with one mutation detected by MiSeq only. Thus, our results support the combination of target gene enrichment and NGS as a valuable molecular method for investigating rare variants in ASD. PMID:24066114

  14. Greater than the sum of its parts: single-nucleus sequencing identifies convergent evolution of independent EGFR mutants in GBM.

    PubMed

    Gini, Beatrice; Mischel, Paul S

    2014-08-01

    Single-cell sequencing approaches are needed to characterize the genomic diversity of complex tumors, shedding light on their evolutionary paths and potentially suggesting more effective therapies. In this issue of Cancer Discovery, Francis and colleagues develop a novel integrative approach to identify distinct tumor subpopulations based on joint detection of clonal and subclonal events from bulk tumor and single-nucleus whole-genome sequencing, allowing them to infer a subclonal architecture. Surprisingly, the authors identify convergent evolution of multiple, mutually exclusive, independent EGFR gain-of-function variants in a single tumor. This study demonstrates the value of integrative single-cell genomics and highlights the biologic primacy of EGFR as an actionable target in glioblastoma. ©2014 American Association for Cancer Research.

  15. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons

    PubMed Central

    Pagano, Johanna F.B.; Ensink, Wim A.; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P.; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J.; Dekker, Rob J.

    2017-01-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. PMID:28003516

  16. Identification of Mucorales From Clinical Specimens: A 4-Year Experience in a Single Institution

    PubMed Central

    Yang, Mina; Lee, Jang Ho; Kim, Young-Kwon; Ki, Chang-Seok

    2016-01-01

    Mucormycosis, a fatal opportunistic infection in immunocompromised hosts, is caused by fungi belonging to the order Mucorales. Early diagnosis based on exact identification and multidisciplinary treatments is critical. However, identification of Mucorales fungi is difficult and often delayed, resulting in poor prognosis. This study aimed to compare the results of phenotypic and molecular identification of 12 Mucorales isolates collected from 4-yr-accumulated data. All isolates were identified on the basis of phenotypic characteristics such as growth rate, colony morphology, and reproductive structures. PCR and direct sequencing were performed to target internal transcribed spacer (ITS) and/or D1/D2 regions. Target DNA sequencing identified five Lichtheimia isolates, two Rhizopus microsporus isolates, two Rhizomucor pusillus isolates, one Cunninghamella bertholletiae isolate, one Mucor fragilis isolate, and one Syncephalastrum racemosum isolate. Five of the 12 (41.7%) isolates were incorrectly identified on the basis of phenotypic identification. DNA sequencing showed that of these five isolates, two were Lichtheimia isolates, one was a Mucor isolate, one was a Rhizomucor isolate, and one was Rhizopus microsporus. All the isolates were identified at the species level by ITS and/or D1/D2 analyses. Phenotypic differentiation and identification of Mucorales is difficult because different Mucorales share similar morphology. Our results indicate that the molecular methods employed in this study are valuable for identifying Mucorales. PMID:26522761

  17. Identification of mucorales from clinical specimens: a 4-year experience in a single institution.

    PubMed

    Yang, Mina; Lee, Jang Ho; Kim, Young Kwon; Ki, Chang Seok; Huh, Hee Jae; Lee, Nam Yong

    2016-01-01

    Mucormycosis, a fatal opportunistic infection in immunocompromised hosts, is caused by fungi belonging to the order Mucorales. Early diagnosis based on exact identification and multidisciplinary treatments is critical. However, identification of Mucorales fungi is difficult and often delayed, resulting in poor prognosis. This study aimed to compare the results of phenotypic and molecular identification of 12 Mucorales isolates collected from 4-yr-accumulated data. All isolates were identified on the basis of phenotypic characteristics such as growth rate, colony morphology, and reproductive structures. PCR and direct sequencing were performed to target internal transcribed spacer (ITS) and/or D1/D2 regions. Target DNA sequencing identified five Lichtheimia isolates, two Rhizopus microsporus isolates, two Rhizomucor pusillus isolates, one Cunninghamella bertholletiae isolate, one Mucor fragilis isolate, and one Syncephalastrum racemosum isolate. Five of the 12 (41.7%) isolates were incorrectly identified on the basis of phenotypic identification. DNA sequencing showed that of these five isolates, two were Lichtheimia isolates, one was a Mucor isolate, one was a Rhizomucor isolate, and one was Rhizopus microsporus. All the isolates were identified at the species level by ITS and/or D1/D2 analyses. Phenotypic differentiation and identification of Mucorales is difficult because different Mucorales share similar morphology. Our results indicate that the molecular methods employed in this study are valuable for identifying Mucorales.

  18. Transcriptome Profiling of Antimicrobial Resistance in Pseudomonas aeruginosa.

    PubMed

    Khaledi, Ariane; Schniederjans, Monika; Pohl, Sarah; Rainer, Roman; Bodenhofer, Ulrich; Xia, Boyang; Klawonn, Frank; Bruchmann, Sebastian; Preusse, Matthias; Eckweiler, Denitsa; Dötsch, Andreas; Häussler, Susanne

    2016-08-01

    Emerging resistance to antimicrobials and the lack of new antibiotic drug candidates underscore the need for optimization of current diagnostics and therapies to diminish the evolution and spread of multidrug resistance. As the antibiotic resistance status of a bacterial pathogen is defined by its genome, resistance profiling by applying next-generation sequencing (NGS) technologies may in the future accomplish pathogen identification, prompt initiation of targeted individualized treatment, and the implementation of optimized infection control measures. In this study, qualitative RNA sequencing was used to identify key genetic determinants of antibiotic resistance in 135 clinical Pseudomonas aeruginosa isolates from diverse geographic and infection site origins. By applying transcriptome-wide association studies, adaptive variations associated with resistance to the antibiotic classes fluoroquinolones, aminoglycosides, and β-lactams were identified. Besides potential novel biomarkers with a direct correlation to resistance, global patterns of phenotype-associated gene expression and sequence variations were identified by predictive machine learning approaches. Our research serves to establish genotype-based molecular diagnostic tools for the identification of the current resistance profiles of bacterial pathogens and paves the way for faster diagnostics for more efficient, targeted treatment strategies to also mitigate the future potential for resistance evolution. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  19. Transcriptome Profiling of Antimicrobial Resistance in Pseudomonas aeruginosa

    PubMed Central

    Khaledi, Ariane; Schniederjans, Monika; Pohl, Sarah; Rainer, Roman; Bodenhofer, Ulrich; Xia, Boyang; Klawonn, Frank; Bruchmann, Sebastian; Preusse, Matthias; Eckweiler, Denitsa; Dötsch, Andreas

    2016-01-01

    Emerging resistance to antimicrobials and the lack of new antibiotic drug candidates underscore the need for optimization of current diagnostics and therapies to diminish the evolution and spread of multidrug resistance. As the antibiotic resistance status of a bacterial pathogen is defined by its genome, resistance profiling by applying next-generation sequencing (NGS) technologies may in the future accomplish pathogen identification, prompt initiation of targeted individualized treatment, and the implementation of optimized infection control measures. In this study, qualitative RNA sequencing was used to identify key genetic determinants of antibiotic resistance in 135 clinical Pseudomonas aeruginosa isolates from diverse geographic and infection site origins. By applying transcriptome-wide association studies, adaptive variations associated with resistance to the antibiotic classes fluoroquinolones, aminoglycosides, and β-lactams were identified. Besides potential novel biomarkers with a direct correlation to resistance, global patterns of phenotype-associated gene expression and sequence variations were identified by predictive machine learning approaches. Our research serves to establish genotype-based molecular diagnostic tools for the identification of the current resistance profiles of bacterial pathogens and paves the way for faster diagnostics for more efficient, targeted treatment strategies to also mitigate the future potential for resistance evolution. PMID:27216077

  20. Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.

    PubMed

    Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A

    2014-10-01

    Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
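
    The downsampling analysis described in this abstract can be illustrated with a minimal Python sketch. It assumes that per-gene read counts for one cell are available as a dictionary and subsamples reads without replacement to a target depth. The function name `downsample_counts`, the toy counts, and the GAPDH entry are illustrative assumptions and are not taken from the authors' pipeline.

```python
import random
from collections import Counter


def downsample_counts(gene_counts, target_reads, seed=0):
    """Subsample reads without replacement down to `target_reads`.

    gene_counts: dict mapping gene name -> read count for a single cell
                 (hypothetical input format).
    Returns a dict of downsampled counts; genes that lose all of their
    reads are simply absent from the result.
    """
    total = sum(gene_counts.values())
    if target_reads >= total:
        # Nothing to do: the cell is already at or below the target depth.
        return dict(gene_counts)
    rng = random.Random(seed)
    # Expand to one list entry per read, then draw a random subset.
    reads = [gene for gene, n in gene_counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(reads, target_reads)))


# Toy cell with 1,000 reads, downsampled to 100 reads.
cell = {"EGR1": 300, "FOS": 200, "GAPDH": 500}
shallow = downsample_counts(cell, 100)
print(sum(shallow.values()))
```

    Repeating this at several depths and re-running the classification at each depth is one way to check how shallow the sequencing can be before cell-type calls start to change.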

  1. In Vitro Selection for Small-Molecule-Triggered Strand Displacement and Riboswitch Activity.

    PubMed

    Martini, Laura; Meyer, Adam J; Ellefson, Jared W; Milligan, John N; Forlin, Michele; Ellington, Andrew D; Mansy, Sheref S

    2015-10-16

    An in vitro selection method for ligand-responsive RNA sensors was developed that exploited strand displacement reactions. The RNA library was based on the thiamine pyrophosphate (TPP) riboswitch, and RNA sequences capable of hybridizing to a target duplex DNA in a TPP regulated manner were identified. After three rounds of selection, RNA molecules that mediated a strand exchange reaction upon TPP binding were enriched. The enriched sequences also showed riboswitch activity. Our results demonstrated that small-molecule-responsive nucleic acid sensors can be selected to control the activity of target nucleic acid circuitry.

  2. Targeted Next-generation Sequencing and Bioinformatics Pipeline to Evaluate Genetic Determinants of Constitutional Disease.

    PubMed

    Dilliott, Allison A; Farhan, Sali M K; Ghani, Mahdi; Sato, Christine; Liang, Eric; Zhang, Ming; McIntyre, Adam D; Cao, Henian; Racacho, Lemuel; Robinson, John F; Strong, Michael J; Masellis, Mario; Bulman, Dennis E; Rogaeva, Ekaterina; Lang, Anthony; Tartaglia, Carmela; Finger, Elizabeth; Zinman, Lorne; Turnbull, John; Freedman, Morris; Swartz, Rick; Black, Sandra E; Hegele, Robert A

    2018-04-04

    Next-generation sequencing (NGS) is quickly revolutionizing how research into the genetic determinants of constitutional disease is performed. The technique is highly efficient with millions of sequencing reads being produced in a short time span and at relatively low cost. Specifically, targeted NGS is able to focus investigations to genomic regions of particular interest based on the disease of study. Not only does this further reduce costs and increase the speed of the process, but it lessens the computational burden that often accompanies NGS. Although targeted NGS is restricted to certain regions of the genome, preventing identification of potential novel loci of interest, it can be an excellent technique when faced with a phenotypically and genetically heterogeneous disease, for which there are previously known genetic associations. Because of the complex nature of the sequencing technique, it is important to closely adhere to protocols and methodologies in order to achieve sequencing reads of high coverage and quality. Further, once sequencing reads are obtained, a sophisticated bioinformatics workflow is utilized to accurately map reads to a reference genome, to call variants, and to ensure the variants pass quality metrics. Variants must also be annotated and curated based on their clinical significance, which can be standardized by applying the American College of Medical Genetics and Genomics Pathogenicity Guidelines. The methods presented herein will display the steps involved in generating and analyzing NGS data from a targeted sequencing panel, using the ONDRISeq neurodegenerative disease panel as a model, to identify variants that may be of clinical significance.

  3. Profiling of potential driver mutations in sarcomas by targeted next generation sequencing.

    PubMed

    Andersson, Carola; Fagman, Henrik; Hansson, Magnus; Enlund, Fredrik

    2016-04-01

    Comprehensive genetic profiling by massively parallel sequencing, commonly known as next generation sequencing (NGS), is becoming the foundation of personalized oncology. For sarcomas very few targeted treatments are currently in routine use. In clinical practice the preoperative diagnostic workup of soft tissue tumours largely relies on core needle biopsies. Although mostly sufficient for histopathological diagnosis, only very limited amounts of formalin fixated paraffin embedded tissue are often available for predictive mutation analysis. Targeted NGS may thus open up new possibilities for comprehensive characterization of scarce biopsies. We therefore set out to search for driver mutations by NGS in a cohort of 55 clinically and morphologically well characterized sarcomas using low input of DNA from formalin fixated paraffin embedded tissues. The aim was to investigate if there are any recurrent or targetable aberrations in cancer driver genes in addition to known chromosome translocations in different types of sarcomas. We employed a panel covering 207 mutation hotspots in 50 cancer-associated genes to analyse DNA from nine gastrointestinal stromal tumours, 14 synovial sarcomas, seven myxoid liposarcomas, 22 Ewing sarcomas and three Ewing-like small round cell tumours at a large sequencing depth to detect also mutations that are subclonal or occur at low allele frequencies. We found nine mutations in eight different potential driver genes, some of which are potentially actionable by currently existing targeted therapies. Even though no recurrent mutations in driver genes were found in the different sarcoma groups, we show that targeted NGS-based sequencing is clearly feasible in a diagnostic setting with very limited amounts of paraffin embedded tissue and may provide novel insights into mesenchymal cell signalling and potentially druggable targets. 
    Interestingly, we also identify five non-synonymous sequence variants in four established cancer driver genes in DNA from normal tissue from sarcoma patients that may predispose or contribute to neoplastic development. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Clinical applicability and cost of a 46-gene panel for genomic analysis of solid tumours: Retrospective validation and prospective audit in the UK National Health Service.

    PubMed

    Hamblin, Angela; Wordsworth, Sarah; Fermont, Jilles M; Page, Suzanne; Kaur, Kulvinder; Camps, Carme; Kaisaki, Pamela; Gupta, Avinash; Talbot, Denis; Middleton, Mark; Henderson, Shirley; Cutts, Anthony; Vavoulis, Dimitrios V; Housby, Nick; Tomlinson, Ian; Taylor, Jenny C; Schuh, Anna

    2017-02-01

    Single gene tests to predict whether cancers respond to specific targeted therapies are performed increasingly often. Advances in sequencing technology, collectively referred to as next generation sequencing (NGS), mean the entire cancer genome or parts of it can now be sequenced at speed with increased depth and sensitivity. However, translation of NGS into routine cancer care has been slow. Healthcare stakeholders are unclear about the clinical utility of NGS and are concerned it could be an expensive addition to cancer diagnostics, rather than an affordable alternative to single gene testing. We validated a 46-gene hotspot cancer panel assay allowing multiple gene testing from small diagnostic biopsies. From 1 January 2013 to 31 December 2013, solid tumour samples (including non-small-cell lung carcinoma [NSCLC], colorectal carcinoma, and melanoma) were sequenced in the context of the UK National Health Service from 351 consecutively submitted prospective cases for which treating clinicians thought the patient had potential to benefit from more extensive genetic analysis. Following histological assessment, tumour-rich regions of formalin-fixed paraffin-embedded (FFPE) sections underwent macrodissection, DNA extraction, NGS, and analysis using a pipeline centred on Torrent Suite software. With a median turnaround time of seven working days, an integrated clinical report was produced indicating the variants detected, including those with potential diagnostic, prognostic, therapeutic, or clinical trial entry implications. Accompanying phenotypic data were collected, and a detailed cost analysis of the panel compared with single gene testing was undertaken to assess affordability for routine patient care. Panel sequencing was successful for 97% (342/351) of tumour samples in the prospective cohort and showed 100% concordance with known mutations (detected using cobas assays). At least one mutation was identified in 87% (296/342) of tumours. 
    A locally actionable mutation (i.e., available targeted treatment or clinical trial) was identified in 122/351 patients (35%). Forty patients received targeted treatment, in 22/40 (55%) cases solely due to use of the panel. Examination of published data on the potential efficacy of targeted therapies showed theoretically actionable mutations (i.e., mutations for which targeted treatment was potentially appropriate) in 66% (71/107) and 39% (41/105) of melanoma and NSCLC patients, respectively. At a cost of £339 (US$449) per patient, the panel was less expensive locally than performing more than two or three single gene tests. Study limitations include the use of FFPE samples, which do not always provide high-quality DNA, and the use of "real world" data: submission of cases for sequencing did not always follow clinical guidelines, meaning that when mutations were detected, patients were not always eligible for targeted treatments on clinical grounds. This study demonstrates that more extensive tumour sequencing can identify mutations that could improve clinical decision-making in routine cancer care, potentially improving patient outcomes, at an affordable level for healthcare providers.
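
    As a worked check, the proportions quoted in this abstract can be recomputed from the stated numerators and denominators. The short Python sketch below does only that arithmetic; the labels and the `pct` helper are illustrative names and not part of the study.

```python
def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (numerator, denominator, percentage as reported in the abstract)
reported = {
    "panel sequencing successful": (342, 351, 97),
    "at least one mutation": (296, 342, 87),
    "locally actionable mutation": (122, 351, 35),
    "treated solely due to panel": (22, 40, 55),
    "theoretically actionable, melanoma": (71, 107, 66),
    "theoretically actionable, NSCLC": (41, 105, 39),
}

for label, (num, den, stated) in reported.items():
    print(f"{label}: {num}/{den} = {pct(num, den)}% (stated {stated}%)")
```

    All six recomputed values agree with the percentages stated in the abstract.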

  5. Islander: A database of precisely mapped genomic islands in tRNA and tmRNA genes

    DOE PAGES

    Hudson, Corey M.; Lau, Britney Y.; Williams, Kelly P.

    2014-11-05

    Genomic islands are mobile DNAs that are major agents of bacterial and archaeal evolution. Integration into prokaryotic chromosomes usually occurs site-specifically at tRNA or tmRNA gene (together, tDNA) targets, catalyzed by tyrosine integrases. This splits the target gene, yet sequences within the island restore the disrupted gene; the regenerated target and its displaced fragment precisely mark the endpoints of the island. We applied this principle to search for islands in genomic DNA sequences. Our algorithm identifies tDNAs, finds fragments of those tDNAs in the same replicon and removes unlikely candidate islands through a series of filters. A search for islands in 2168 whole prokaryotic genomes produced 3919 candidates. The website Islander (recently moved to http://bioinformatics.sandia.gov/islander/) presents these precisely mapped candidate islands, the gene content and the island sequence. The algorithm further insists that each island encode an integrase, and attachment site sequence identity is carefully noted; therefore, the database also serves in the study of integrase site-specificity and its evolution.
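
    The search principle in this abstract (an intact tDNA at one end of the island and a displaced fragment of the same tDNA at the other) can be sketched in Python. This is a deliberately simplified illustration and not the Islander implementation: it assumes a forward-strand tDNA, looks only downstream, uses exact string matching in place of a proper similarity search, and applies a single length filter. The function name, parameters, and thresholds are all hypothetical.

```python
def find_candidate_islands(replicon, tdna_start, tdna_end,
                           frag_len=15, min_len=2000, max_len=200000):
    """Return (start, end) candidate islands as 0-based, half-open spans.

    Simplifying assumptions (not from the source): the tDNA lies on the
    forward strand, its last `frag_len` bases serve as the displaced
    fragment, and only exact downstream copies are considered.
    """
    if frag_len > tdna_end - tdna_start:
        raise ValueError("fragment longer than the tDNA itself")
    fragment = replicon[tdna_end - frag_len:tdna_end]
    candidates = []
    pos = replicon.find(fragment, tdna_end)
    while pos != -1:
        island_end = pos + frag_len
        # Length filter stands in for the abstract's "series of filters".
        if min_len <= island_end - tdna_start <= max_len:
            candidates.append((tdna_start, island_end))
        pos = replicon.find(fragment, pos + 1)
    return candidates


# Toy replicon: flank, tDNA-like sequence, 3 kb filler, displaced fragment.
tdna = ("GCGGATTTAGCTCAGTTGGGAGAGCGCCAGACTGAAGATCTGGAGGTCC"
        "TGTGTTCGATCCACAGAATTCGCACCA")
replicon = "A" * 50 + tdna + "C" * 3000 + tdna[-15:] + "G" * 50
print(find_candidate_islands(replicon, 50, 50 + len(tdna)))
```

    A fuller version would also need to handle reverse-strand tDNAs and require an integrase gene inside each candidate, as the abstract states.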

  6. Molecular Characterization of Transgene Integration by Next-Generation Sequencing in Transgenic Cattle

    PubMed Central

    Zhang, Ran; Yin, Yinliang; Zhang, Yujun; Li, Kexin; Zhu, Hongxia; Gong, Qin; Wang, Jianwu; Hu, Xiaoxiang; Li, Ning

    2012-01-01

    As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species. PMID:23185606
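
    The abstract does not say how sequencing depth was converted into copy number. One common approach, shown here as a hedged Python sketch, is to compare the median depth over the transgene with the median depth over background regions of known copy number. The function name, the diploid default, and the toy depths are assumptions made for illustration.

```python
from statistics import median


def estimate_copy_number(region_depths, background_depths,
                         background_copies=2):
    """Estimate copies per cell from relative sequencing depth.

    region_depths: per-base (or per-window) depths over the transgene.
    background_depths: depths over host regions assumed to be present
        at `background_copies` per cell (2 for a diploid autosome).
    """
    background = median(background_depths)
    if background == 0:
        raise ValueError("background depth is zero; cannot normalise")
    return background_copies * median(region_depths) / background


# Toy example: the transgene is covered at twice the background depth.
print(estimate_copy_number([60, 62, 58], [30, 31, 29]))  # 4.0
```

    Running the same calculation on sub-regions of the construct would show which parts, such as the vector backbone, are over-represented.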

  7. Rapid Identification of Cell-Specific, Internalizing RNA Aptamers with Bioinformatics Analyses of a Cell-Based Aptamer Selection

    PubMed Central

    Thiel, William H.; Bair, Thomas; Peek, Andrew S.; Liu, Xiuying; Dassie, Justin; Stockdale, Katie R.; Behlke, Mark A.; Miller, Francis J.; Giangrande, Paloma H.

    2012-01-01

    Background The broad applicability of RNA aptamers as cell-specific delivery tools for therapeutic reagents depends on the ability to identify aptamer sequences that selectively access the cytoplasm of distinct cell types. Towards this end, we have developed a novel approach that combines a cell-based selection method (cell-internalization SELEX) with high-throughput sequencing (HTS) and bioinformatics analyses to rapidly identify cell-specific, internalization-competent RNA aptamers. Methodology/Principal Findings We demonstrate the utility of this approach by enriching for RNA aptamers capable of selective internalization into vascular smooth muscle cells (VSMCs). Several rounds of positive (VSMCs) and negative (endothelial cells; ECs) selection were performed to enrich for aptamer sequences that preferentially internalize into VSMCs. To identify candidate RNA aptamer sequences, HTS data from each round of selection were analyzed using bioinformatics methods: (1) metrics of selection enrichment; and (2) pairwise comparisons of sequence and structural similarity, termed edit and tree distance, respectively. Correlation analyses of experimentally validated aptamers or rounds revealed that the best cell-specific, internalizing aptamers are enriched as a result of the negative selection step performed against ECs. Conclusions and Significance We describe a novel approach that combines cell-internalization SELEX with HTS and bioinformatics analysis to identify cell-specific, cell-internalizing RNA aptamers. Our data highlight the importance of performing a pre-clear step against a non-target cell in order to select for cell-specific aptamers. We expect the extended use of this approach to enable the identification of aptamers to a multitude of different cell types, thereby facilitating the broad development of targeted cell therapies. PMID:22962591
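
    The pairwise sequence comparison described above (edit distance) can be illustrated with a minimal sketch. It assumes plain Levenshtein distance with unit costs; the abstract does not say which variant the authors used, and the function name and example sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (unit costs are an assumption)."""
    # prev[j] is the distance between the previous prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical aptamer-like sequences, for illustration only
print(edit_distance("GGAUCC", "GGACC"))
```

    Sequences separated by a small edit distance could then be grouped into candidate families; that grouping step is not shown here.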

  8. Sequencing and Characterisation of an Extensive Atlantic Salmon (Salmo salar L.) MicroRNA Repertoire

    PubMed Central

    Bekaert, Michaël; Lowe, Natalie R.; Bishop, Stephen C.; Bron, James E.; Taggart, John B.; Houston, Ross D.

    2013-01-01

    Atlantic salmon (Salmo salar L.), a member of the family Salmonidae, is a totemic species of ecological and cultural significance that is also economically important in terms of both sports fisheries and aquaculture. These factors have promoted the continuous development of genomic resources for this species, furthering both fundamental and applied research. MicroRNAs (miRNA) are small endogenous non-coding RNA molecules that control spatial and temporal expression of targeted genes through post-transcriptional regulation. While miRNA have been characterised in detail for many other species, this is not yet the case for Atlantic salmon. To identify miRNAs from Atlantic salmon, we constructed whole fish miRNA libraries for 18 individual juveniles (fry, four months post hatch) and characterised them by Illumina high-throughput sequencing (total of 354,505,167 paired-ended reads). We report an extensive and partly novel repertoire of miRNA sequences, comprising 888 miRNA genes (547 unique mature miRNA sequences), quantify their expression levels in basal conditions, examine their homology to miRNAs from other species and identify their predicted target genes. We also identify the location and putative copy number of the miRNA genes in the draft Atlantic salmon reference genome sequence. The Atlantic salmon miRNAs experimentally identified in this study provide a robust large-scale resource for functional genome research in salmonids. There is an opportunity to explore the evolution of salmonid miRNAs following the relatively recent whole genome duplication event in salmonid species and to investigate the role of miRNAs in the regulation of gene expression in particular their contribution to variation in economically and ecologically important traits. PMID:23922936

  9. Genome-wide identification of conserved and novel microRNAs in one bud and two tender leaves of tea plant (Camellia sinensis) by small RNA sequencing, microarray-based hybridization and genome survey scaffold sequences.

    PubMed

    Jeyaraj, Anburaj; Zhang, Xiao; Hou, Yan; Shangguan, Mingzhu; Gajjeraman, Prabu; Li, Yeyun; Wei, Chaoling

    2017-11-21

    MicroRNAs (miRNAs) are important for plant growth and responses to environmental stresses via post-transcriptional regulation of gene expression. Tea, which is primarily produced from one bud and two tender leaves of the tea plant (Camellia sinensis), is one of the most popular non-alcoholic beverages worldwide owing to its abundance of secondary metabolites. A large number of miRNAs have been identified in various plants, including non-model species. However, due to the lack of reference genome sequences and/or information of tea plant genome survey scaffold sequences, discovery of miRNAs has been limited in C. sinensis. Using small RNA sequencing, combined with our recently obtained genome survey data, we have identified and analyzed 175 conserved and 83 novel miRNAs mainly in one bud and two tender leaves of the tea plant. Among these, 93 conserved and 18 novel miRNAs were validated using miRNA microarray hybridization. In addition, the expression patterns of 11 conserved and 8 novel miRNAs were validated by stem-loop-qRT-PCR. A total of 716 potential target genes of identified miRNAs were predicted. Further, Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that most of the target genes were primarily involved in stress response and enzymes related to phenylpropanoid biosynthesis. The predicted targets of 4 conserved miRNAs were further validated by 5'RLM-RACE. A negative correlation between the expression profiles of 3 out of 4 conserved miRNAs (csn-miR160a-5p, csn-miR164a, csn-miR828 and csn-miR858a) and their targets (ARF17, NAC100, WER and MYB12 transcription factor) was observed. In summary, the present study is one of few such studies on miRNA detection and identification in the tea plant. The predicted target genes of the majority of miRNAs encoded enzymes, transcription factors, and functional proteins.
    The miRNA-target transcription factor gene interactions may provide important clues about the regulatory mechanism of these miRNAs in the tea plant. The data reported in this study will make a huge contribution to knowledge on the potential miRNA regulators of the secondary metabolism pathway and other important biological processes in C. sinensis.

  10. Evaluation of targeted exome sequencing for 28 protein-based blood group systems, including the homologous gene systems, for blood group genotyping.

    PubMed

    Schoeman, Elizna M; Lopez, Genghis H; McGowan, Eunike C; Millard, Glenda M; O'Brien, Helen; Roulis, Eileen V; Liew, Yew-Wah; Martin, Jacqueline R; McGrath, Kelli A; Powley, Tanya; Flower, Robert L; Hyland, Catherine A

    2017-04-01

    Blood group single nucleotide polymorphism genotyping probes for a limited range of polymorphisms. This study investigated whether massively parallel sequencing (also known as next-generation sequencing), with a targeted exome strategy, provides an extended blood group genotype and the extent to which massively parallel sequencing correctly genotypes in homologous gene systems, such as RH and MNS. Donor samples (n = 28) that were extensively phenotyped and genotyped using single nucleotide polymorphism typing, were analyzed using the TruSight One Sequencing Panel and MiSeq platform. Genes for 28 protein-based blood group systems, GATA1, and KLF1 were analyzed. Copy number variation analysis was used to characterize complex structural variants in the GYPC and RH systems. The average sequencing depth per target region was 66.2 ± 39.8. Each sample harbored on average 43 ± 9 variants, of which 10 ± 3 were used for genotyping. For the 28 samples, massively parallel sequencing variant sequences correctly matched expected sequences based on single nucleotide polymorphism genotyping data. Copy number variation analysis defined the Rh C/c alleles and complex RHD hybrids. Hybrid RHD*D-CE-D variants were correctly identified, but copy number variation analysis did not confidently distinguish between D and CE exon deletion versus rearrangement. The targeted exome sequencing strategy employed extended the range of blood group genotypes detected compared with single nucleotide polymorphism typing. This single-test format included detection of complex MNS hybrid cases and, with copy number variation analysis, defined RH hybrid genes along with the RHCE*C allele hitherto difficult to resolve by variant detection. The approach is economical compared with whole-genome sequencing and is suitable for a red blood cell reference laboratory setting. © 2017 AABB.

  11. Use of amplicon sequencing to improve sensitivity in PCR-based detection of microbial pathogen in environmental samples.

    PubMed

    Saingam, Prakit; Li, Bo; Yan, Tao

    2018-06-01

    DNA-based molecular detection of microbial pathogens in complex environments is still plagued by sensitivity, specificity and robustness issues. We propose to address these issues by viewing them as inadvertent consequences of requiring specific and adequate amplification (SAA) of target DNA molecules by current PCR methods. Using the invA gene of Salmonella as the model system, we investigated if next generation sequencing (NGS) can be used to directly detect target sequences in false-negative PCR reaction (PCR-NGS) in order to remove the SAA requirement from PCR. False-negative PCR and qPCR reactions were first created using serial dilutions of laboratory-prepared Salmonella genomic DNA and then analyzed directly by NGS. Target invA sequences were detected in all false-negative PCR and qPCR reactions, which lowered the method detection limits near the theoretical minimum of single gene copy detection. The capability of the PCR-NGS approach in correcting false negativity was further tested and confirmed under more environmentally relevant conditions using Salmonella-spiked stream water and sediment samples. Finally, the PCR-NGS approach was applied to ten urban stream water samples and detected invA sequences in eight samples that would be otherwise deemed Salmonella negative. Analysis of the non-target sequences in the false-negative reactions helped to identify primer dimer-like short sequences as the main cause of the false negativity. Together, the results demonstrated that the PCR-NGS approach can significantly improve method sensitivity, correct false-negative detections, and enable sequence-based analysis for failure diagnostics in complex environmental samples. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. Detection of canonical A-to-G editing events at 3′ UTRs and microRNA target sites in human lungs using next-generation sequencing

    PubMed Central

    Soundararajan, Ramani; Stearns, Timothy M.; Griswold, Anthony J.; Mehta, Arpit; Czachor, Alexander; Fukumoto, Jutaro; Lockey, Richard F.; King, Benjamin L.; Kolliputi, Narasaiah

    2015-01-01

    RNA editing is a post-transcriptional modification of RNA. The majority of these changes result from adenosine deaminase acting on RNA (ADARs) catalyzing the conversion of adenosine residues to inosine in double-stranded RNAs (dsRNAs). Massively parallel sequencing has enabled the identification of RNA editing sites in human transcriptomes. In this study, we sequenced DNA and RNA from human lungs and identified RNA editing sites with high confidence via a computational pipeline utilizing stringent analysis thresholds. We identified a total of 3,447 editing sites that overlapped in three human lung samples, and with 50% of these sites having canonical A-to-G base changes. Approximately 27% of the edited sites overlapped with Alu repeats, and showed A-to-G clustering (>3 clusters in 100 bp). The majority of edited sites mapped to either 3′ untranslated regions (UTRs) or introns close to splice sites; whereas, only few sites were in exons resulting in non-synonymous amino acid changes. Interestingly, we identified 652 A-to-G editing events in the 3′ UTR of 205 target genes that mapped to 932 potential miRNA target binding sites. Several of these miRNA edited sites were validated in silico. Additionally, we validated several A-to-G edited sites by Sanger sequencing. Altogether, our study suggests a role for RNA editing in miRNA-mediated gene regulation and splicing in human lungs. In this study, we have generated a RNA editome of human lung tissue that can be compared with other RNA editomes across different lung tissues to delineate a role for RNA editing in normal and diseased states. PMID:26486088
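
    The clustering criterion mentioned above (more than 3 sites in 100 bp) can be sketched as a simple window scan. This is an illustrative reading of the criterion, not the authors' pipeline: the function name, the half-open window definition and the example coordinates are all assumptions.

```python
from bisect import bisect_left


def clustered_sites(positions, window=100, min_sites=4):
    """Return positions lying in some `window`-bp span that holds >= `min_sites` sites.

    min_sites=4 encodes "more than 3 sites"; this interpretation is an assumption.
    """
    pos = sorted(set(positions))
    clustered = set()
    for i, start in enumerate(pos):
        # index of the first site at or beyond the end of the window [start, start + window)
        end = bisect_left(pos, start + window, lo=i)
        if end - i >= min_sites:
            clustered.update(pos[i:end])
    return sorted(clustered)


# Hypothetical coordinates on one transcript
print(clustered_sites([10, 20, 50, 90, 500, 900]))
```

    Anchoring each window at an observed site is sufficient, because any qualifying window can be shifted right until its left edge sits on its first site without losing sites.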

  13. Detection of canonical A-to-G editing events at 3' UTRs and microRNA target sites in human lungs using next-generation sequencing.

    PubMed

    Soundararajan, Ramani; Stearns, Timothy M; Griswold, Anthony L; Mehta, Arpit; Czachor, Alexander; Fukumoto, Jutaro; Lockey, Richard F; King, Benjamin L; Kolliputi, Narasaiah

    2015-11-03

    RNA editing is a post-transcriptional modification of RNA. The majority of these changes result from adenosine deaminase acting on RNA (ADARs) catalyzing the conversion of adenosine residues to inosine in double-stranded RNAs (dsRNAs). Massively parallel sequencing has enabled the identification of RNA editing sites in human transcriptomes. In this study, we sequenced DNA and RNA from human lungs and identified RNA editing sites with high confidence via a computational pipeline utilizing stringent analysis thresholds. We identified a total of 3,447 editing sites that overlapped in three human lung samples, and with 50% of these sites having canonical A-to-G base changes. Approximately 27% of the edited sites overlapped with Alu repeats, and showed A-to-G clustering (>3 clusters in 100 bp). The majority of edited sites mapped to either 3' untranslated regions (UTRs) or introns close to splice sites; whereas, only few sites were in exons resulting in non-synonymous amino acid changes. Interestingly, we identified 652 A-to-G editing events in the 3' UTR of 205 target genes that mapped to 932 potential miRNA target binding sites. Several of these miRNA edited sites were validated in silico. Additionally, we validated several A-to-G edited sites by Sanger sequencing. Altogether, our study suggests a role for RNA editing in miRNA-mediated gene regulation and splicing in human lungs. In this study, we have generated a RNA editome of human lung tissue that can be compared with other RNA editomes across different lung tissues to delineate a role for RNA editing in normal and diseased states.

  14. PIK3CA-associated developmental disorders exhibit distinct classes of mutations with variable expression and tissue distribution

    PubMed Central

    Timms, Andrew E.; Conti, Valerio; Girisha, Katta M.; Martin, Beth; Olds, Carissa; Collins, Sarah; Park, Kaylee; Carter, Melissa; Krägeloh-Mann, Inge; Chitayat, David; Parikh, Aditi Shah; Bradshaw, Rachael; Torti, Erin; Braddock, Stephen; Burke, Leah; Ghedia, Sondhya; Stephan, Mark; Stewart, Fiona; Prasad, Chitra; Napier, Melanie; Saitta, Sulagna; Straussberg, Rachel; Gabbett, Michael; O’Connor, Bridget C.; Yin, Lim Jiin; Lai, Angeline Hwei Meeng; Martin, Nicole; McKinnon, Margaret; Addor, Marie-Claude; Schwartz, Charles E.; Lanoel, Agustina; Conway, Robert L.; Devriendt, Koenraad; Tatton-Brown, Katrina; Pierpont, Mary Ella; Painter, Michael; Worgan, Lisa; Reggin, James; Hennekam, Raoul; Pritchard, Colin C.; Aracena, Mariana; Gripp, Karen W.; Cordisco, Maria; Van Esch, Hilde; Garavelli, Livia; Curry, Cynthia; Goriely, Anne; Kayserilli, Hulya; Shendure, Jay; Graham, John; Guerrini, Renzo; Dobyns, William B.

    2016-01-01

    Mosaicism is increasingly recognized as a cause of developmental disorders with the advent of next-generation sequencing (NGS). Mosaic mutations of PIK3CA have been associated with the widest spectrum of phenotypes associated with overgrowth and vascular malformations. We performed targeted NGS using 2 independent deep-coverage methods that utilize molecular inversion probes and amplicon sequencing in a cohort of 241 samples from 181 individuals with brain and/or body overgrowth. We identified PIK3CA mutations in 60 individuals. Several other individuals (n = 12) were identified separately to have mutations in PIK3CA by clinical targeted-panel testing (n = 6), whole-exome sequencing (n = 5), or Sanger sequencing (n = 1). Based on the clinical and molecular features, this cohort segregated into three distinct groups: (a) severe focal overgrowth due to low-level but highly activating (hotspot) mutations, (b) predominantly brain overgrowth and less severe somatic overgrowth due to less-activating mutations, and (c) intermediate phenotypes (capillary malformations with overgrowth) with intermediately activating mutations. Sixteen of 29 PIK3CA mutations were novel. We also identified constitutional PIK3CA mutations in 10 patients. Our molecular data, combined with review of the literature, show that PIK3CA-related overgrowth disorders comprise a discontinuous spectrum of disorders that correlate with the severity and distribution of mutations. PMID:27631024

  15. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

    PubMed Central

    Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Brown, James B.; Bickel, Peter; Holmes, Ian; Mullikin, James C.; Ureta-Vidal, Abel; Paten, Benedict; Stone, Eric A.; Rosenbloom, Kate R.; Kent, W. James; Bouffard, Gerard G.; Guan, Xiaobin; Hansen, Nancy F.; Idol, Jacquelyn R.; Maduro, Valerie V.B.; Maskeri, Baishali; McDowell, Jennifer C.; Park, Morgan; Thomas, Pamela J.; Young, Alice C.; Blakesley, Robert W.; Muzny, Donna M.; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Jiang, Huaiyang; Weinstock, George M.; Gibbs, Richard A.; Graves, Tina; Fulton, Robert; Mardis, Elaine R.; Wilson, Richard K.; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B.; Chang, Jean L.; Lindblad-Toh, Kerstin; Lander, Eric S.; Hinrichs, Angie; Trumbower, Heather; Clawson, Hiram; Zweig, Ann; Kuhn, Robert M.; Barber, Galt; Harte, Rachel; Karolchik, Donna; Field, Matthew A.; Moore, Richard A.; Matthewson, Carrie A.; Schein, Jacqueline E.; Marra, Marco A.; Antonarakis, Stylianos E.; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross; Haussler, David; Miller, Webb; Pachter, Lior; Green, Eric D.; Sidow, Arend

    2007-01-01

    A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. PMID:17567995

  16. Beyond the binding site: in vivo identification of tbx2, smarca5 and wnt5b as molecular targets of CNBP during embryonic development.

    PubMed

    Armas, Pablo; Margarit, Ezequiel; Mouguelar, Valeria S; Allende, Miguel L; Calcaterra, Nora B

    2013-01-01

    CNBP is a nucleic acid chaperone implicated in vertebrate craniofacial development, as well as in myotonic dystrophy type 2 (DM2) and sporadic inclusion body myositis (sIBM) human muscle diseases. CNBP is highly conserved among vertebrates and has been implicated in transcriptional regulation; however, its DNA binding sites and molecular targets remain elusive. The main goal of this work was to identify CNBP DNA binding sites that might reveal target genes involved in vertebrate embryonic development. To accomplish this, we used a recently described yeast one-hybrid assay to identify DNA sequences bound in vivo by CNBP. Bioinformatic analyses revealed that these sequences are G-enriched and show high frequency of putative G-quadruplex DNA secondary structure. Moreover, an in silico approach enabled us to establish the CNBP DNA-binding site and to predict CNBP putative targets based on gene ontology terms and synexpression with CNBP. The direct interaction between CNBP and candidate genes was proved by EMSA and ChIP assays. Besides, the role of CNBP upon the identified genes was validated in loss-of-function experiments in developing zebrafish. We successfully confirmed that CNBP up-regulates tbx2b and smarca5, and down-regulates wnt5b gene expression. The highly stringent strategy used in this work allowed us to identify new CNBP target genes functionally important in different contexts of vertebrate embryonic development. Furthermore, it represents a novel approach toward understanding the biological function and regulatory networks involving CNBP in the biology of vertebrates.

  17. Beyond the Binding Site: In Vivo Identification of tbx2, smarca5 and wnt5b as Molecular Targets of CNBP during Embryonic Development

    PubMed Central

    Mouguelar, Valeria S.; Allende, Miguel L.; Calcaterra, Nora B.

    2013-01-01

    CNBP is a nucleic acid chaperone implicated in vertebrate craniofacial development, as well as in myotonic dystrophy type 2 (DM2) and sporadic inclusion body myositis (sIBM) human muscle diseases. CNBP is highly conserved among vertebrates and has been implicated in transcriptional regulation; however, its DNA binding sites and molecular targets remain elusive. The main goal of this work was to identify CNBP DNA binding sites that might reveal target genes involved in vertebrate embryonic development. To accomplish this, we used a recently described yeast one-hybrid assay to identify DNA sequences bound in vivo by CNBP. Bioinformatic analyses revealed that these sequences are G-enriched and show high frequency of putative G-quadruplex DNA secondary structure. Moreover, an in silico approach enabled us to establish the CNBP DNA-binding site and to predict CNBP putative targets based on gene ontology terms and synexpression with CNBP. The direct interaction between CNBP and candidate genes was proved by EMSA and ChIP assays. Besides, the role of CNBP upon the identified genes was validated in loss-of-function experiments in developing zebrafish. We successfully confirmed that CNBP up-regulates tbx2b and smarca5, and down-regulates wnt5b gene expression. The highly stringent strategy used in this work allowed us to identify new CNBP target genes functionally important in different contexts of vertebrate embryonic development. Furthermore, it represents a novel approach toward understanding the biological function and regulatory networks involving CNBP in the biology of vertebrates. PMID:23667590

  18. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3' UTRs and coding sequences.

    PubMed

    Šulc, Miroslav; Marín, Ray M; Robins, Harlan S; Vaníček, Jiří

    2015-07-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3' untranslated regions (3' UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3' UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA-mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA-mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
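
    The ranking and false-discovery-rate step described above can be illustrated with a short sketch. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the pair labels and P-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical microRNA-mRNA pairs with precomputed P-values
pairs = ["miR-a/geneX", "miR-a/geneY", "miR-b/geneX", "miR-b/geneZ"]
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
for q, pair in sorted(zip(qvals, pairs)):
    print(f"{pair}\t{q:.3f}")
```

    Pairs whose adjusted value falls below a chosen threshold would be reported as significant; the threshold itself is a user choice that the abstract does not state.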

  19. Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

    PubMed Central

    2012-01-01

    Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
    Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. Conclusions We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species. PMID:22805587

  20. Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

    PubMed

    Yang, Huaan; Tao, Ye; Zheng, Zequn; Li, Chengdao; Sweetingham, Mark W; Howieson, John G

    2012-07-17

    In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. 
    Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for marker-assisted selection in the Australian national lupin breeding program. We demonstrated that more than 30 molecular markers linked to a target gene of agronomic trait of interest can be identified from a small portion (1/8) of one sequencing run on HiSeq2000 by applying NGS based RAD sequencing in marker development. The markers developed by the strategy described in this study are all co-dominant SNP markers, which can readily be converted into high throughput multiplex format or low-cost, simple PCR-based markers desirable for large scale marker implementation in plant breeding programs. The high density and closely linked molecular markers associated with a target trait help to overcome a major bottleneck for implementation of molecular markers on a wide range of germplasm in breeding programs. We conclude that application of NGS based RAD sequencing as DNA fingerprinting is a very rapid and cost-effective strategy for marker development in molecular plant breeding. The strategy does not require any prior genome knowledge or molecular information for the species under investigation, and it is applicable to other plant species.

  1. Modeling read counts for CNV detection in exome sequencing data.

    PubMed

    Love, Michael I; Myšičková, Alena; Sun, Ruping; Kalscheuer, Vera; Vingron, Martin; Haas, Stefan A

    2011-11-08

    Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
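
    The idea of decoding copy number from read counts with a hidden Markov model can be illustrated with a deliberately simplified sketch. This is not the exomeCopy model: it assumes a Poisson emission (a swapped-in simplification), ignores covariates such as GC-content, and uses a hypothetical fixed "stay" probability. The function name, the state set (1, 2, 3 copies) and all parameter values are assumptions made for illustration only.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of observing count k with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_copy_number(counts, background, states=(1, 2, 3), stay=0.9):
    """Most likely copy-number path for per-window read counts.

    counts     -- observed read counts per window in the sample
    background -- expected counts per window at copy number 2 (control set)
    states     -- candidate copy numbers (illustrative choice)
    stay       -- assumed probability of keeping the same state between windows
    """
    n = len(states)
    switch = (1 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]

    def emit(t, s):
        # Expected depth scales linearly with copy number relative to diploid.
        return poisson_logpmf(counts[t], background[t] * states[s] / 2)

    # Uniform prior over states for the first window.
    score = [math.log(1 / n) + emit(0, s) for s in range(n)]
    back = []
    for t in range(1, len(counts)):
        new_score, pointers = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best] + log_trans[best][j] + emit(t, j))
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return [states[s] for s in reversed(path)]


background = [100] * 8
counts = [100, 98, 103, 52, 49, 51, 101, 99]
calls = viterbi_copy_number(counts, background)
```

    In this toy input, three consecutive windows at roughly half the background depth are the kind of signal a heterozygous deletion would be expected to produce; real exome data are far noisier, which is why the abstract stresses control-set backgrounds and positional covariates.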

  2. Systematic screening of isogenic cancer cells identifies DUSP6 as context-specific synthetic lethal target in melanoma

    PubMed Central

    Wittig-Blaich, Stephanie; Wittig, Rainer; Schmidt, Steffen; Lyer, Stefan; Bewerunge-Hudler, Melanie; Gronert-Sum, Sabine; Strobel-Freidekind, Olga; Müller, Carolin; List, Markus; Jaskot, Aleksandra; Christiansen, Helle; Hafner, Mathias; Schadendorf, Dirk; Block, Ines; Mollenhauer, Jan

    2017-01-01

    Next-generation sequencing has dramatically increased genome-wide profiling options and conceptually initiates the possibility for personalized cancer therapy. State-of-the-art sequencing studies yield large candidate gene sets comprising dozens or hundreds of mutated genes. However, few technologies are available for the systematic downstream evaluation of these results to identify novel starting points of future cancer therapies. We improved and extended a site-specific recombination-based system for systematic analysis of the individual functions of a large number of candidate genes. This was facilitated by a novel system for the construction of isogenic constitutive and inducible gain- and loss-of-function cell lines. Additionally, we demonstrate the construction of isogenic cell lines with combinations of the traits for advanced functional in vitro analyses. In a proof-of-concept experiment, a library of 108 isogenic melanoma cell lines was constructed and 8 genes were identified that significantly reduced viability in a discovery screen and in an independent validation screen. Here, we demonstrate the broad applicability of this recombination-based method and we proved its potential to identify new drug targets via the identification of the tumor suppressor DUSP6 as potential synthetic lethal target in melanoma cell lines with BRAF V600E mutations and high DUSP6 expression. PMID:28423600

  3. Illuminator, a desktop program for mutation detection using short-read clonal sequencing.

    PubMed

    Carr, Ian M; Morgan, Joanne E; Diggle, Christine P; Sheridan, Eamonn; Markham, Alexander F; Logan, Clare V; Inglehearn, Chris F; Taylor, Graham R; Bonthron, David T

    2011-10-01

    Current methods for sequencing clonal populations of DNA molecules yield several gigabases of data per day, typically comprising reads of < 100 nt. Such datasets permit widespread genome resequencing and transcriptome analysis or other quantitative tasks. However, this huge capacity can also be harnessed for the resequencing of smaller (gene-sized) target regions, through the simultaneous parallel analysis of multiple subjects, using sample "tagging" or "indexing". These methods promise to have a huge impact on diagnostic mutation analysis and candidate gene testing. Here we describe a software package developed for such studies, offering the ability to resolve pooled samples carrying barcode tags and to align reads to a reference sequence using a mutation-tolerant process. The program, Illuminator, can identify rare sequence variants, including insertions and deletions, and permits interactive data analysis on standard desktop computers. It facilitates the effective analysis of targeted clonal sequencer data without dedicated computational infrastructure or specialized training. Copyright © 2011 Elsevier Inc. All rights reserved.
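
    The barcode-resolution step described above can be sketched as follows. This is a minimal illustration and not Illuminator's actual procedure: it assumes barcodes sit at the start of each read, that all barcodes have the same length, and that one mismatch is tolerated. The function names, sample labels and the mismatch threshold are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Assign reads to samples by their leading barcode.

    reads    -- iterable of read sequences
    barcodes -- dict mapping sample name to barcode (equal lengths assumed)
    Returns a dict of sample -> list of reads with the barcode trimmed off;
    reads matching no barcode, or more than one, go to "unassigned".
    """
    bins = {sample: [] for sample in barcodes}
    bins["unassigned"] = []
    for read in reads:
        hits = [sample for sample, tag in barcodes.items()
                if len(read) >= len(tag)
                and hamming(read[:len(tag)], tag) <= max_mismatches]
        if len(hits) == 1:
            tag = barcodes[hits[0]]
            bins[hits[0]].append(read[len(tag):])
        else:
            bins["unassigned"].append(read)
    return bins


barcodes = {"s1": "ACGT", "s2": "TGCA"}
reads = ["ACGTAAAA", "TGCATTTT", "ACGAAAAA", "GGGGCCCC"]
bins = demultiplex(reads, barcodes)
```

    Sending ambiguous reads to "unassigned" rather than guessing is a conservative design choice; whether it is appropriate depends on how far apart the barcodes are in the actual experiment.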

  4. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
The partner-specific classifier, PS-HomPPI, can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions: Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895

  5. Comparative Analysis of Predicted Plastid-Targeted Proteomes of Sequenced Higher Plant Genomes

    PubMed Central

    Schaeffer, Scott; Harper, Artemus; Raja, Rajani; Jaiswal, Pankaj; Dhingra, Amit

    2014-01-01

    Plastids are actively involved in numerous plant processes critical to growth, development and adaptation. They play a primary role in photosynthesis, pigment and monoterpene synthesis, gravity sensing, starch and fatty acid synthesis, as well as oil and protein storage. We applied two complementary methods to analyze the recently published apple genome (Malus × domestica) to identify putative plastid-targeted proteins, the first using TargetP and the second using a custom workflow utilizing a set of predictive programs. Apple shares roughly 40% of its 10,492 putative plastid-targeted proteins with the Arabidopsis (Arabidopsis thaliana) plastid-targeted proteome as identified by the Chloroplast 2010 project, and ∼57% of its entire proteome with Arabidopsis. This suggests that the plastid-targeted proteomes of apple and Arabidopsis differ, and, interestingly, alludes to the presence of differential targeting of homologs between the two species. Co-expression analysis of 2,224 genes encoding putative plastid-targeted apple proteins suggests that they play a role in plant developmental and intermediary metabolism. Further, an inter-specific comparison of Arabidopsis, Prunus persica (Peach), Malus × domestica (Apple), Populus trichocarpa (Black cottonwood), Fragaria vesca (Woodland Strawberry), Solanum lycopersicum (Tomato) and Vitis vinifera (Grapevine) also identified a large number of novel species-specific plastid-targeted proteins. This analysis also revealed the presence of alternatively targeted homologs across species. Two separate analyses revealed that a small subset of proteins, one representing 289 protein clusters and the other 737 unique protein sequences, are conserved between seven plastid-targeted angiosperm proteomes. The majority of the novel proteins were annotated to play roles in stress response, transport, catabolic processes, and cellular component organization. 
Our results suggest that the current state of knowledge regarding plastid biology, based preferentially on model systems, is deficient. New plant genomes are expected to enable the identification of potentially new plastid-targeted proteins that will aid in studying novel roles of plastids. PMID:25393533

  6. CRISPR interference and priming varies with individual spacer sequences

    PubMed Central

    Xue, Chaoyou; Seetharam, Arun S.; Musharova, Olga; Severinov, Konstantin; Brouns, Stan J. J.; Severin, Andrew J.; Sashital, Dipali G.

    2015-01-01

    CRISPR–Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems allow bacteria to adapt to infection by acquiring ‘spacer’ sequences from invader DNA into genomic CRISPR loci. Cas proteins use RNAs derived from these loci to target cognate sequences for destruction through CRISPR interference. Mutations in the protospacer adjacent motif (PAM) and seed regions block interference but promote rapid ‘primed’ adaptation. Here, we use multiple spacer sequences to reexamine the PAM and seed sequence requirements for interference and priming in the Escherichia coli Type I-E CRISPR–Cas system. Surprisingly, CRISPR interference is far more tolerant of mutations in the seed and the PAM than previously reported, and this mutational tolerance, as well as priming activity, is highly dependent on spacer sequence. We identify a large number of functional PAMs that can promote interference, priming or both activities, depending on the associated spacer sequence. Functional PAMs are preferentially acquired during unprimed ‘naïve’ adaptation, leading to a rapid priming response following infection. Our results provide numerous insights into the importance of both spacer and target sequences for interference and priming, and reveal that priming is a major pathway for adaptation during initial infection. PMID:26586800

  7. ETS target genes: Identification of Egr1 as a target by RNA differential display and whole genome PCR techniques

    PubMed Central

    Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun

    1997-01-01

    ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063

  8. Programmable RNA recognition and cleavage by CRISPR/Cas9.

    PubMed

    O'Connell, Mitchell R; Oakes, Benjamin L; Sternberg, Samuel H; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A

    2014-12-11

    The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA-DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, known as the protospacer adjacent motif (PAM), next to and on the strand opposite the twenty-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in a large range of prokaryotic and eukaryotic cell types, and in whole organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalysed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous messenger RNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable transcript recognition without the need for tags.

  9. Programmable RNA recognition and cleavage by CRISPR/Cas9

    PubMed Central

    O’Connell, Mitchell R.; Oakes, Benjamin L.; Sternberg, Samuel H.; East-Seletsky, Alexandra; Kaplan, Matias; Doudna, Jennifer A.

    2014-01-01

    The CRISPR-associated protein Cas9 is an RNA-guided DNA endonuclease that uses RNA:DNA complementarity to identify target sites for sequence-specific double-stranded DNA (dsDNA) cleavage. In its native context, Cas9 acts on DNA substrates exclusively because both binding and catalysis require recognition of a short DNA sequence, the protospacer adjacent motif (PAM), next to and on the strand opposite the 20-nucleotide target site in dsDNA. Cas9 has proven to be a versatile tool for genome engineering and gene regulation in many cell types and organisms, but it has been thought to be incapable of targeting RNA. Here we show that Cas9 binds with high affinity to single-stranded RNA (ssRNA) targets matching the Cas9-associated guide RNA sequence when the PAM is presented in trans as a separate DNA oligonucleotide. Furthermore, PAM-presenting oligonucleotides (PAMmers) stimulate site-specific endonucleolytic cleavage of ssRNA targets, similar to PAM-mediated stimulation of Cas9-catalyzed DNA cleavage. Using specially designed PAMmers, Cas9 can be specifically directed to bind or cut RNA targets while avoiding corresponding DNA sequences, and we demonstrate that this strategy enables the isolation of a specific endogenous mRNA from cells. These results reveal a fundamental connection between PAM binding and substrate selection by Cas9, and highlight the utility of Cas9 for programmable and tagless transcript recognition. PMID:25274302

  10. Comparative genome analysis identifies novel nucleic acid diagnostic targets for use in the specific detection of Haemophilus influenzae.

    PubMed

    Coughlan, Helena; Reddington, Kate; Tuite, Nina; Boo, Teck Wee; Cormican, Martin; Barrett, Louise; Smith, Terry J; Clancy, Eoin; Barry, Thomas

    2015-10-01

    Haemophilus influenzae is recognised as an important human pathogen associated with invasive infections, including bloodstream infection and meningitis. Currently used molecular-based diagnostic assays lack specificity in correctly detecting and identifying H. influenzae. As such, there is a need to develop novel diagnostic assays for the specific identification of H. influenzae. Whole genome comparative analysis was performed to identify putative diagnostic targets, which are unique in nucleotide sequence to H. influenzae. From this analysis, we identified 2 H. influenzae putative diagnostic targets, phoB and pstA, for use in real-time PCR diagnostic assays. Real-time PCR diagnostic assays using these targets were designed and optimised to specifically detect and identify all 55 H. influenzae strains tested. These novel rapid assays can be applied to the specific detection and identification of H. influenzae for use in epidemiological studies and could also enable improved monitoring of invasive disease caused by these bacteria. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Real-time functional imaging for monitoring miR-133 during myogenic differentiation.

    PubMed

    Kato, Yoshio; Miyaki, Shigeru; Yokoyama, Shigetoshi; Omori, Shin; Inoue, Atsushi; Horiuchi, Machiko; Asahara, Hiroshi

    2009-11-01

    MicroRNAs (miRNAs) are a class of non-coding small RNAs that act as negative regulators of gene expression through sequence-specific interactions with the 3' untranslated regions (UTRs) of target mRNA and play various biological roles. miR-133 was identified as a muscle-specific miRNA that enhanced the proliferation of myoblasts during myogenic differentiation, although its activity in myogenesis has not been fully characterized. Here, we developed a novel retroviral vector system for monitoring muscle-specific miRNA in living cells by using a green fluorescent protein (GFP) that is connected to the target sequence of miR-133 via the UTR and a red fluorescent protein for normalization. We demonstrated that the functional promotion of miR-133 during myogenesis is visualized by the reduction of GFP carrying the miR-133 target sequence, suggesting that miR-133 specifically down-regulates its targets during myogenesis in accordance with its expression. Our cell-based miRNA functional assay monitoring miR-133 activity should be a useful tool in elucidating the role of miRNAs in various biological events.

  12. EphB4 localises to the nucleus of prostate cancer cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mertens-Walker, Inga; Lisle, Jessica E.

    2015-04-10

    The EphB4 receptor tyrosine kinase is over-expressed in a variety of different epithelial cancers including prostate where it has been shown to be involved in survival, migration and angiogenesis. We report here that EphB4 also resides in the nucleus of prostate cancer cell lines. We used in silico methods to identify a bipartite nuclear localisation signal (NLS) in the extracellular domain and a monopartite NLS sequence in the intracellular kinase domain of EphB4. To determine whether both putative NLS sequences were functional, fragments of the EphB4 sequence containing each NLS were cloned to create EphB4NLS-GFP fusion proteins. Localisation of both NLS-GFP proteins to the nuclei of transfected cells was observed, demonstrating that EphB4 contains two functional NLS sequences. Mutation of the key amino residues in both NLS sequences resulted in diminished nuclear accumulation. As nuclear translocation is often dependent on importins we confirmed that EphB4 and importin-α can interact. To assess if nuclear EphB4 could be implicated in gene regulatory functions potential EphB4-binding genomic loci were identified using chromatin immunoprecipitation and Lef1 was confirmed as a potential target of EphB4-mediated gene regulation. These novel findings add further complexity to the biology of this important cancer-associated receptor. Highlights: • The EphB4 protein can be found in the nucleus of prostate cancer cell lines. • EphB4 contains two functional nuclear localisation signals. • Chromatin immunoprecipitation has identified potential genome sequences to which EphB4 binds. • Lef1 is a confirmed target for EphB4-mediated gene regulation.

  13. Targeted exome sequencing identifies novel compound heterozygous mutations in P3H1 in a fetus with osteogenesis imperfecta type VIII.

    PubMed

    Huang, Yanru; Mei, Libin; Lv, Weigang; Li, Haoxian; Zhang, Rui; Pan, Qian; Tan, Hu; Guo, Jing; Luo, Xiaomei; Chen, Chen; Liang, Desheng; Wu, Lingqian

    2017-01-01

    Osteogenesis imperfecta (OI) is a highly clinically and genetically heterogeneous group of disorders. It is difficult to identify severe OI in the perinatal period. Here, a Chinese woman with a suspected history of fetal OI was referred to our institution at 19 weeks of gestation after ultrasound inspection during antenatal screening revealed bulbous metaphyses, short humeri, and short, thick, bent femora in the fetus. Using targeted exome sequencing of 248 genes known to be involved in skeletal system diseases, we identified a novel compound heterozygous mutation in the P3H1 gene in the fetus with OI type VIII: c.105_120del (p.D36Rfs*16) and c.2164C>T (p.Q722*). These two mutations were inherited from the father and mother, respectively. The mRNA level of P3H1 was unchanged, suggesting that mRNA carrying this mutation escaped nonsense-mediated RNA decay. In addition, P3H1 was absent, while CRTAP was mildly decreased. In conclusion, our findings implicate this novel compound heterozygous mutation as the molecular cause of OI type VIII in a Chinese fetus, and demonstrate that targeted next-generation sequencing (NGS) is an accurate, rapid, and cost-effective method for the genetic diagnosis of fetal skeletal dysplasia with genetic and clinical heterogeneity, especially for autosomal recessive skeletal disorders. Copyright © 2016 Elsevier B.V. All rights reserved.

  14. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    PubMed

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms yet their functional conservation as sRNA-binding proteins has not entirely been clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitidis, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  15. Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

    DTIC Science & Technology

    2014-06-01

    Specifically, we combined the CRISPR genome editing system with a novel approach allowing efficient single cell cloning of Drosophila cells with the aim of...and culture these to produce cultures completely lacking wildtype sequence at the target locus. No robust methods existed to clone single Drosophila ...targeting all kinases and phosphatases (563 genes) in the Drosophila genome . 65 samples that displayed synthetic lethality (15 genes) or synthetic

  16. indCAPS: A tool for designing screening primers for CRISPR/Cas9 mutagenesis events.

    PubMed

    Hodgens, Charles; Nimchuk, Zachary L; Kieber, Joseph J

    2017-01-01

    Genetic manipulation of organisms using CRISPR/Cas9 technology generally produces small insertions/deletions (indels) that can be difficult to detect. Here, we describe a technique to easily and rapidly identify such indels. Sequence-identified mutations that alter a restriction enzyme recognition site can be readily distinguished from wild-type alleles using a cleaved amplified polymorphic sequence (CAPS) technique. If a restriction site is created or altered by the mutation such that only one allele contains the restriction site, a polymerase chain reaction (PCR) followed by a restriction digest can be used to distinguish the two alleles. However, in the case of most CRISPR-induced alleles, no such restriction sites are present in the target sequences. In this case, a derived CAPS (dCAPS) approach can be used in which mismatches are purposefully introduced in the oligonucleotide primers to create a restriction site in one, but not both, of the amplified templates. Web-based tools exist to aid dCAPS primer design, but when supplied sequences that include indels, the current tools often fail to suggest appropriate primers. Here, we report the development of a Python-based, species-agnostic web tool, called indCAPS, suitable for the design of PCR primers used in dCAPS assays that is compatible with indels. This tool should have wide utility for screening editing events following CRISPR/Cas9 mutagenesis as well as for identifying specific editing events in a pool of CRISPR-mediated mutagenesis events. This tool was field-tested in a CRISPR mutagenesis experiment targeting a cytokinin receptor (AHK3) in Arabidopsis thaliana. The tool suggested primers that successfully distinguished between wild-type and edited alleles of a target locus and facilitated the isolation of two novel ahk3 null alleles. Users can access indCAPS and design PCR primers to employ dCAPS to identify CRISPR/Cas9 alleles at http://indcaps.kieber.cloudapps.unc.edu/.
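
    The CAPS criterion described above, that a restriction site must be present in only one of the two alleles, can be checked with a few lines of code. The sketch below is not the indCAPS algorithm and does not design dCAPS primers; it only tests whether an existing site already separates two alleles. The function names and the example sequences are hypothetical, and EcoRI (GAATTC) is used purely as a familiar recognition site.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_site(seq, site):
    """True if the recognition site occurs on either strand of seq."""
    seq, site = seq.upper(), site.upper()
    return site in seq or reverse_complement(site) in seq


def caps_distinguishes(wild_type, mutant, site):
    """True when exactly one allele carries the site, so that PCR followed
    by a restriction digest would give different fragments for each allele."""
    return has_site(wild_type, site) != has_site(mutant, site)


ECORI = "GAATTC"
wild_type = "ACGTGAATTCGGA"   # hypothetical amplicon containing the site
mutant = "ACGTGATTCGGA"       # same amplicon with a 1-bp deletion in the site
usable = caps_distinguishes(wild_type, mutant, ECORI)
```

    When this check fails for every available enzyme, the dCAPS approach described in the abstract introduces primer mismatches to create such a site in one allele only.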

  17. Confetti: A Multiprotease Map of the HeLa Proteome for Comprehensive Proteomics

    PubMed Central

    Guo, Xiaofeng; Trudgian, David C.; Lemoff, Andrew; Yadavalli, Sivaramakrishna; Mirzaei, Hamid

    2014-01-01

    Bottom-up proteomics largely relies on tryptic peptides for protein identification and quantification. Tryptic digestion often provides limited coverage of protein sequence because of issues such as peptide length, ionization efficiency, and post-translational modification colocalization. Unfortunately, a region of interest in a protein, for example, because of proximity to an active site or the presence of important post-translational modifications, may not be covered by tryptic peptides. Detection limits, quantification accuracy, and isoform differentiation can also be improved with greater sequence coverage. Selected reaction monitoring (SRM) would also greatly benefit from being able to identify additional targetable sequences. In an attempt to improve protein sequence coverage and to target regions of proteins that do not generate useful tryptic peptides, we deployed a multiprotease strategy on the HeLa proteome. First, we used seven commercially available enzymes in single, double, and triple enzyme combinations. A total of 48 digests were performed. 5223 proteins were detected by analyzing the unfractionated cell lysate digest directly; with 42% mean sequence coverage. Additional strong-anion exchange fractionation of the most complementary digests permitted identification of over 3000 more proteins, with improved mean sequence coverage. We then constructed a web application (https://proteomics.swmed.edu/confetti) that allows the community to examine a target protein or protein isoform in order to discover the enzyme or combination of enzymes that would yield peptides spanning a certain region of interest in the sequence. Finally, we examined the use of nontryptic digests for SRM. From our strong-anion exchange fractionation data, we were able to identify three or more proteotypic SRM candidates within a single digest for 6056 genes. 
Surprisingly, in 25% of these cases the digest producing the most observable proteotypic peptides was neither trypsin nor Lys-C. SRM analysis of Asp-N versus tryptic peptides for eight proteins determined that Asp-N yielded higher signal in five of eight cases. PMID:24696503
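
    Why a second protease can cover regions that trypsin misses can be shown with a small in silico digest. The sketch below is illustrative and is not the Confetti pipeline: it uses the common simplified rules that trypsin cleaves after K or R unless the next residue is P, and that Asp-N cleaves before D; it ignores missed cleavages. The peptide length window, the function names and the toy sequence are assumptions.

```python
def digest(protein, protease):
    """Return peptides from a complete digest under simplified rules.

    trypsin -- cleaves C-terminal to K or R, except when followed by P
    asp-n   -- cleaves N-terminal to D
    """
    if protease not in ("trypsin", "asp-n"):
        raise ValueError(f"unsupported protease: {protease}")
    cuts = [0]
    for i in range(1, len(protein)):
        prev, cur = protein[i - 1], protein[i]
        if protease == "trypsin" and prev in "KR" and cur != "P":
            cuts.append(i)
        elif protease == "asp-n" and cur == "D":
            cuts.append(i)
    cuts.append(len(protein))
    return [protein[start:end] for start, end in zip(cuts, cuts[1:])]


def coverage(protein, protease, min_len=7, max_len=30):
    """Fraction of residues in peptides whose length falls in the window
    assumed to be observable (the window itself is an assumption)."""
    covered = sum(len(p) for p in digest(protein, protease)
                  if min_len <= len(p) <= max_len)
    return covered / len(protein)


toy = "AAKGGGDLLLRPEEDK"  # hypothetical sequence, not a real protein
tryptic = digest(toy, "trypsin")
aspn = digest(toy, "asp-n")
```

    With a deliberately narrow window such as `coverage(toy, "asp-n", min_len=3, max_len=10)`, the Asp-N digest of this toy sequence covers more residues than the tryptic one, which mirrors the abstract's point that the most informative digest is not always trypsin.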

  18. [Screening and identification of a bacterium capable of converting agar to neoagaro oligosaccharides].

    PubMed

    Han, Junping; Huang, Yayan; Ye, Jing; Xiao, Meitian

    2015-09-04

    We aimed to screen and identify a bacterium capable of converting agar to neoagaro oligosaccharides. We sampled Porphyra haitanensis and nearby seawater, and used a medium containing 0.1% agar to enrich the target bacteria. Target isolates were obtained by the dilution-plate method, and their crude enzymes were then obtained by liquid culture. We used the DNS method to determine which isolates could convert agar to neoagaro oligosaccharides. The strain's phylogenetic position was determined by analyzing its 16S rDNA sequence in combination with its morphological, colonial, physiological and biochemical characteristics. We isolated a Gram-negative bacterial strain, HJPHYXJ-1, capable of transforming agar to neoagaro oligosaccharides. A Basic Local Alignment Search Tool (BLAST) search of the HJPHYXJ-1 16S rDNA sequence against GenBank showed 99% similarity between this strain and Vibrio natriegens. In addition, the morphological, physiological and biochemical characteristics of HJPHYXJ-1 were highly similar to those of Vibrio natriegens. We therefore identified HJPHYXJ-1 as Vibrio natriegens. HPLC results suggested that the product of enzymatic degradation was neoagaro oligosaccharides. HJPHYXJ-1, a new isolate of Vibrio natriegens, is capable of converting agar to neoagaro oligosaccharides.

  19. Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

    PubMed

    Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

    2016-02-01

    The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of this species, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× coverage of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening by PCR. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with a next-generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of the small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated in the 8 BACs, which demonstrated the feasibility of the PCR screening approach with three-dimensional pools in the small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for future marker development. The BAC library and clone pools provide valuable resources and tools for genetic breeding and conservation of H. diversicolor.
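
    The coverage figure and the pooling idea can be made concrete with a short sketch. The genome size below is only what the abstract's own numbers imply (it is not stated in the abstract), and the plate layout of 48 plates with 16 rows and 24 columns is an assumption chosen because it gives exactly 18,432 wells; the function names are hypothetical and the super-pool level is not modelled.

```python
from itertools import product


def library_coverage(n_clones, mean_insert_kb, genome_size_kb):
    """Genome equivalents represented by a clone library."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones, mean_insert_kb, coverage):
    """Genome size implied by a reported clone count, insert size and coverage."""
    return n_clones * mean_insert_kb / coverage


# Hypothetical layout: 48 plates x 16 rows x 24 columns = 18,432 wells.
N_PLATES, N_ROWS, N_COLS = 48, 16, 24


def candidate_wells(positive_plates, positive_rows, positive_cols):
    """Wells consistent with the positive plate, row and column pools.

    With one positive pool per dimension the clone is identified uniquely;
    several positives per dimension leave candidates to be re-tested.
    """
    return sorted(product(positive_plates, positive_rows, positive_cols))


size_kb = implied_genome_size_kb(92_610, 120, 7.6)
print(f"implied genome size: {size_kb / 1e6:.2f} Gb")
print(f"pool PCRs per gene: {N_PLATES + N_ROWS + N_COLS} instead of {N_PLATES * N_ROWS * N_COLS}")
```

    Under these assumptions, screening one gene takes one PCR per plate, row and column pool rather than one PCR per clone, which is the practical appeal of three-dimensional pooling.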

  20. Natural product discovery: past, present, and future.

    PubMed

    Katz, Leonard; Baltz, Richard H

    2016-03-01

    Microorganisms have provided abundant sources of natural products which have been developed as commercial products for human medicine, animal health, and plant crop protection. In the early years of natural product discovery from microorganisms (The Golden Age), new antibiotics were found with relative ease from low-throughput fermentation and whole cell screening methods. Later, molecular genetic and medicinal chemistry approaches were applied to modify and improve the activities of important chemical scaffolds, and more sophisticated screening methods were directed at target disease states. In the 1990s, the pharmaceutical industry moved to high-throughput screening of synthetic chemical libraries against many potential therapeutic targets, including new targets identified from the human genome sequencing project, largely to the exclusion of natural products, and discovery rates dropped dramatically. Nonetheless, natural products continued to provide key scaffolds for drug development. In the current millennium, it was discovered from genome sequencing that microbes with large genomes have the capacity to produce about ten times as many secondary metabolites as was previously recognized. Indeed, the most gifted actinomycetes have the capacity to produce around 30-50 secondary metabolites. With the precipitous drop in cost for genome sequencing, it is now feasible to sequence thousands of actinomycete genomes to identify the "biosynthetic dark matter" as sources for the discovery of new and novel secondary metabolites. Advances in bioinformatics, mass spectrometry, proteomics, transcriptomics, metabolomics and gene expression are driving the new field of microbial genome mining for applications in natural product discovery and development.

  1. Defining the wheat gluten peptide fingerprint via a discovery and targeted proteomics approach.

    PubMed

    Martínez-Esteso, María José; Nørgaard, Jørgen; Brohée, Marcel; Haraszi, Reka; Maquet, Alain; O'Connor, Gavin

    2016-09-16

    Accurate, reliable and sensitive detection methods for gluten are required to support current EU regulations. The enforcement of legislative levels requires that measurement results are comparable over time and between methods. This is not a trivial task for gluten which comprises a large number of protein targets. This paper describes a strategy for defining a set of specific analytical targets for wheat gluten. A comprehensive proteomic approach was applied by fractionating wheat gluten using RP-HPLC (reversed phase high performance liquid chromatography) followed by a multi-enzymatic digestion (LysC, trypsin and chymotrypsin) with subsequent mass spectrometric analysis. This approach identified 434 peptide sequences from gluten. Peptides were grouped based on two criteria: unique to a single gluten protein sequence; contained known immunogenic and toxic sequences in the context of coeliac disease. An LC-MS/MS method based on selected reaction monitoring (SRM) was developed on a triple quadrupole mass spectrometer for the specific detection of the target peptides. The SRM based screening approach was applied to gluten containing cereals (wheat, rye, barley and oats) and non-gluten containing flours (corn, soy and rice). A unique set of wheat gluten marker peptides were identified and are proposed as wheat specific markers. The measurement of gluten in processed food products in support of regulatory limits is performed routinely. Mass spectrometry is emerging as a viable alternative to ELISA based methods. Here we outline a set of peptide markers that are representative of gluten and consider the end user's needs in protecting those with coeliac disease. The approach taken has been applied to wheat but can be easily extended to include other species potentially enabling the MS quantification of different gluten containing species from the identified markers. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  2. Characterisation of microRNAs from apple (Malus domestica 'Royal Gala') vascular tissue and phloem sap.

    PubMed

    Varkonyi-Gasic, Erika; Gould, Nick; Sandanayaka, Manoharie; Sutherland, Paul; MacDiarmid, Robin M

    2010-08-04

    Plant microRNAs (miRNAs) are a class of small, non-coding RNAs that play an important role in development and environmental responses. Hundreds of plant miRNAs have been identified to date, mainly from the model species for which there are available genome sequences. The current challenge is to characterise miRNAs from plant species with agricultural and horticultural importance, to aid our understanding of important regulatory mechanisms in crop species and enable improvement of crops and rootstocks. Based on the knowledge that many miRNAs occur in large gene families and are highly conserved among distantly related species, we analysed expression of twenty-one miRNA sequences in different tissues of apple (Malus x domestica 'Royal Gala'). We identified eighteen sequences that are expressed in at least one of the tissues tested. Some, but not all, miRNAs expressed in apple tissues including the phloem tissue were also detected in the phloem sap sample derived from the stylets of woolly apple aphids. Most of the miRNAs detected in apple phloem sap were also abundant in the phloem sap of herbaceous species. Potential targets for apple miRNAs were identified that encode putative proteins shown to be targets of corresponding miRNAs in a number of plant species. Expression patterns of potential targets were analysed and correlated with expression of corresponding miRNAs. This study validated tissue-specific expression of apple miRNAs that target genes responsible for plant growth, development, and stress response. A subset of characterised miRNAs was also present in the apple phloem translocation stream. A comparative analysis of phloem miRNAs in herbaceous species and woody perennials will aid our understanding of non-cell autonomous roles of miRNAs in plants.

  3. Characterisation of microRNAs from apple (Malus domestica 'Royal Gala') vascular tissue and phloem sap

    PubMed Central

    2010-01-01

    Background: Plant microRNAs (miRNAs) are a class of small, non-coding RNAs that play an important role in development and environmental responses. Hundreds of plant miRNAs have been identified to date, mainly from the model species for which there are available genome sequences. The current challenge is to characterise miRNAs from plant species with agricultural and horticultural importance, to aid our understanding of important regulatory mechanisms in crop species and enable improvement of crops and rootstocks. Results: Based on the knowledge that many miRNAs occur in large gene families and are highly conserved among distantly related species, we analysed expression of twenty-one miRNA sequences in different tissues of apple (Malus x domestica 'Royal Gala'). We identified eighteen sequences that are expressed in at least one of the tissues tested. Some, but not all, miRNAs expressed in apple tissues including the phloem tissue were also detected in the phloem sap sample derived from the stylets of woolly apple aphids. Most of the miRNAs detected in apple phloem sap were also abundant in the phloem sap of herbaceous species. Potential targets for apple miRNAs were identified that encode putative proteins shown to be targets of corresponding miRNAs in a number of plant species. Expression patterns of potential targets were analysed and correlated with expression of corresponding miRNAs. Conclusions: This study validated tissue-specific expression of apple miRNAs that target genes responsible for plant growth, development, and stress response. A subset of characterised miRNAs was also present in the apple phloem translocation stream. A comparative analysis of phloem miRNAs in herbaceous species and woody perennials will aid our understanding of non-cell autonomous roles of miRNAs in plants. PMID:20682080

  4. Personalized oncogenomic analysis of metastatic adenoid cystic carcinoma: using whole-genome sequencing to inform clinical decision-making

    PubMed Central

    Chahal, Manik; Pleasance, Erin; Grewal, Jasleen; Zhao, Eric; Ng, Tony; Chapman, Erin; Jones, Martin R.; Shen, Yaoqing; Mungall, Karen L.; Bonakdar, Melika; Taylor, Gregory A.; Ma, Yussanne; Mungall, Andrew J.; Moore, Richard A.; Lim, Howard; Renouf, Daniel; Yip, Stephen; Jones, Steven J.M.; Marra, Marco A.; Laskin, Janessa

    2018-01-01

    Metastatic adenoid cystic carcinomas (ACCs) can cause significant morbidity and mortality. Because of their slow growth and relative rarity, there is limited evidence for systemic therapy regimens. Recently, molecular profiling studies have begun to reveal the genetic landscape of these poorly understood cancers, and new treatment possibilities are beginning to emerge. The objective is to use whole-genome and transcriptome sequencing and analysis to better understand the genetic alterations underlying the pathology of metastatic and rare ACCs and determine potentially actionable therapeutic targets. We report five cases of metastatic ACC, not originating in the salivary glands, in patients enrolled in the Personalized Oncogenomics (POG) Program at the BC Cancer Agency. Genomic workup included whole-genome and transcriptome sequencing, detailed analysis of tumor alterations, and integration with existing knowledge of drug–target combinations to identify potential therapeutic targets. Analysis reveals low mutational burden in these five ACC cases, and mutation signatures that are commonly observed in multiple cancer types. Notably, the only recurrent structural aberration identified was the well-described MYB-NFIB fusion that was present in four of five cases, and one case exhibited a closely related MYBL1-NFIB fusion. Recurrent mutations were also identified in BAP1 and BCOR, with additional mutations in individual samples affecting NOTCH1 and the epigenetic regulators ARID2, SMARCA2, and SMARCB1. Copy changes were rare, and they included amplification of MYC and homozygous loss of CDKN2A in individual samples. Genomic analysis revealed therapeutic targets in all five cases and served to inform a therapeutic choice in three of the cases to date. PMID:29610392

  5. Natural Allelic Variations in Highly Polyploidy Saccharum Complex

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.

    Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the genus Saccharum and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference, respectively, for sugarcane target enrichment sequence analysis. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated to be largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.

  6. Natural Allelic Variations in Highly Polyploidy Saccharum Complex

    DOE PAGES

    Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.; ...

    2016-06-08

    Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the genus Saccharum and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference, respectively, for sugarcane target enrichment sequence analysis. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated to be largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.

  7. Molecular Diagnosis of Usher Syndrome: Application of Two Different Next Generation Sequencing-Based Procedures

    PubMed Central

    Licastro, Danilo; Mutarelli, Margherita; Peluso, Ivana; Neveling, Kornelia; Wieskamp, Nienke; Rispoli, Rossella; Vozzi, Diego; Athanasakis, Emmanouil; D'Eustacchio, Angela; Pizzo, Mariateresa; D'Amico, Francesca; Ziviello, Carmela; Simonelli, Francesca; Fabretto, Antonella; Scheffer, Hans; Gasparini, Paolo; Banfi, Sandro; Nigro, Vincenzo

    2012-01-01

    Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of whom were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our results highlight the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need for high coverage and to the correct interpretation of sequence variations with a non-obvious pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH, which may also be caused by the presence of additional causative genes yet to be identified. PMID:22952768

  8. Methods for determining the genetic affinity of microorganisms and viruses

    NASA Technical Reports Server (NTRS)

    Fox, George E. (Inventor); Willson, III, Richard C. (Inventor); Zhang, Zhengdong (Inventor)

    2012-01-01

    The method selects sub-sequences in a database of nucleic acid sequences, such as 16S rRNA, that are highly characteristic of particular groupings of bacteria, microorganisms, fungi, etc. on a substantially phylogenetic tree. It is also applicable to viruses comprising viral genomic RNA or DNA. A catalogue of highly characteristic sequences identified by this method is assembled to establish the genetic identity of an unknown organism. The characteristic sequences are used to design nucleic acid hybridization probes that include the characteristic sequence or its complement, or are derived from one or more characteristic sequences. A plurality of these characteristic sequences is used in hybridization to determine the phylogenetic tree position of the organism(s) in a sample. Target organisms that are represented in the original sequence database by sufficient characteristic sequences can be identified to the species or subspecies level. Oligonucleotide arrays of many probes are especially preferred. A hybridization signal can comprise fluorescence, chemiluminescence, or isotopic labeling, etc.; or sequences in a sample can be detected by direct means, e.g. mass spectrometry. The method's characteristic sequences can also be used to design specific PCR primers. The method uniquely identifies the phylogenetic affinity of an unknown organism without requiring prior knowledge of what is present in the sample. Even if the organism has not been previously encountered, the method still provides useful information about which phylogenetic tree bifurcation nodes encompass the organism.
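
    One simple way to picture "highly characteristic" sub-sequences is as k-mers found in every member of a grouping and in no sequence outside it. The sketch below uses that strict definition as an assumption; it is not the patented selection procedure, and the function names, the flat group labels (rather than tree nodes) and the toy sequences in the usage note are hypothetical.

```python
def kmers(seq, k):
    """Set of all length-k sub-sequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def characteristic_kmers(sequences, groups, target_group, k):
    """k-mers present in every member of target_group and absent elsewhere.

    sequences: dict of name -> sequence string
    groups:    dict of name -> group label
    """
    inside = [kmers(s, k) for name, s in sequences.items() if groups[name] == target_group]
    outside = [kmers(s, k) for name, s in sequences.items() if groups[name] != target_group]
    if not inside:
        return set()
    shared = set.intersection(*inside)
    excluded = set().union(*outside)
    return shared - excluded


def assign_group(unknown, signatures, k):
    """Assign an unknown sequence to the group whose signature k-mers it hits most."""
    found = kmers(unknown, k)
    scores = {group: len(found & sig) for group, sig in signatures.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

    In use, one would build a signature per grouping, for example signatures = {g: characteristic_kmers(seqs, groups, g, 5) for g in set(groups.values())}, and then call assign_group on the unknown sequence. An unknown that matches no signature returns None rather than a forced assignment.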

  9. HIV-1 RNAs are Not Part of the Argonaute 2 Associated RNA Interference Pathway in Macrophages.

    PubMed

    Vongrad, Valentina; Imig, Jochen; Mohammadi, Pejman; Kishore, Shivendra; Jaskiewicz, Lukasz; Hall, Jonathan; Günthard, Huldrych F; Beerenwinkel, Niko; Metzner, Karin J

    2015-01-01

    MiRNAs and other small noncoding RNAs (sncRNAs) are key players in post-transcriptional gene regulation. HIV-1 derived small noncoding RNAs (sncRNAs) have been described in HIV-1 infected cells, but their biological functions still remain to be elucidated. Here, we approached the question whether viral sncRNAs may play a role in the RNA interference (RNAi) pathway or whether viral mRNAs are targeted by cellular miRNAs in human monocyte derived macrophages (MDM). The incorporation of viral sncRNAs and/or their target RNAs into RNA-induced silencing complex was investigated using photoactivatable ribonucleoside-induced cross-linking and immunoprecipitation (PAR-CLIP) as well as high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation (HITS-CLIP), which capture Argonaute2-bound miRNAs and their target RNAs. HIV-1 infected monocyte-derived macrophages (MDM) were chosen as target cells, as they have previously been shown to express HIV-1 sncRNAs. In addition, we applied small RNA deep sequencing to study differential cellular miRNA expression in HIV-1 infected versus non-infected MDMs. PAR-CLIP and HITS-CLIP data demonstrated the absence of HIV-1 RNAs in Ago2-RISC, although the presence of a multitude of HIV-1 sncRNAs in HIV-1 infected MDMs was confirmed by small RNA sequencing. Small RNA sequencing revealed that 1.4% of all sncRNAs were of HIV-1 origin. However, neither HIV-1 derived sncRNAs nor putative HIV-1 target sequences incorporated into Ago2-RISC were identified suggesting that HIV-1 sncRNAs are not involved in the canonical RNAi pathway nor is HIV-1 targeted by this pathway in HIV-1 infected macrophages.

  10. Toxins of Prokaryotic Toxin-Antitoxin Systems with Sequence-Specific Endoribonuclease Activity

    PubMed Central

    Masuda, Hisako; Inouye, Masayori

    2017-01-01

    Protein translation is the most common target of toxin-antitoxin system (TA) toxins. Sequence-specific endoribonucleases digest RNA in a sequence-specific manner, thereby blocking translation. While past studies mainly focused on the digestion of mRNA, recent analysis revealed that toxins can also digest tRNA, rRNA and tmRNA. Purified toxins can digest single-stranded portions of RNA containing recognition sequences in the absence of ribosomes in vitro. However, increasing evidence suggests that in vivo digestion may occur in association with ribosomes. Despite the prevalence of recognition sequences in many mRNAs, preferential digestion seems to occur at specific positions within mRNA and also in certain reading frames. In this review, a variety of tools utilized to study the nuclease activities of toxins over the past 15 years will be reviewed. A recent adaptation of an RNA-seq-based technique to analyze entire sets of cellular RNA will be introduced with an emphasis on its strength in identifying novel targets and redefining recognition sequences. The differences in biochemical properties and postulated physiological roles will also be discussed. PMID:28420090
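
    The point about recognition sequences and reading frames can be illustrated by locating every occurrence of a motif in an mRNA and recording its frame relative to the start of the coding sequence. This is only a sketch: the function name is hypothetical, the motif is passed in by the caller, and the ACA motif used in the usage note is an example choice (it is the site commonly cited for E. coli MazF, which this abstract does not name).

```python
def motif_sites_by_frame(mrna, motif, cds_start=0):
    """Group motif start positions by reading frame relative to cds_start.

    Overlapping occurrences are reported; positions before cds_start are ignored.
    """
    frames = {0: [], 1: [], 2: []}
    pos = mrna.find(motif, cds_start)
    while pos != -1:
        frames[(pos - cds_start) % 3].append(pos)
        pos = mrna.find(motif, pos + 1)
    return frames
```

    For example, motif_sites_by_frame("AUGACAGACAUUACAUAA", "ACA") separates the sites that begin on a codon boundary from those that do not, which is the kind of tally needed to ask whether cleavage prefers a particular frame.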

  11. Clinical utility of circulating tumor DNA for molecular assessment in pancreatic cancer.

    PubMed

    Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Morizane, Chigusa; Nara, Satoshi; Hama, Natsuko; Suzuki, Masami; Furukawa, Eisaku; Kato, Mamoru; Hayashi, Hideyuki; Kohno, Takashi; Ueno, Hideki; Shimada, Kazuaki; Okusaka, Takuji; Nakagama, Hitoshi; Shibata, Tatsuhiro; Yachida, Shinichi

    2015-12-16

    Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect and monitor molecular characteristics of tumors. In the present study, we determined the mutational status of KRAS in plasma cfDNA using multiplex picoliter-droplet digital PCR in 259 patients with PDAC. We constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA and matched germline DNA samples in 48 patients who had ≥1% mutant allele frequencies of KRAS in plasma cfDNA. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2%) examined by targeted deep sequencing of cfDNA. We also analyzed somatic copy number alterations based on the targeted sequencing data using our in-house algorithm, and potentially targetable amplifications were detected. Assessment of mutations and copy number alterations in plasma cfDNA may provide a prognostic and diagnostic tool to assist decisions regarding optimal therapeutic strategies for PDAC patients.

  12. Rapid molecular diagnostics of severe primary immunodeficiency determined by using targeted next-generation sequencing.

    PubMed

    Yu, Hui; Zhang, Victor Wei; Stray-Pedersen, Asbjørg; Hanson, Imelda Celine; Forbes, Lisa R; de la Morena, M Teresa; Chinn, Ivan K; Gorman, Elizabeth; Mendelsohn, Nancy J; Pozos, Tamara; Wiszniewski, Wojciech; Nicholas, Sarah K; Yates, Anne B; Moore, Lindsey E; Berge, Knut Erik; Sorte, Hanne; Bayer, Diana K; ALZahrani, Daifulah; Geha, Raif S; Feng, Yanming; Wang, Guoli; Orange, Jordan S; Lupski, James R; Wang, Jing; Wong, Lee-Jun

    2016-10-01

    Primary immunodeficiency diseases (PIDDs) are inherited disorders of the immune system. The most severe form, severe combined immunodeficiency (SCID), presents with profound deficiencies of T cells, B cells, or both at birth. If not treated promptly, affected patients usually do not live beyond infancy because of infections. Genetic heterogeneity of SCID frequently delays the diagnosis; a specific diagnosis is crucial for life-saving treatment and optimal management. We developed a next-generation sequencing (NGS)-based multigene-targeted panel for SCID and other severe PIDDs requiring rapid therapeutic actions in a clinical laboratory setting. The target gene capture/NGS assay provides an average read depth of approximately 1000×. The deep coverage facilitates simultaneous detection of single nucleotide variants and exonic copy number variants in one comprehensive assessment. Exons with insufficient coverage (<20× read depth) or high sequence homology (pseudogenes) are complemented by amplicon-based sequencing with specific primers to ensure 100% coverage of all targeted regions. Analysis of 20 patient samples with low T-cell receptor excision circle numbers on newborn screening or a positive family history or clinical suspicion of SCID or other severe PIDD identified deleterious mutations in 14 of them. Identified pathogenic variants included both single nucleotide variants and exonic copy number variants, such as hemizygous nonsense, frameshift, and missense changes in IL2RG; compound heterozygous changes in ATM, RAG1, and CIITA; homozygous changes in DCLRE1C and IL7R; and a heterozygous nonsense mutation in CHD7. High-throughput deep sequencing analysis with complete clinical validation greatly increases the diagnostic yield of severe primary immunodeficiency. Establishing a molecular diagnosis enables early immune reconstitution through prompt therapeutic intervention and guides management for improved long-term quality of life. 
    Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
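
    The rule that exons with insufficient coverage are sent for amplicon-based follow-up can be sketched as a simple filter. The 20× threshold is taken from the abstract; interpreting it as the minimum per-base depth within an exon is an assumption, and the function name and exon identifiers are hypothetical.

```python
MIN_DEPTH = 20  # threshold quoted in the abstract (<20x read depth is insufficient)


def exons_needing_followup(exon_depths, min_depth=MIN_DEPTH):
    """Return exon ids whose lowest per-base depth is below min_depth.

    exon_depths: dict of exon id -> list of per-base read depths.
    Exons with no depth data are flagged as well.
    """
    return [
        exon
        for exon, depths in exon_depths.items()
        if not depths or min(depths) < min_depth
    ]


if __name__ == "__main__":
    depths = {
        "IL2RG_exon1": [950, 1010, 1100],  # hypothetical values
        "RAG1_exon2": [15, 40, 800],
        "ATM_exon5": [20, 25, 30],
    }
    print(exons_needing_followup(depths))
```

    Using the per-exon minimum is the conservative choice: a single poorly covered base is enough to flag the exon, which matches the stated goal of 100% coverage of all targeted regions.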

  13. Revealing the Genomic Landscape of Pediatric T-ALL | Office of Cancer Genomics

    Cancer.gov

    T-lineage acute lymphoblastic leukemia (T-ALL) comprises 15-20% of childhood ALL and has historically been associated with inferior outcome to B-cell ALL (B-ALL). Recent studies have used genome-wide sequencing approaches to identify new subtypes and targets of mutation in B-ALL, but comprehensive sequencing studies of large cohorts of T-ALL have not been performed.

  14. Targeted next generation sequencing of endoscopic ultrasound acquired cytology from ampullary and pancreatic adenocarcinoma has the potential to aid patient stratification for optimal therapy selection

    PubMed Central

    Gleeson, Ferga C.; Kerr, Sarah E.; Kipp, Benjamin R.; Voss, Jesse S.; Minot, Douglas M.; Tu, Zheng Jin; Henry, Michael R.; Graham, Rondell P.; Vasmatzis, George; Cheville, John C.; Lazaridis, Konstantinos N.; Levy, Michael J.

    2016-01-01

    Background & Aims: Less than 10% of registered drug intervention trials for pancreatic ductal adenocarcinoma (PDAC) include a biomarker stratification strategy. The ability to identify distinct mutation subsets via endoscopic ultrasound fine needle aspiration (EUS FNA) molecular cytology could greatly aid clinical trial patient stratification and offer predictive markers. We identified chemotherapy treatment naïve ampullary adenocarcinoma and PDAC patients who underwent EUS FNA to assess multigene mutational frequency and diversity with a surgical resection concordance assessment, where available. Methods: Following strict cytology smear screening criteria, targeted next generation sequencing (NGS) using a 160 cancer gene panel was performed. Results: Complete sequencing was achieved in 29 patients, whereby 83 pathogenic alterations were identified in 21 genes. Cytology genotyping revealed that the majority of mutations were identified in KRAS (93%), TP53 (72%), SMAD4 (31%), and GNAS (10%). There was 100% concordance for the following pathogenic alterations: KRAS, TP53, SMAD4, KMT2D, NOTCH2, MSH2, RB1, SMARCA4, PPP2R1A, PIK3R1, SCL7A8, ATM, and FANCD2. Absolute multigene mutational concordance was 83%. Incremental cytology smear mutations in GRIN2A, GATA3 and KDM6A were identified despite re-examination of raw sequence reads in the corresponding resection specimens. Conclusions: EUS FNA cytology genotyping using a 160 cancer gene NGS panel revealed a broad spectrum of pathogenic alterations. The fidelity of cytology genotyping to that of paired surgical resection specimens suggests that EUS FNA represents a suitable surrogate and may complement the conventional stratification criteria in decision making for therapies and may guide future biomarker driven therapeutic development. PMID:27203738

  15. A distributed computational search strategy for the identification of diagnostics targets: application to finding aptamer targets for methicillin-resistant staphylococci.

    PubMed

    Flanagan, Keith; Cockell, Simon; Harwood, Colin; Hallinan, Jennifer; Nakjang, Sirintra; Lawry, Beth; Wipat, Anil

    2014-06-30

    The rapid and cost-effective identification of bacterial species is crucial, especially for clinical diagnosis and treatment. Peptide aptamers have been shown to be valuable for use as a component of novel, direct detection methods. These small peptides have a number of advantages over antibodies, including greater specificity and longer shelf life. These properties facilitate their use as the detector components of biosensor devices. However, the identification of suitable aptamer targets for particular groups of organisms is challenging. We present a semi-automated processing pipeline for the identification of candidate aptamer targets from whole bacterial genome sequences. The pipeline can be configured to search for protein sequence fragments that uniquely identify a set of strains of interest. The system is also capable of identifying additional organisms that may be of interest due to their possession of protein fragments in common with the initial set. Through the use of Cloud computing technology and distributed databases, our system is capable of scaling with the rapidly growing genome repositories, and consequently of keeping the resulting data sets up-to-date. The system described is also more generically applicable to the discovery of specific targets for other diagnostic approaches such as DNA probes, PCR primers and antibodies.

  16. A distributed computational search strategy for the identification of diagnostics targets: Application to finding aptamer targets for methicillin-resistant staphylococci.

    PubMed

    Flanagan, Keith; Cockell, Simon; Harwood, Colin; Hallinan, Jennifer; Nakjang, Sirintra; Lawry, Beth; Wipat, Anil

    2014-06-01

    The rapid and cost-effective identification of bacterial species is crucial, especially for clinical diagnosis and treatment. Peptide aptamers have been shown to be valuable for use as a component of novel, direct detection methods. These small peptides have a number of advantages over antibodies, including greater specificity and longer shelf life. These properties facilitate their use as the detector components of biosensor devices. However, the identification of suitable aptamer targets for particular groups of organisms is challenging. We present a semi-automated processing pipeline for the identification of candidate aptamer targets from whole bacterial genome sequences. The pipeline can be configured to search for protein sequence fragments that uniquely identify a set of strains of interest. The system is also capable of identifying additional organisms that may be of interest due to their possession of protein fragments in common with the initial set. Through the use of Cloud computing technology and distributed databases, our system is capable of scaling with the rapidly growing genome repositories, and consequently of keeping the resulting data sets up-to-date. The system described is also more generically applicable to the discovery of specific targets for other diagnostic approaches such as DNA probes, PCR primers and antibodies.

  17. Sequence-based design of bioactive small molecules that target precursor microRNAs.

    PubMed

    Velagapudi, Sai Pradeep; Gallo, Steven M; Disney, Matthew D

    2014-04-01

    Oligonucleotides are designed to target RNA using base pairing rules, but they can be hampered by poor cellular delivery and nonspecific stimulation of the immune system. Small molecules are preferred as lead drugs or probes but cannot be designed from sequence. Herein, we describe an approach termed Inforna that designs lead small molecules for RNA from solely sequence. Inforna was applied to all human microRNA hairpin precursors, and it identified bioactive small molecules that inhibit biogenesis by binding nuclease-processing sites (44% hit rate). Among 27 lead interactions, the most avid interaction is between a benzimidazole (1) and precursor microRNA-96. Compound 1 selectively inhibits biogenesis of microRNA-96, upregulating a protein target (FOXO1) and inducing apoptosis in cancer cells. Apoptosis is ablated when FOXO1 mRNA expression is knocked down by an siRNA, validating compound selectivity. Markedly, microRNA profiling shows that 1 only affects microRNA-96 biogenesis and is at least as selective as an oligonucleotide.

  18. Sequence-based design of bioactive small molecules that target precursor microRNAs

    PubMed Central

    Velagapudi, Sai Pradeep; Gallo, Steven M.; Disney, Matthew D.

    2014-01-01

    Oligonucleotides are designed to target RNA using base pairing rules; however, they are hampered by poor cellular delivery and non-specific stimulation of the immune system. Small molecules are preferred as lead drugs or probes, but cannot be designed from sequence. Herein, we describe an approach termed Inforna that designs lead small molecules for RNA from solely sequence. Inforna was applied to all human microRNA precursors and identified bioactive small molecules that inhibit biogenesis by binding to nuclease processing sites (41% hit rate). Amongst 29 lead interactions, the most avid interaction is between a benzimidazole (1) and precursor microRNA-96. Compound 1 selectively inhibits biogenesis of microRNA-96, upregulating a protein target (FOXO1) and inducing apoptosis in cancer cells. Apoptosis is ablated when FOXO1 mRNA expression is knocked down by an siRNA, validating compound selectivity. Importantly, microRNA profiling shows that 1 only significantly affects microRNA-96 biogenesis and is more selective than an oligonucleotide. PMID:24509821

  19. A Report on Molecular Diagnostic Testing for Inherited Retinal Dystrophies by Targeted Genetic Analyses.

    PubMed

    Ramkumar, Hema L; Gudiseva, Harini V; Kishaba, Kameron T; Suk, John J; Verma, Rohan; Tadimeti, Keerti; Thorson, John A; Ayyagari, Radha

    2017-02-01

    To test the utility of targeted sequencing as a method of clinical molecular testing in patients diagnosed with inherited retinal degeneration (IRD). After genetic counseling, peripheral blood was drawn from 188 probands and 36 carriers of IRD. Single gene testing was performed on each patient in a Clinical Laboratory Improvement Amendment (CLIA) certified laboratory. DNA was isolated, and all exons in the gene of interest were analyzed along with 20 base pairs of flanking intronic sequence. Genetic testing was most often performed on ABCA4, CTRP5, ELOV4, BEST1, CRB1, and PRPH2. Pathogenicity of novel sequence changes was predicted by PolyPhen2 and sorting intolerant from tolerant (SIFT). Of the 225 genetic tests performed, 150 were for recessive IRD, and 75 were for dominant IRD. A positive molecular diagnosis was made in 70 (59%) of probands with recessive IRD and 19 (26%) probands with dominant IRD. Analysis confirmed 12 (34%) of individuals as carriers of familial mutations associated with IRD. Thirty-two novel variants were identified; among these, 17 sequence changes in four genes were predicted to be possibly or probably damaging including: ABCA4 (14), BEST1 (2), PRPH2 (1), and TIMP3 (1). Targeted analysis of clinically suspected genes in 225 subjects resulted in a positive molecular diagnosis in 26% of patients with dominant IRD and 59% of patients with recessive IRD. Novel damaging mutations were identified in four genes. Single gene screening is not an ideal method for diagnostic testing given the phenotypic and genetic heterogeneity among IRD cases. High-throughput sequencing of all genes associated with retinal degeneration may be more efficient for molecular diagnosis.

  20. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    PubMed

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.
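
The similarity-based step in this abstract (calling a small RNA "conserved" when it resembles a known miRNA from another plant) can be sketched as follows. This is an illustration under stated assumptions only: it compares equal-length sequences by Hamming distance with a hypothetical mismatch threshold, and the reference name and sequences are invented. The study's actual similarity criteria and the miRDeep-based novel-miRNA step are not reproduced here.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_reads(reads, known_mirnas, max_mismatches=2):
    """Split reads into those resembling a known miRNA and those that do not.

    known_mirnas maps a miRNA name to its mature sequence.
    Returns (conserved, unmatched): a dict read -> name, and a list of reads.
    """
    conserved, unmatched = {}, []
    for read in reads:
        hit = None
        for name, ref in known_mirnas.items():
            if len(ref) == len(read) and hamming(read, ref) <= max_mismatches:
                hit = name
                break
        if hit is None:
            unmatched.append(read)
        else:
            conserved[read] = hit
    return conserved, unmatched


# Toy data (hypothetical sequences, not real miRNAs).
ref = "ACGU" * 5 + "A"        # 21-nt placeholder reference
near = ref[:-2] + "GG"        # differs from ref at the last two positions
far = "G" * 21                # unrelated read
conserved, unmatched = classify_reads([ref, near, far], {"toy-miR-1": ref})
```

Reads left in `unmatched` would be the candidates passed to a dedicated novel-miRNA predictor.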

  1. Genetic analyses of isolated high-grade pancreatic intraepithelial neoplasia (HG-PanIN) reveal paucity of alterations in TP53 and SMAD4.

    PubMed

    Hosoda, Waki; Chianchiano, Peter; Griffin, James F; Pittman, Meredith E; Brosens, Lodewijk Aa; Noë, Michaël; Yu, Jun; Shindo, Koji; Suenaga, Masaya; Rezaee, Neda; Yonescu, Raluca; Ning, Yi; Albores-Saavedra, Jorge; Yoshizawa, Naohiko; Harada, Kenichi; Yoshizawa, Akihiko; Hanada, Keiji; Yonehara, Shuji; Shimizu, Michio; Uehara, Takeshi; Samra, Jaswinder S; Gill, Anthony J; Wolfgang, Christopher L; Goggins, Michael G; Hruban, Ralph H; Wood, Laura D

    2017-05-01

    High-grade pancreatic intraepithelial neoplasia (HG-PanIN) is the major precursor of pancreatic ductal adenocarcinoma (PDAC) and is an ideal target for early detection. To characterize pure HG-PanIN, we analysed 23 isolated HG-PanIN lesions occurring in the absence of PDAC. Whole-exome sequencing of five of these HG-PanIN lesions revealed a median of 33 somatic mutations per lesion, with a total of 318 mutated genes. Targeted next-generation sequencing of 17 HG-PanIN lesions identified KRAS mutations in 94% of the lesions. CDKN2A alterations occurred in six HG-PanIN lesions, and RNF43 alterations in five. Mutations in TP53, GNAS, ARID1A, PIK3CA, and TGFBR2 were limited to one or two HG-PanINs. No non-synonymous mutations in SMAD4 were detected. Immunohistochemistry for p53 and SMAD4 proteins in 18 HG-PanINs confirmed the paucity of alterations in these genes, with aberrant p53 labelling noted only in three lesions, two of which were found to be wild type in sequencing analyses. Sixteen adjacent LG-PanIN lesions from ten patients were also sequenced using targeted sequencing. LG-PanIN harboured KRAS mutations in 94% of the lesions; mutations in CDKN2A, TP53, and SMAD4 were not identified. These results suggest that inactivation of TP53 and SMAD4 are late genetic alterations, predominantly occurring in invasive PDAC. Copyright © 2017 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  2. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons.

    PubMed

    Locati, Mauro D; Pagano, Johanna F B; Ensink, Wim A; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J; Dekker, Rob J; Breit, Timo M

    2017-04-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. © 2017 Locati et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  3. Single-Center Experience with a Targeted Next Generation Sequencing Assay for Assessment of Relevant Somatic Alterations in Solid Tumors.

    PubMed

    Paasinen-Sohns, Aino; Koelzer, Viktor H; Frank, Angela; Schafroth, Julian; Gisler, Aline; Sachs, Melanie; Graber, Anne; Rothschild, Sacha I; Wicki, Andreas; Cathomas, Gieri; Mertz, Kirsten D

    2017-03-01

    Companion diagnostics rely on genomic testing of molecular alterations to enable effective cancer treatment. Here we report the clinical application and validation of the Oncomine Focus Assay (OFA), an integrated, commercially available next-generation sequencing (NGS) assay for the rapid and simultaneous detection of single nucleotide variants, short insertions and deletions, copy number variations, and gene rearrangements in 52 cancer genes with therapeutic relevance. Two independent patient cohorts were investigated to define the workflow, turnaround times, feasibility, and reliability of OFA targeted sequencing in clinical application and using archival material. Cohort I consisted of 59 diagnostic clinical samples from the daily routine submitted for molecular testing over a 4-month time period. Cohort II consisted of 39 archival melanoma samples that were up to 15 years old. Libraries were prepared from isolated nucleic acids and sequenced on the Ion Torrent PGM sequencer. Sequencing datasets were analyzed using the Ion Reporter software. Genomic alterations were identified and validated by orthogonal conventional assays including pyrosequencing and immunohistochemistry. Sequencing results of both cohorts, including archival formalin-fixed, paraffin-embedded material stored up to 15 years, were consistent with published variant frequencies. A concordance of 100% between established assays and OFA targeted NGS was observed. The OFA workflow enabled a turnaround of 3½ days. Taken together, OFA was found to be a convenient tool for fast, reliable, broadly applicable and cost-effective targeted NGS of tumor samples in routine diagnostics. Thus, OFA has strong potential to become an important asset for precision oncology. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

    PubMed

    Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

    2016-03-01

    The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
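
The idea of picking high copy-number sequences as assay targets can be illustrated with a simple k-mer count over sequencing reads. Note that this is a swapped-in simplification: the study used the Galaxy-based RepeatExplorer web server, whose repeat analysis is not reproduced here. The function names, `k`, `min_count`, and the toy reads below are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring made only of A, C, G, T across all reads."""
    counts = Counter()
    valid = set("ACGT")
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= valid:   # skip k-mers containing N or other symbols
                counts[kmer] += 1
    return counts


def high_copy_kmers(reads, k=6, min_count=3):
    """Return (k-mer, count) pairs seen at least min_count times, most frequent first."""
    counts = count_kmers(reads, k)
    return [(kmer, n) for kmer, n in counts.most_common() if n >= min_count]


# Toy reads (hypothetical, for illustration only).
reads = ["ACGTACGTACGT", "TTACGTACGG", "GGGGCCCC"]
top = dict(high_copy_kmers(reads, k=4, min_count=4))
```

In practice, candidate repeats would also need to be checked for species-specificity against related genomes before primer design, as the abstract emphasizes.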

  5. Defining the disulphide stress response in Streptomyces coelicolor A3(2): identification of the sigmaR regulon.

    PubMed

    Paget, M S; Molle, V; Cohen, G; Aharonowitz, Y; Buttner, M J

    2001-11-01

    In the Gram-positive, antibiotic-producing bacterium Streptomyces coelicolor A3(2), the thiol-disulphide status of the hyphae is controlled by a novel regulatory system consisting of a sigma factor, sigmaR, and its cognate anti-sigma factor, RsrA. Oxidative stress induces intramolecular disulphide bond formation in RsrA, which causes it to lose affinity for sigmaR, thereby releasing sigmaR to activate transcription of the thioredoxin operon, trxBA. Here, we exploit a preliminary consensus sequence for sigmaR target promoters to identify 27 new sigmaR target genes and operons, thereby defining the global response to disulphide stress in this organism. Target genes related to thiol metabolism encode a second thioredoxin (TrxC), a glutaredoxin-like protein and enzymes involved in the biosynthesis of the low-molecular-weight thiol-containing compounds cysteine and molybdopterin. In addition, the level of the major actinomycete thiol buffer, mycothiol, was fourfold lower in a sigR null mutant, although no candidate mycothiol biosynthetic genes were identified among the sigmaR targets. Three sigmaR target genes encode ribosome-associated products (ribosomal subunit L31, ppGpp synthetase and tmRNA), suggesting that the translational machinery is modified by disulphide stress. The product of another sigmaR target gene was found to be a novel RNA polymerase-associated protein, RbpA, suggesting that the transcriptional machinery may also be modified in response to disulphide stress. We present DNA sequence evidence that many of the targets identified in S. coelicolor are also under the control of the sigmaR homologue in the actinomycete pathogen Mycobacterium tuberculosis.
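
Scanning a genome with a promoter consensus, as this abstract describes, can be sketched with a regular expression that looks for two conserved boxes separated by a spacer. The abstract does not state the consensus, so the box sequences and spacer length are left as parameters, and the values used in the usage lines are placeholders chosen for illustration, not figures taken from this record.

```python
import re


def find_promoter_candidates(sequence, box35, box10, spacer_min, spacer_max):
    """Return (start, matched_text) for each box35-spacer-box10 site.

    A lookahead is used so that overlapping candidate sites are all reported.
    Only the forward strand is scanned in this sketch.
    """
    pattern = re.compile(
        "(?=(%s[ACGT]{%d,%d}%s))"
        % (re.escape(box35), spacer_min, spacer_max, re.escape(box10))
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Placeholder motif and toy sequence (assumptions, for illustration only).
toy = "TTGGAAT" + "A" * 18 + "GTTCC"
hits = find_promoter_candidates(toy, "GGAAT", "GTT", 18, 18)
```

A real search would also scan the reverse complement and restrict hits to regions upstream of annotated genes before treating them as candidate targets.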

  6. A global comparability approach for biosimilar monoclonal antibodies using LC-tandem MS based proteomics.

    PubMed

    Chen, Shun-Li; Wu, Shiaw-Lin; Huang, Li-Juan; Huang, Jia-Bao; Chen, Shu-Hui

    2013-06-01

    Liquid chromatography-tandem mass spectrometry-based proteomics for peptide mapping and sequencing was used to characterize the marketed monoclonal antibody trastuzumab and compare it with two biosimilar products, mAb A containing D359E and L361M variations at the Fc site and mAb B without variants. Complete sequence coverage (100%) including disulfide linkages, glycosylations and other commonly occurring modifications (i.e., deamidation, oxidation, dehydration and K-clipping) were identified using maps generated from multi-enzyme digestions. In addition to the targeted comparison for the relative populations of targeted modification forms, a non-targeted approach was used to globally compare ion intensities in tryptic maps. The non-targeted comparison provided an extra-dimensional view to examine any possible differences related to variants or modifications. A peptide containing the two variants in mAb A, D359E and L361M, was revealed using the non-targeted comparison of the tryptic maps. In contrast, no significant differences were observed when trastuzumab was self-compared or compared with mAb B. These results were consistent with the data derived from peptide sequencing via collision induced dissociation/electron transfer dissociation. Thus, combined targeted and non-targeted approaches using powerful mass spectrometry-based proteomic tools hold great promise for the structural characterization of biosimilar products. Copyright © 2013 Elsevier B.V. All rights reserved.

  7. Cultivation-dependent and cultivation-independent characterization of hydrocarbon-degrading bacteria in Guaymas Basin sediments.

    PubMed

    Gutierrez, Tony; Biddle, Jennifer F; Teske, Andreas; Aitken, Michael D

    2015-01-01

    Marine hydrocarbon-degrading bacteria perform a fundamental role in the biodegradation of crude oil and its petrochemical derivatives in coastal and open ocean environments. However, there is a paucity of knowledge on the diversity and function of these organisms in deep-sea sediment. Here we used stable-isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate polycyclic aromatic hydrocarbon (PAH)-degrading bacteria under aerobic conditions in sediments from Guaymas Basin with uniformly labeled [13C]-phenanthrene (PHE). The dominant sequences in clone libraries constructed from 13C-enriched bacterial DNA (from PHE enrichments) were identified to belong to the genus Cycloclasticus. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Cycloclasticus to determine their abundance in sediment incubations amended with unlabeled PHE and showed substantial increases in gene abundance during the experiments. We also isolated a strain, BG-2, representing the SIP-identified Cycloclasticus sequence (99.9% 16S rRNA gene sequence identity), and used this strain to provide direct evidence of PHE degradation and mineralization. In addition, we isolated Halomonas, Thalassospira, and Lutibacterium sp. with demonstrable PHE-degrading capacity from Guaymas Basin sediment. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of PAH-degrading bacteria in the deep-sea.

  8. Cultivation-dependent and cultivation-independent characterization of hydrocarbon-degrading bacteria in Guaymas Basin sediments

    PubMed Central

    Gutierrez, Tony; Biddle, Jennifer F.; Teske, Andreas; Aitken, Michael D.

    2015-01-01

    Marine hydrocarbon-degrading bacteria perform a fundamental role in the biodegradation of crude oil and its petrochemical derivatives in coastal and open ocean environments. However, there is a paucity of knowledge on the diversity and function of these organisms in deep-sea sediment. Here we used stable-isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate polycyclic aromatic hydrocarbon (PAH)-degrading bacteria under aerobic conditions in sediments from Guaymas Basin with uniformly labeled [13C]-phenanthrene (PHE). The dominant sequences in clone libraries constructed from 13C-enriched bacterial DNA (from PHE enrichments) were identified to belong to the genus Cycloclasticus. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Cycloclasticus to determine their abundance in sediment incubations amended with unlabeled PHE and showed substantial increases in gene abundance during the experiments. We also isolated a strain, BG-2, representing the SIP-identified Cycloclasticus sequence (99.9% 16S rRNA gene sequence identity), and used this strain to provide direct evidence of PHE degradation and mineralization. In addition, we isolated Halomonas, Thalassospira, and Lutibacterium sp. with demonstrable PHE-degrading capacity from Guaymas Basin sediment. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of PAH-degrading bacteria in the deep-sea. PMID:26217326

  9. Rapid Identification of Chemoresistance Mechanisms Using Yeast DNA Mismatch Repair Mutants

    PubMed Central

    Ojini, Irene; Gammie, Alison

    2015-01-01

    Resistance to cancer therapy is a major obstacle in the long-term treatment of cancer. A greater understanding of drug resistance mechanisms will ultimately lead to the development of effective therapeutic strategies to prevent resistance from occurring. Here, we exploit the mutator phenotype of mismatch repair defective yeast cells combined with whole genome sequencing to identify drug resistance mutations in key pathways involved in the development of chemoresistance. The utility of this approach was demonstrated via the identification of the known CAN1 and TOP1 resistance targets for two compounds, canavanine and camptothecin, respectively. We have also experimentally validated the plasma membrane transporter HNM1 as the primary drug resistance target of mechlorethamine. Furthermore, the sequencing of mitoxantrone-resistant strains identified inactivating mutations within IPT1, a gene encoding inositolphosphotransferase, an enzyme involved in sphingolipid biosynthesis. In the case of bactobolin, a promising anticancer drug, the endocytosis pathway was identified as the drug resistance target responsible for conferring resistance. Finally, we show that rapamycin, an mTOR inhibitor previously shown to alter the fitness of the ipt1 mutant, can effectively prevent the formation of mitoxantrone resistance. The rapid and robust nature of these techniques, using Saccharomyces cerevisiae as a model organism, should accelerate the identification of drug resistance targets and guide the development of novel therapeutic combination strategies to prevent the development of chemoresistance in various cancers. PMID:26199284

  10. Detection of Plasmodium falciparum Infection in Anopheles squamosus (Diptera: Culicidae) in an Area Targeted for Malaria Elimination, Southern Zambia

    PubMed Central

    Stevenson, Jennifer C.; Simubali, Limonty; Mbambara, Saidon; Musonda, Michael; Mweetwa, Sydney; Mudenda, Twig; Pringle, Julia C.; Jones, Christine M.; Norris, Douglas E.

    2016-01-01

    Southern Zambia is the focus of strategies to create malaria-free zones. Interventions being rolled out include test and treat strategies and distribution of insecticide-treated bed nets that target vectors that host-seek indoors and late at night. In Macha, Choma District, collections of mosquitoes were made outdoors using barrier screens within homesteads or UV bulb light traps set next to goats, cattle, or chickens during the rainy season of 2015. Anopheline mosquitoes were identified to species using molecular methods and Plasmodium falciparum infectivity was determined by ELISA and real-time qPCR methods. More than 40% of specimens caught were identified as Anopheles squamosus Theobald, 1901 of which six were found harboring malaria parasites. A single sample, morphologically identified as Anopheles coustani Laveran, 1900, was also found to be infectious. All seven specimens were caught outdoors next to goat pens. Parasite-positive specimens as well as a subset of An. squamosus specimens from either the same study or archive collections from the same area underwent sequencing of the mitochondrial cytochrome oxidase subunit I gene. Maximum parsimony trees constructed from the aligned sequences indicated presence of at least two clades of An. squamosus with infectious specimens falling in each clade. The single infectious specimen identified morphologically as An. coustani could not be matched to reference sequences. This is the first report from Zambia of infections in An. squamosus, a species which is described in literature to display exophagic traits. The bionomic characteristics of this species need to be studied further to fully evaluate the implications for indoor-targeted vector control. PMID:27297214

  11. Whole-exome sequencing identifies recurrent SF3B1 R625 mutation and comutation of NF1 and KIT in mucosal melanoma.

    PubMed

    Hintzsche, Jennifer D; Gorden, Nicholas T; Amato, Carol M; Kim, Jihye; Wuensch, Kelsey E; Robinson, Steven E; Applegate, Allison J; Couts, Kasey L; Medina, Theresa M; Wells, Keith R; Wisell, Joshua A; McCarter, Martin D; Box, Neil F; Shellman, Yiqun G; Gonzalez, Rene C; Lewis, Karl D; Tentler, John J; Tan, Aik Choon; Robinson, William A

    2017-06-01

    Mucosal melanomas are a rare subtype of melanoma, arising in mucosal tissues, which have a very poor prognosis due to the lack of effective targeted therapies. This study aimed to better understand the molecular landscape of these cancers and find potential new therapeutic targets. Whole-exome sequencing was performed on mucosal melanomas from 19 patients and 135 sun-exposed cutaneous melanomas, with matched peripheral blood samples when available. Mutational profiles were compared between mucosal subgroups and sun-exposed cutaneous melanomas. Comparisons of molecular profiles identified 161 genes enriched in mucosal melanoma (P<0.05). KIT and NF1 were frequently comutated (32%) in the mucosal subgroup, with a significantly higher incidence than that in cutaneous melanoma (4%). Recurrent SF3B1 R625H/S/C mutations were identified and validated in 7 of 19 (37%) mucosal melanoma patients. Mutations in the spliceosome pathway were found to be enriched in mucosal melanomas when compared with cutaneous melanomas. Alternative splicing in four genes was observed in SF3B1-mutant samples compared with the wild-type samples. This study identified potential new therapeutic targets for mucosal melanoma, including comutation of NF1 and KIT, and recurrent R625 mutations in SF3B1. This is the first report of SF3B1 R625 mutations in vulvovaginal mucosal melanoma, with the largest whole-exome sequencing project of mucosal melanomas to date. The results here also indicated that the mutations in SF3B1 lead to alternative splicing in multiple genes. These findings expand our knowledge of this rare disease.

  12. Whole-exome sequencing identifies recurrent SF3B1 R625 mutation and comutation of NF1 and KIT in mucosal melanoma

    PubMed Central

    Hintzsche, Jennifer D.; Gorden, Nicholas T.; Amato, Carol M.; Kim, Jihye; Wuensch, Kelsey E.; Robinson, Steven E.; Applegate, Allison J.; Couts, Kasey L.; Medina, Theresa M.; Wells, Keith R.; Wisell, Joshua A.; McCarter, Martin D.; Box, Neil F.; Shellman, Yiqun G.; Gonzalez, Rene C.; Lewis, Karl D.; Tentler, John J.

    2017-01-01

    Mucosal melanomas are a rare subtype of melanoma, arising in mucosal tissues, which have a very poor prognosis due to the lack of effective targeted therapies. This study aimed to better understand the molecular landscape of these cancers and find potential new therapeutic targets. Whole-exome sequencing was performed on mucosal melanomas from 19 patients and 135 sun-exposed cutaneous melanomas, with matched peripheral blood samples when available. Mutational profiles were compared between mucosal subgroups and sun-exposed cutaneous melanomas. Comparisons of molecular profiles identified 161 genes enriched in mucosal melanoma (P<0.05). KIT and NF1 were frequently comutated (32%) in the mucosal subgroup, with a significantly higher incidence than that in cutaneous melanoma (4%). Recurrent SF3B1 R625H/S/C mutations were identified and validated in 7 of 19 (37%) mucosal melanoma patients. Mutations in the spliceosome pathway were found to be enriched in mucosal melanomas when compared with cutaneous melanomas. Alternative splicing in four genes was observed in SF3B1-mutant samples compared with the wild-type samples. This study identified potential new therapeutic targets for mucosal melanoma, including comutation of NF1 and KIT, and recurrent R625 mutations in SF3B1. This is the first report of SF3B1 R625 mutations in vulvovaginal mucosal melanoma, with the largest whole-exome sequencing project of mucosal melanomas to date. The results here also indicated that the mutations in SF3B1 lead to alternative splicing in multiple genes. These findings expand our knowledge of this rare disease. PMID:28296713

  13. Secondary structure prediction and structure-specific sequence analysis of single-stranded DNA.

    PubMed

    Dong, F; Allawi, H T; Anderson, T; Neri, B P; Lyamichev, V I

    2001-08-01

    DNA sequence analysis by oligonucleotide binding is often affected by interference with the secondary structure of the target DNA. Here we describe an approach that improves DNA secondary structure prediction by combining enzymatic probing of DNA by structure-specific 5'-nucleases with an energy minimization algorithm that utilizes the 5'-nuclease cleavage sites as constraints. The method can identify structural differences between two DNA molecules caused by minor sequence variations such as a single nucleotide mutation. It also demonstrates the existence of long-range interactions between DNA regions separated by >300 nt and the formation of multiple alternative structures by a 244 nt DNA molecule. The differences in the secondary structure of DNA molecules revealed by 5'-nuclease probing were used to design structure-specific probes for mutation discrimination that target the regions of structural, rather than sequence, differences. We also demonstrate the performance of structure-specific 'bridge' probes complementary to non-contiguous regions of the target molecule. The structure-specific probes do not require the high stringency binding conditions necessary for methods based on mismatch formation and permit mutation detection at temperatures from 4 to 37 degrees C. Structure-specific sequence analysis is applied for mutation detection in the Mycobacterium tuberculosis katG gene and for genotyping of the hepatitis C virus.

  14. Mitochondrial targeting sequence variants of the CHCHD2 gene are a risk for Lewy body disorders

    PubMed Central

    Ogaki, Kotaro; Koga, Shunsuke; Heckman, Michael G.; Fiesel, Fabienne C.; Ando, Maya; Labbé, Catherine; Lorenzo-Betancor, Oswaldo; Moussaud-Lamodière, Elisabeth L.; Soto-Ortolaza, Alexandra I.; Walton, Ronald L.; Strongosky, Audrey J.; Uitti, Ryan J.; McCarthy, Allan; Lynch, Timothy; Siuda, Joanna; Opala, Grzegorz; Rudzinska, Monika; Krygowska-Wajs, Anna; Barcikowska, Maria; Czyzewski, Krzysztof; Puschmann, Andreas; Nishioka, Kenya; Funayama, Manabu; Hattori, Nobutaka; Parisi, Joseph E.; Petersen, Ronald C.; Graff-Radford, Neill R.; Boeve, Bradley F.; Springer, Wolfdieter; Wszolek, Zbigniew K.; Dickson, Dennis W.

    2015-01-01

    Objective: To assess the role of CHCHD2 variants in patients with Parkinson disease (PD) and Lewy body disease (LBD) in Caucasian populations. Methods: All exons of the CHCHD2 gene were sequenced in a US Caucasian patient-control series (878 PD, 610 LBD, and 717 controls). Subsequently, exons 1 and 2 were sequenced in an Irish series (355 PD and 365 controls) and a Polish series (394 PD and 350 controls). Immunohistochemistry and immunofluorescence studies were performed on pathologic LBD cases with rare CHCHD2 variants. Results: We identified 9 rare exonic variants of unknown significance. These variants were more frequent in the combined group of PD and LBD patients compared to controls (0.6% vs 0.1%, p = 0.013). In addition, the presence of any rare variant was more common in patients with LBD (2.5% vs 1.0%, p = 0.050) compared to controls. Eight of these 9 variants were located within the gene's mitochondrial targeting sequence. Conclusions: Although the role of variants of the CHCHD2 gene in PD and LBD remains to be further elucidated, the rare variants in the mitochondrial targeting sequence may be a risk factor for Lewy body disorders, which may link CHCHD2 to other genetic forms of parkinsonism with mitochondrial dysfunction. PMID:26561290

  15. Code-modulated visual evoked potentials using fast stimulus presentation and spatiotemporal beamformer decoding.

    PubMed

    Wittevrongel, Benjamin; Van Wolputte, Elia; Van Hulle, Marc M

    2017-11-08

    When encoding visual targets using various lagged versions of a pseudorandom binary sequence of luminance changes, the EEG signal recorded over the viewer's occipital pole exhibits so-called code-modulated visual evoked potentials (cVEPs), the phase lags of which can be tied to these targets. The cVEP paradigm has enjoyed interest in the brain-computer interfacing (BCI) community for the reported high information transfer rates (ITR, in bits/min). In this study, we introduce a novel decoding algorithm based on spatiotemporal beamforming, and show that this algorithm is able to accurately identify the gazed target. Especially for a small number of repetitions of the coding sequence, our beamforming approach significantly outperforms an optimised support vector machine (SVM)-based classifier, which is considered state-of-the-art in cVEP-based BCI. In addition to the traditional 60 Hz stimulus presentation rate for the coding sequence, we also explore the 120 Hz rate, and show that the latter enables faster communication, with a maximal median ITR of 172.87 bits/min. Finally, we also report on a transition effect in the EEG signal following the onset of the stimulus sequence, and recommend to exclude the first 150 ms of the trials from decoding when relying on a single presentation of the stimulus sequence.
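
    The abstract above reports information transfer rates in bits/min without stating the formula. A minimal sketch, assuming the widely used Wolpaw definition (an assumption, not confirmed by this record); the function name and the example target count, accuracy, and selection time are hypothetical:

```python
import math


def wolpaw_itr(n_targets: int, accuracy: float, seconds_per_selection: float) -> float:
    """Information transfer rate in bits/min under the Wolpaw definition.

    bits per selection = log2(N) + P*log2(P) + (1-P)*log2((1-P)/(N-1))
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if seconds_per_selection <= 0:
        raise ValueError("selection time must be positive")

    bits = math.log2(n_targets)
    # Each term vanishes in the limit where its probability is zero,
    # so it is skipped to avoid evaluating log2(0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits * 60.0 / seconds_per_selection


# Hypothetical example: 32 targets, 95% accuracy, 1.5 s per selection.
print(f"{wolpaw_itr(32, 0.95, 1.5):.1f} bits/min")
```

    Under this definition, shortening the time per selection raises the ITR only as long as accuracy holds up, which is the trade-off the abstract describes for few repetitions of the coding sequence.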

  16. Multiple Nucleosome Positioning Sites Regulate the CTCF-Mediated Insulator Function of the H19 Imprinting Control Region

    PubMed Central

    Kanduri, Meena; Kanduri, Chandrasekhar; Mariano, Piero; Vostrov, Alexander A.; Quitschke, Wolfgang; Lobanenkov, Victor; Ohlsson, Rolf

    2002-01-01

    The 5′ region of the H19 gene harbors a methylation-sensitive chromatin insulator within an imprinting control region (ICR). Insertional mutagenesis in combination with episomal assays identified nucleosome positioning sequences (NPSs) that set the stage for the remarkably precise distribution of the four target sites for the chromatin insulator protein CTCF to nucleosome linker sequences in the H19 ICR. Changing positions of the NPSs resulted in loss of both CTCF target site occupancy and insulator function, suggesting that the NPSs optimize the fidelity of the insulator function. We propose that the NPSs ensure the fidelity of the repressed status of the maternal Igf2 allele during development by constitutively maintaining availability of the CTCF target sites. PMID:11971967

  17. The Pediatric Cancer Genome Project

    PubMed Central

    Downing, James R; Wilson, Richard K; Zhang, Jinghui; Mardis, Elaine R; Pui, Ching-Hon; Ding, Li; Ley, Timothy J; Evans, William E

    2013-01-01

    The St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (PCGP) is participating in the international effort to identify somatic mutations that drive cancer. These cancer genome sequencing efforts will not only yield an unparalleled view of the altered signaling pathways in cancer but should also identify new targets against which novel therapeutics can be developed. Although these projects are still deep in the phase of generating primary DNA sequence data, important results are emerging and valuable community resources are being generated that should catalyze future cancer research. We describe here the rationale for conducting the PCGP, present some of the early results of this project and discuss the major lessons learned and how these will affect the application of genomic sequencing in the clinic. PMID:22641210

  18. Personalized Oncology Through Integrative High-Throughput Sequencing: A Pilot Study

    PubMed Central

    Roychowdhury, Sameek; Iyer, Matthew K.; Robinson, Dan R.; Lonigro, Robert J.; Wu, Yi-Mi; Cao, Xuhong; Kalyana-Sundaram, Shanker; Sam, Lee; Balbin, O. Alejandro; Quist, Michael J.; Barrette, Terrence; Everett, Jessica; Siddiqui, Javed; Kunju, Lakshmi P.; Navone, Nora; Araujo, John C.; Troncoso, Patricia; Logothetis, Christopher J.; Innis, Jeffrey W.; Smith, David C.; Lao, Christopher D.; Kim, Scott Y.; Roberts, J. Scott; Gruber, Stephen B.; Pienta, Kenneth J.; Talpaz, Moshe; Chinnaiyan, Arul M.

    2012-01-01

    Individual cancers harbor a set of genetic aberrations that can be informative for identifying rational therapies currently available or in clinical trials. We implemented a pilot study to explore the practical challenges of applying high-throughput sequencing in clinical oncology. We enrolled patients with advanced or refractory cancer who were eligible for clinical trials. For each patient, we performed whole-genome sequencing of the tumor, targeted whole-exome sequencing of tumor and normal DNA, and transcriptome sequencing (RNA-Seq) of the tumor to identify potentially informative mutations in a clinically relevant time frame of 3 to 4 weeks. With this approach, we detected several classes of cancer mutations including structural rearrangements, copy number alterations, point mutations, and gene expression alterations. A multidisciplinary Sequencing Tumor Board (STB) deliberated on the clinical interpretation of the sequencing results obtained. We tested our sequencing strategy on human prostate cancer xenografts. Next, we enrolled two patients into the clinical protocol and were able to review the results at our STB within 24 days of biopsy. The first patient had metastatic colorectal cancer in which we identified somatic point mutations in NRAS, TP53, AURKA, FAS, and MYH11, plus amplification and overexpression of cyclin-dependent kinase 8 (CDK8). The second patient had malignant melanoma, in which we identified a somatic point mutation in HRAS and a structural rearrangement affecting CDKN2C. The STB identified the CDK8 amplification and Ras mutation as providing a rationale for clinical trials with CDK inhibitors or MEK (mitogen-activated or extracellular signal–regulated protein kinase kinase) and PI3K (phosphatidylinositol 3-kinase) inhibitors, respectively. Integrative high-throughput sequencing of patients with advanced cancer generates a comprehensive, individual mutational landscape to facilitate biomarker-driven clinical trials in oncology. PMID:22133722

  19. Soft computing model for optimized siRNA design by identifying off target possibilities using artificial neural network model.

    PubMed

    Murali, Reena; John, Philips George; Peter S, David

    2015-05-15

    The ability of small interfering RNA (siRNA) to perform posttranscriptional gene regulation by knocking down targeted genes is an important research topic in functional genomics, biomedical research and cancer therapeutics. Many tools have been developed to design exogenous siRNA with high experimental inhibition. Even though a considerable amount of work has been done in designing exogenous siRNA, the design of effective siRNA sequences is still challenging because the target mRNAs must be selected such that their corresponding siRNAs are likely to be efficient against that target and unlikely to accidentally silence other transcripts due to sequence similarity. In some cases, siRNAs may tolerate mismatches with the target mRNA, but knockdown of genes other than the intended target could have serious consequences. Hence, to design siRNAs, two important concepts must be considered: the ability to knock down target genes and the off-target possibility on any nontarget genes. So before doing gene silencing by siRNAs, it is essential to analyze their off-target effects in addition to their inhibition efficacy against a particular target. Only a few methods have been developed that consider both the efficacy and the off-target possibility of siRNA against a gene. In this paper we present a new design of a neural network model with whole stacking energy (ΔG) that makes it possible to identify the efficacy and off-target effect of siRNAs against target genes. The tool lists all siRNAs against a particular target with their inhibition efficacy and number of matches or sequence similarity with other genes in the database. We could achieve an excellent performance of Pearson Correlation Coefficient (R=0.74) and Area Under Curve (AUC=0.906) when the threshold of whole stacking energy is ≥-34.6 kcal/mol. To the best of the authors' knowledge, this is one of the best scores when considering the "combined efficacy and off target possibility" of siRNA for silencing a gene. The proposed model shall be useful for designing exogenous siRNA for therapeutic applications and gene silencing techniques in the area of bioinformatics. The software is developed as a desktop application and available at http://opsid.in/opsid/. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Validating regulatory predictions from diverse bacteria with mutant fitness data

    DOE PAGES

    Sagawa, Shiori; Price, Morgan N.; Deutschbauer, Adam M.; ...

    2017-05-24

    Although transcriptional regulation is fundamental to understanding bacterial physiology, the targets of most bacterial transcription factors are not known. Comparative genomics has been used to identify likely targets of some of these transcription factors, but these predictions typically lack experimental support. Here, we used mutant fitness data, which measures the importance of each gene for a bacterium's growth across many conditions, to test regulatory predictions from RegPrecise, a curated collection of comparative genomics predictions. Because characterized transcription factors often have correlated fitness with one of their targets (either positively or negatively), correlated fitness patterns provide support for the comparative genomics predictions. At a false discovery rate of 3%, we identified significant cofitness for at least one target of 158 TFs in 107 ortholog groups and from 24 bacteria. Thus, high-throughput genetics can be used to identify a high-confidence subset of the sequence-based regulatory predictions.
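
    The cofitness idea in this abstract can be illustrated with a plain Pearson correlation between fitness profiles. This is a minimal sketch, not the authors' pipeline: the gene names, fitness values, and the fixed |r| cutoff are all hypothetical (the abstract describes a false discovery rate criterion, not a fixed cutoff).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def cofit_targets(tf_profile, target_profiles, min_abs_r=0.8):
    """Return predicted targets whose fitness correlates with the TF.

    Both positive and negative correlations count, so |r| is compared
    against the (hypothetical) cutoff min_abs_r.
    """
    supported = {}
    for name, profile in target_profiles.items():
        r = pearson(tf_profile, profile)
        if abs(r) >= min_abs_r:
            supported[name] = r
    return supported


# Hypothetical fitness values across five growth conditions.
tf = [-2.0, -1.0, 0.0, 1.0, 2.0]
targets = {
    "geneA": [-4.0, -2.0, 0.0, 2.0, 4.0],   # tracks the TF
    "geneB": [2.0, 1.0, 0.0, -1.0, -2.0],   # mirrors the TF
    "geneC": [1.0, -1.0, 0.0, -1.0, 1.0],   # unrelated pattern
}
supported = cofit_targets(tf, targets)
print(sorted(supported))
```

    Taking the absolute value reflects the abstract's point that a transcription factor may correlate with its target either positively or negatively.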

  1. Validating regulatory predictions from diverse bacteria with mutant fitness data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sagawa, Shiori; Price, Morgan N.; Deutschbauer, Adam M.

    Although transcriptional regulation is fundamental to understanding bacterial physiology, the targets of most bacterial transcription factors are not known. Comparative genomics has been used to identify likely targets of some of these transcription factors, but these predictions typically lack experimental support. Here, we used mutant fitness data, which measures the importance of each gene for a bacterium's growth across many conditions, to test regulatory predictions from RegPrecise, a curated collection of comparative genomics predictions. Because characterized transcription factors often have correlated fitness with one of their targets (either positively or negatively), correlated fitness patterns provide support for the comparative genomics predictions. At a false discovery rate of 3%, we identified significant cofitness for at least one target of 158 TFs in 107 ortholog groups and from 24 bacteria. Thus, high-throughput genetics can be used to identify a high-confidence subset of the sequence-based regulatory predictions.

  2. Molecular characterization of oral squamous cell carcinoma using targeted next-generation sequencing.

    PubMed

    Er, Tze-Kiong; Wang, Yen-Yun; Chen, Chih-Chieh; Herreros-Villanueva, Marta; Liu, Ta-Chih; Yuan, Shyng-Shiou F

    2015-10-01

    Many genetic factors play an important role in the development of oral squamous cell carcinoma. The aim of this study was to assess the mutational profile in oral squamous cell carcinoma using formalin-fixed, paraffin-embedded tumors from a Taiwanese population by performing targeted sequencing of 26 cancer-associated genes that are frequently mutated in solid tumors. Next-generation sequencing was performed in 50 formalin-fixed, paraffin-embedded tumor specimens obtained from patients with oral squamous cell carcinoma. Genetic alterations in the 26 cancer-associated genes were detected using a deep sequencing (>1000X) approach. TP53, PIK3CA, MET, APC, CDH1, and FBXW7 were the most frequently mutated genes. Most remarkably, TP53 mutations and PIK3CA mutations, which accounted for 68% and 18% of tumors, respectively, were more prevalent in a Taiwanese population. Mutations in other genes, including MET (4%), APC (4%), CDH1 (2%), and FBXW7 (2%), were also identified in our population. In summary, our study shows the feasibility of performing targeted sequencing using formalin-fixed, paraffin-embedded samples. Additionally, this study also reports the mutational landscape of oral squamous cell carcinoma in the Taiwanese population. We believe that this study will shed new light on fundamental aspects in understanding the molecular pathogenesis of oral squamous cell carcinoma and may aid in the development of new targeted therapies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  3. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    PubMed Central

    2009-01-01

    Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
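
    The record does not describe how the in silico depth simulation was done. As a rough illustration of why sequencing depth matters, the sketch below uses the idealized Lander-Waterman model, a swapped-in textbook approximation rather than the authors' method. It assumes uniformly random reads and ignores repeats, which the abstract says make up about 40% of the sequence, so real assemblies will fare worse. Function names are hypothetical; the scaffold lengths and depths are taken from the abstract.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read: 1 - e^(-c)."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


def expected_uncovered_bases(length_bp: int, coverage: float) -> float:
    """Expected number of bases missed entirely at fold-coverage c."""
    return length_bp * (1.0 - expected_fraction_covered(coverage))


# Scaffold lengths and depths as reported in the abstract.
for name, length_bp, depth in [("3CAR", 172_000, 15.6), ("CYP720B4", 94_000, 16.0)]:
    missed = expected_uncovered_bases(length_bp, depth)
    print(f"{name}: {missed:.3f} expected uncovered bases at {depth}x")

# Lower depths leave many more bases uncovered under the same model.
for depth in (1, 2, 4, 8):
    print(depth, round(expected_uncovered_bases(172_000, depth)))
```

    Under these idealized assumptions the reported depths leave essentially no base unsampled, so remaining assembly difficulty would come from repeats rather than from missing coverage.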

  4. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

    PubMed

    Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

    2009-08-06

    Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.

  5. NIH announces the launch of 3 integrated precision medicine trials: ALCHEMIST

    Cancer.gov

    The Adjuvant Lung Cancer Enrichment Marker Identification and Sequencing Trials, or ALCHEMIST, will identify early-stage lung cancer patients with tumors that harbor certain uncommon genetic changes and evaluate whether drug treatments targeted against

  6. Long Term Follow up of the Delayed Effects of Acute Radiation Exposure in Primates

    DTIC Science & Technology

    2017-10-01

    We will then use shRNAs and/or CRISPR constructs targeting the gene of interest to knock down its expression in stem cells prior to...DLBCLs Mutational profiling identifies 150 driver genes Gene expression identifies subgroups including cell of origin Unbiased CRISPR screen...Exome sequencing in 1,001 DLBCL patients comprehensively identifies 150 driver genes Unbiased CRISPR screen in DLBCL cell lines identifies essential

  7. Identification of microRNAs involved in lipid biosynthesis and seed size in developing sea buckthorn seeds using high-throughput sequencing.

    PubMed

    Ding, Jian; Ruan, Chengjiang; Guan, Ying; Krishna, Priti

    2018-03-05

    Sea buckthorn is a plant of medicinal and nutritional importance owing in part to the high levels of essential fatty acids, linoleic (up to 42%) and α-linolenic (up to 39%) acids in the seed oil. Sea buckthorn can produce seeds either via the sexual pathway or by apomixis. The seed development and maturation programs are critically dependent on miRNAs. To understand miRNA-mediated regulation of sea buckthorn seed development, eight small RNA libraries were constructed for deep sequencing from developing seeds of a low oil content line 'SJ1' and a high oil content line 'XE3'. High-throughput sequencing identified 137 known miRNAs from 27 families and 264 novel miRNAs. The potential targets of the identified miRNAs were predicted based on sequence homology. Nineteen (four known and 15 novel) and 22 (six known and 16 novel) miRNAs were found to be involved in lipid biosynthesis and seed size, respectively. An integrated analysis of mRNA and miRNA transcriptome and qRT-PCR identified some key miRNAs and their targets (miR164d-ARF2, miR168b-Δ9D, novelmiRNA-108-ACC, novelmiRNA-23-GPD1, novelmiRNA-58-DGAT1, and novelmiRNA-191-DGAT2) potentially involved in seed size and lipid biosynthesis of sea buckthorn seed. These results indicate the potential importance of miRNAs in regulating lipid biosynthesis and seed size in sea buckthorn.

  8. Targeting of Repeated Sequences Unique to a Gene Results in Significant Increases in Antisense Oligonucleotide Potency

    PubMed Central

    Vickers, Timothy A.; Freier, Susan M.; Bui, Huynh-Hoa; Watt, Andrew; Crooke, Stanley T.

    2014-01-01

    A new strategy for identifying potent RNase H-dependent antisense oligonucleotides (ASOs) is presented. Our analysis of the human transcriptome revealed that a significant proportion of genes contain unique repeated sequences of 16 or more nucleotides in length. Activities of ASOs targeting these repeated sites in several representative genes were compared to those of ASOs targeting unique single sites in the same transcript. Antisense activity at repeated sites was also evaluated in a highly controlled minigene system. Targeting both native and minigene repeat sites resulted in significant increases in potency as compared to targeting of non-repeated sites. The increased potency at these sites is a result of increased frequency of ASO/RNA interactions which, in turn, increases the probability of a productive interaction between the ASO/RNA heteroduplex and human RNase H1 in the cell. These results suggest a new, highly efficient strategy for rapid identification of highly potent ASOs. PMID:25334092
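
    The search for repeated sites unique to one gene can be sketched as a k-mer count. This is a minimal illustration under stated assumptions, not the authors' transcriptome analysis: the sequences and gene names are toy data, the function names are hypothetical, and only exact matches of length k = 16 are considered.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every (overlapping) k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def unique_repeated_sites(target_id: str, transcripts: dict, k: int = 16) -> dict:
    """k-mers occurring at least twice in the target and nowhere else."""
    target = kmer_counts(transcripts[target_id], k)
    elsewhere = set()
    for tid, seq in transcripts.items():
        if tid != target_id:
            elsewhere.update(kmer_counts(seq, k))
    return {kmer: n for kmer, n in target.items()
            if n >= 2 and kmer not in elsewhere}


# Toy transcripts (hypothetical sequences, written as DNA).
REPEAT_A = "ACGTACGGATCCTAGA"   # repeated only inside GENE_A
REPEAT_B = "GGGTTTAAACCCGTGT"   # repeated in GENE_B but also present in GENE_C
transcripts = {
    "GENE_A": "TT" + REPEAT_A + "GGCC" + REPEAT_A + "AA",
    "GENE_B": "CA" + REPEAT_B + "TTGA" + REPEAT_B + "GC",
    "GENE_C": "AGAG" + REPEAT_B + "CTCT",
}
sites_a = unique_repeated_sites("GENE_A", transcripts)
sites_b = unique_repeated_sites("GENE_B", transcripts)
print(sites_a.get(REPEAT_A), REPEAT_B in sites_b)
```

    A site with more copies in the transcript offers more chances for an ASO to bind, which is the mechanism the abstract proposes for the increased potency; the uniqueness filter keeps the site specific to the intended gene.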

  9. Clinical applicability and cost of a 46-gene panel for genomic analysis of solid tumours: Retrospective validation and prospective audit in the UK National Health Service

    PubMed Central

    Kaur, Kulvinder; Camps, Carme; Kaisaki, Pamela; Gupta, Avinash; Talbot, Denis; Middleton, Mark; Henderson, Shirley; Cutts, Anthony; Vavoulis, Dimitrios V.; Housby, Nick; Taylor, Jenny C.; Schuh, Anna

    2017-01-01

    Background: Single gene tests to predict whether cancers respond to specific targeted therapies are performed increasingly often. Advances in sequencing technology, collectively referred to as next generation sequencing (NGS), mean the entire cancer genome or parts of it can now be sequenced at speed with increased depth and sensitivity. However, translation of NGS into routine cancer care has been slow. Healthcare stakeholders are unclear about the clinical utility of NGS and are concerned it could be an expensive addition to cancer diagnostics, rather than an affordable alternative to single gene testing. Methods and findings: We validated a 46-gene hotspot cancer panel assay allowing multiple gene testing from small diagnostic biopsies. From 1 January 2013 to 31 December 2013, solid tumour samples (including non-small-cell lung carcinoma [NSCLC], colorectal carcinoma, and melanoma) were sequenced in the context of the UK National Health Service from 351 consecutively submitted prospective cases for which treating clinicians thought the patient had potential to benefit from more extensive genetic analysis. Following histological assessment, tumour-rich regions of formalin-fixed paraffin-embedded (FFPE) sections underwent macrodissection, DNA extraction, NGS, and analysis using a pipeline centred on Torrent Suite software. With a median turnaround time of seven working days, an integrated clinical report was produced indicating the variants detected, including those with potential diagnostic, prognostic, therapeutic, or clinical trial entry implications. Accompanying phenotypic data were collected, and a detailed cost analysis of the panel compared with single gene testing was undertaken to assess affordability for routine patient care. Panel sequencing was successful for 97% (342/351) of tumour samples in the prospective cohort and showed 100% concordance with known mutations (detected using cobas assays). At least one mutation was identified in 87% (296/342) of tumours. A locally actionable mutation (i.e., available targeted treatment or clinical trial) was identified in 122/351 patients (35%). Forty patients received targeted treatment, in 22/40 (55%) cases solely due to use of the panel. Examination of published data on the potential efficacy of targeted therapies showed theoretically actionable mutations (i.e., mutations for which targeted treatment was potentially appropriate) in 66% (71/107) and 39% (41/105) of melanoma and NSCLC patients, respectively. At a cost of £339 (US$449) per patient, the panel was less expensive locally than performing more than two or three single gene tests. Study limitations include the use of FFPE samples, which do not always provide high-quality DNA, and the use of “real world” data: submission of cases for sequencing did not always follow clinical guidelines, meaning that when mutations were detected, patients were not always eligible for targeted treatments on clinical grounds. Conclusions: This study demonstrates that more extensive tumour sequencing can identify mutations that could improve clinical decision-making in routine cancer care, potentially improving patient outcomes, at an affordable level for healthcare providers. PMID:28196074
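
    The percentages in this abstract can be checked against the counts given alongside them. A small sketch, assuming whole-number rounding (an assumption about how the figures were reported); the labels and helper name are illustrative only:

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# (numerator, denominator, percentage stated in the abstract)
reported = {
    "panel sequencing successful": (342, 351, 97),
    "at least one mutation": (296, 342, 87),
    "locally actionable mutation": (122, 351, 35),
    "treated solely due to panel": (22, 40, 55),
    "theoretically actionable, melanoma": (71, 107, 66),
    "theoretically actionable, NSCLC": (41, 105, 39),
}

for label, (num, den, stated) in reported.items():
    computed = percent(num, den)
    status = "consistent" if computed == stated else "MISMATCH"
    print(f"{label}: {num}/{den} = {computed}% (stated {stated}%) {status}")
```

    Note that the denominators differ between findings (351 submitted cases, 342 successfully sequenced tumours, 40 treated patients), so the percentages are not directly comparable with one another.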

  10. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum)

    PubMed Central

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21 nt in length, non-coding RNAs that mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we report a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs. PMID:25763692
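
    Target prediction by sequence complementarity can be sketched as a sliding-window comparison against the reverse complement of the miRNA. This is a simplified illustration: the abstract does not name the tool or scoring rules used, so the mismatch limit, the function names, and the toy sequences below are all hypothetical, and G:U wobble pairs are simply counted as mismatches.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(mirna: str, transcript: str, max_mismatches: int = 3):
    """Return (start, mismatches) for windows nearly complementary to the miRNA."""
    site = reverse_complement(mirna)  # perfect site as read along the transcript
    hits = []
    for start in range(len(transcript) - len(site) + 1):
        window = transcript[start:start + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy 21-nt miRNA and transcripts (not real blueberry sequences).
mirna = "AUGCAUGGCCUUAAGGCUAGC"
perfect_site = reverse_complement(mirna)
one_off_site = "A" + perfect_site[1:]          # first base changed from G to A
transcript = "CCC" + perfect_site + "UUUUU" + one_off_site + "GG"

hits = find_target_sites(mirna, transcript)
print(hits)
```

    Plant miRNAs tend to pair with their targets almost perfectly, which is why a small mismatch budget is a reasonable starting point for a sketch like this; a real pipeline would add position-dependent penalties and other filters.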

  11. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum).

    PubMed

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21 nt in length, non-coding RNAs that mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we report a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs.

  12. Selective ribosome profiling as a tool to study the interaction of chaperones and targeting factors with nascent polypeptide chains and ribosomes

    PubMed Central

    Becker, Annemarie H.; Oh, Eugene; Weissman, Jonathan S.; Kramer, Günter; Bukau, Bernd

    2014-01-01

    A plethora of factors is involved in the maturation of newly synthesized proteins, including chaperones, membrane targeting factors, and enzymes. Many factors act cotranslationally through association with ribosome-nascent chain complexes (RNCs), but their target specificities and modes of action remain poorly understood. We developed selective ribosome profiling (SeRP) to identify substrate pools and points of RNC engagement of these factors. SeRP is based on sequencing mRNA fragments covered by translating ribosomes (general ribosome profiling, RP), combined with a procedure to selectively isolate RNCs whose nascent polypeptides are associated with the factor of interest. Factor–RNC interactions are stabilized by crosslinking, the resulting factor–RNC adducts are then nuclease-treated to generate monosomes, and affinity-purified. The ribosome-extracted mRNA footprints are converted to DNA libraries for deep sequencing. The protocol is specified for general RP and SeRP in bacteria. It was first applied to the chaperone trigger factor and is readily adaptable to other cotranslationally acting factors, including eukaryotic factors. Factor–RNC purification and sequencing library preparation takes 7–8 days, sequencing and data analysis can be completed in 5–6 days. PMID:24136347

  13. Mutation detection in the human HSP70B′ gene by denaturing high-performance liquid chromatography

    PubMed Central

    Hecker, Karl H.; Asea, Alexzander; Kobayashi, Kaoru; Green, Stacy; Tang, Dan; Calderwood, Stuart K.

    2000-01-01

    Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B′ gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER™ software. Four overlapping amplicons, which span the complete coding region of the HSP70B′ gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B′ gene on the WAVE® Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed. PMID:11189446

  14. Mutation detection in the human HSP70B' gene by denaturing high-performance liquid chromatography.

    PubMed

    Hecker, K H; Asea, A; Kobayashi, K; Green, S; Tang, D; Calderwood, S K

    2000-11-01

    Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B' gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER software. Four overlapping amplicons, which span the complete coding region of the HSP70B' gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B' gene on the WAVE Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed.

  15. Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB-LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations

    PubMed Central

    Jupe, Florian; Witek, Kamil; Verweij, Walter; Śliwka, Jadwiga; Pritchard, Leighton; Etherington, Graham J; Maclean, Dan; Cock, Peter J; Leggett, Richard M; Bryan, Glenn J; Cardle, Linda; Hein, Ingo; Jones, Jonathan DG

    2013-01-01

    Summary: RenSeq is a NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ∼80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum ‘Heinz 1706’ extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines. PMID:23937694

  16. Novel mutations target distinct subgroups of medulloblastoma

    PubMed Central

    Robinson, Giles; Parker, Matthew; Kranenburg, Tanya A.; Lu, Charles; Chen, Xiang; Ding, Li; Phoenix, Timothy N.; Hedlund, Erin; Wei, Lei; Zhu, Xiaoyan; Chalhoub, Nader; Baker, Suzanne J.; Huether, Robert; Kriwacki, Richard; Curley, Natasha; Thiruvenkatam, Radhika; Wang, Jianmin; Wu, Gang; Rusch, Michael; Hong, Xin; Beckford, Jared; Gupta, Pankaj; Ma, Jing; Easton, John; Vadodaria, Bhavin; Onar-Thomas, Arzu; Lin, Tong; Li, Shaoyi; Pounds, Stanley; Paugh, Steven; Zhao, David; Kawauchi, Daisuke; Roussel, Martine F.; Finkelstein, David; Ellison, David W.; Lau, Ching C.; Bouffet, Eric; Hassall, Tim; Gururangan, Sridharan; Cohn, Richard; Fulton, Robert S.; Fulton, Lucinda L.; Dooling, David J.; Ochoa, Kerri; Gajjar, Amar; Mardis, Elaine R.; Wilson, Richard K.; Downing, James R.; Zhang, Jinghui; Gilbertson, Richard J.

    2012-01-01

    Summary: Medulloblastoma is a malignant childhood brain tumour comprising four discrete subgroups. To identify mutations that drive medulloblastoma we sequenced the entire genomes of 37 tumours and matched normal blood. One hundred and thirty-six genes harbouring somatic mutations in this discovery set were sequenced in an additional 56 medulloblastomas. Recurrent mutations were detected in 41 genes not yet implicated in medulloblastoma: several target distinct components of the epigenetic machinery in different disease subgroups, e.g., regulators of H3K27 and H3K4 trimethylation in subgroup-3 and 4 (e.g., KDM6A and ZMYM3), and CTNNB1-associated chromatin remodellers in WNT-subgroup tumours (e.g., SMARCA4 and CREBBP). Modelling of mutations in mouse lower rhombic lip progenitors that generate WNT-subgroup tumours, identified genes that maintain this cell lineage (DDX3X) as well as mutated genes that initiate (CDH1) or cooperate (PIK3CA) in tumourigenesis. These data provide important new insights into the pathogenesis of medulloblastoma subgroups and highlight targets for therapeutic development. PMID:22722829

  17. Computational Identification of MicroRNAs and Their Targets from Finger Millet (Eleusine coracana).

    PubMed

    Usha, S; Jyothi, M N; Suchithra, B; Dixit, Rekha; Rai, D V; Nagesh Babu, R

    2017-03-01

    MicroRNAs are endogenous small RNAs regulating intrinsic normal growth and development of plant. Discovering miRNAs, their targets and further inferring their functions had become routine process to comprehend the normal biological processes of miRNAs and their roles in plant development. In this study, we used homology-based analysis with available expressed sequence tag of finger millet (Eleusine coracana) to predict conserved miRNAs. Three potent miRNAs targeting 88 genes were identified. The newly identified miRNAs were found to be homologous with miR166 and miR1310. The targets recognized were transcription factors and enzymes, and GO analysis showed these miRNAs played varied roles in gene regulation. The identification of miRNAs and their targets is anticipated to hasten the pace of key epigenetic regulators in plant development.

  18. Discovery and Annotation of Plant Endogenous Target Mimicry Sequences from Public Transcriptome Libraries: A Case Study of Prunus persica.

    PubMed

    Karakülah, Gökhan

    2017-06-28

    Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants, and as an example, to discover previously un-annotated eTMs in Prunus persica (peach) transcriptome. Therefore, two public peach transcriptome libraries downloaded from Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built via integration of the miRNA target information and predicted eTM-miRNA interactions. My findings suggest that one of the most widely expressed miRNA families among diverse plant species, miR156, might be potentially sponged by seven putative eTMs. Besides, the study indicates eTMs potentially play roles in the regulation of development processes in peach fruit via targeting specific miRNAs. In conclusion, by following the step-by step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.

  19. The Aquaporin Channel Repertoire of the Tardigrade Milnesium tardigradum

    PubMed Central

    Grohme, Markus A.; Mali, Brahim; Wełnicz, Weronika; Michel, Stephanie; Schill, Ralph O.; Frohme, Marcus

    2013-01-01

    Limno-terrestrial tardigrades are small invertebrates that are subjected to periodic drought of their micro-environment. They have evolved to cope with these unfavorable conditions by anhydrobiosis, an ametabolic state of low cellular water. During drying and rehydration, tardigrades go through drastic changes in cellular water content. By our transcriptome sequencing effort of the limno-terrestrial tardigrade Milnesium tardigradum and by a combination of cloning and targeted sequence assembly, we identified transcripts encoding eleven putative aquaporins. Analysis of these sequences proposed 2 classical aquaporins, 8 aquaglyceroporins and a single potentially intracellular unorthodox aquaporin. Using quantitative real-time PCR we analyzed aquaporin transcript expression in the anhydrobiotic context. We have identified additional unorthodox aquaporins in various insect genomes and have identified a novel common conserved structural feature in these proteins. Analysis of the genomic organization of insect aquaporin genes revealed several conserved gene clusters. PMID:23761966

  20. Seed-effect modeling improves the consistency of genome-wide loss-of-function screens and identifies synthetic lethal vulnerabilities in cancer cells.

    PubMed

    Jaiswal, Alok; Peddinti, Gopal; Akimov, Yevhen; Wennerberg, Krister; Kuznetsov, Sergey; Tang, Jing; Aittokallio, Tero

    2017-06-01

    Genome-wide loss-of-function profiling is widely used for systematic identification of genetic dependencies in cancer cells; however, the poor reproducibility of RNA interference (RNAi) screens has been a major concern due to frequent off-target effects. Currently, a detailed understanding of the key factors contributing to the sub-optimal consistency is still lacking, especially on how to improve the reliability of future RNAi screens by controlling for factors that determine their off-target propensity. We performed a systematic, quantitative analysis of the consistency between two genome-wide shRNA screens conducted on a compendium of cancer cell lines, and also compared several gene summarization methods for inferring gene essentiality from shRNA level data. We then devised novel concepts of seed essentiality and shRNA family, based on seed region sequences of shRNAs, to study in-depth the contribution of seed-mediated off-target effects to the consistency of the two screens. We further investigated two seed-sequence properties, seed pairing stability, and target abundance in terms of their capability to minimize the off-target effects in post-screening data analysis. Finally, we applied this novel methodology to identify genetic interactions and synthetic lethal partners of cancer drivers, and confirmed differential essentiality phenotypes by detailed CRISPR/Cas9 experiments. Using the novel concepts of seed essentiality and shRNA family, we demonstrate how genome-wide loss-of-function profiling of a common set of cancer cell lines can be actually made fairly reproducible when considering seed-mediated off-target effects. Importantly, by excluding shRNAs having higher propensity for off-target effects, based on their seed-sequence properties, one can remove noise from the genome-wide shRNA datasets. As a translational application case, we demonstrate enhanced reproducibility of genetic interaction partners of common cancer drivers, as well as identify novel synthetic lethal partners of a major oncogenic driver, PIK3CA, supported by a complementary CRISPR/Cas9 experiment. We provide practical guidelines for improved design and analysis of genome-wide loss-of-function profiling and demonstrate how this novel strategy can be applied towards improved mapping of genetic dependencies of cancer cells to aid development of targeted anticancer treatments.
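    The idea of grouping shRNAs into families by seed region can be sketched in a few lines of Python. This is a rough illustration only, not the authors' method: the seed is assumed here to be guide positions 2 to 8 (a common convention that the abstract does not specify), the per-family mean is a hypothetical stand-in for a seed-level score, and all names and numbers are invented.

```python
# Hypothetical sketch: group shRNA guide sequences into "families" that
# share a seed region and average a per-shRNA score within each family.
from collections import defaultdict
from typing import Dict, List


def seed_of(guide: str, start: int = 2, end: int = 8) -> str:
    """Return the seed region, using 1-based inclusive positions.
    Positions 2..8 are an assumed convention, not taken from the abstract."""
    return guide.upper()[start - 1:end]


def group_by_seed(guides: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each seed sequence to the names of the shRNAs that carry it."""
    families: Dict[str, List[str]] = defaultdict(list)
    for name, guide in guides.items():
        families[seed_of(guide)].append(name)
    return dict(families)


def mean_score_per_seed(guides: Dict[str, str],
                        scores: Dict[str, float]) -> Dict[str, float]:
    """Average the per-shRNA scores within each seed family."""
    return {
        seed: sum(scores[name] for name in names) / len(names)
        for seed, names in group_by_seed(guides).items()
    }
```

    A family whose members target different genes yet show similar scores would hint at a seed-mediated effect; how such families are scored and filtered in the study itself is not described in enough detail here to reproduce.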

  1. RET fusion as a novel driver of medullary thyroid carcinoma.

    PubMed

    Grubbs, Elizabeth G; Ng, Patrick Kwok-Shing; Bui, Jacquelin; Busaidy, Naifa L; Chen, Ken; Lee, Jeffrey E; Lu, Xinyan; Lu, Hengyu; Meric-Bernstam, Funda; Mills, Gordon B; Palmer, Gary; Perrier, Nancy D; Scott, Kenneth L; Shaw, Kenna R; Waguespack, Steven G; Williams, Michelle D; Yelensky, Roman; Cote, Gilbert J

    2015-03-01

    Oncogenic RET tyrosine kinase gene fusions and activating mutations have recently been identified in lung cancers, prompting initiation of targeted therapy trials in this disease. Although RET point mutation has been identified as a driver of tumorigenesis in medullary thyroid carcinoma (MTC), no fusions have been described to date. We evaluated the role of RET fusion as an oncogenic driver in MTC. We describe a patient who died from aggressive sporadic MTC < 10 months after diagnosis. Her tumor was evaluated by means of next-generation sequencing, including an intronic capture strategy. A reciprocal translocation involving RET intron 12 was identified. The fusion was validated using a targeted break apart fluorescence in situ hybridization probe, and RNA sequencing confirmed the existence of an in-frame fusion transcript joining MYH13 exon 35 with RET exon 12. Ectopic expression of fusion product in a murine Ba/F3 cell reporter model established strong oncogenicity. Three tyrosine kinase inhibitors currently used to treat MTC in clinical practice blocked tumorigenic cell growth. This finding represents the report of a novel RET fusion, the first of its kind described in MTC. The finding of this potential novel oncogenic mechanism has clear implications for sporadic MTC, which in the majority of cases has no driver mutation identified. The presence of a RET fusion also provides a plausible target for RET tyrosine kinase inhibitor therapies.

  2. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  3. Unconventional P-35S sequence identified in genetically modified maize

    PubMed Central

    Al-Hmoud, Nisreen; Al-Husseini, Nawar; Ibrahim-Alobaide, Mohammed A; Kübler, Eric; Farfoura, Mahmoud; Alobydi, Hytham; Al-Rousan, Hiyam

    2014-01-01

    The Cauliflower Mosaic Virus 35S promoter sequence, CaMV P-35S, is one of several commonly used genetic targets to detect genetically modified maize and is found in most GMOs. In this research we report the finding of an alternative P-35S sequence and its incidence in GM maize marketed in Jordan. The primer pair normally used to amplify a 123 bp DNA fragment of the CaMV P-35S promoter in GMOs also amplified a previously undetected alternative sequence of CaMV P-35S in GM maize samples which we term V3. The amplified V3 sequence comprises 386 base pairs and was not found in the standard wild-type maize, MON810 and MON 863 GM maize. The identified GM maize samples carrying the V3 sequence were found free of CaMV when compared with CaMV infected brown mustard sample. The data of sequence alignment analysis of the V3 genetic element showed 90% similarity with the matching P-35S sequence of the cauliflower mosaic virus isolate CabbB-JI and 99% similarity with matching P-35S sequences found in several binary plant vectors, of which the binary vector locus JQ693018 is one example. The current study showed an increase of 44% in the incidence of the identified 386 bp sequence in GM maize sold in Jordan’s markets during the period 2009 and 2012. PMID:24495911

  4. A Next-Generation Sequencing Strategy for Evaluating the Most Common Genetic Abnormalities in Multiple Myeloma.

    PubMed

    Jiménez, Cristina; Jara-Acevedo, María; Corchete, Luis A; Castillo, David; Ordóñez, Gonzalo R; Sarasquete, María E; Puig, Noemí; Martínez-López, Joaquín; Prieto-Conde, María I; García-Álvarez, María; Chillón, María C; Balanzategui, Ana; Alcoceba, Miguel; Oriol, Albert; Rosiñol, Laura; Palomera, Luis; Teruel, Ana I; Lahuerta, Juan J; Bladé, Joan; Mateos, María V; Orfão, Alberto; San Miguel, Jesús F; González, Marcos; Gutiérrez, Norma C; García-Sanz, Ramón

    2017-01-01

    Identification and characterization of genetic alterations are essential for diagnosis of multiple myeloma and may guide therapeutic decisions. Currently, genomic analysis of myeloma to cover the diverse range of alterations with prognostic impact requires fluorescence in situ hybridization (FISH), single nucleotide polymorphism arrays, and sequencing techniques, which are costly and labor intensive and require large numbers of plasma cells. To overcome these limitations, we designed a targeted-capture next-generation sequencing approach for one-step identification of IGH translocations, V(D)J clonal rearrangements, the IgH isotype, and somatic mutations to rapidly identify risk groups and specific targetable molecular lesions. Forty-eight newly diagnosed myeloma patients were tested with the panel, which included IGH and six genes that are recurrently mutated in myeloma: NRAS, KRAS, HRAS, TP53, MYC, and BRAF. We identified 14 of 17 IGH translocations previously detected by FISH and three confirmed translocations not detected by FISH, with the additional advantage of breakpoint identification, which can be used as a target for evaluating minimal residual disease. IgH subclass and V(D)J rearrangements were identified in 77% and 65% of patients, respectively. Mutation analysis revealed the presence of missense protein-coding alterations in at least one of the evaluating genes in 16 of 48 patients (33%). This method may represent a time- and cost-effective diagnostic method for the molecular characterization of multiple myeloma.

  5. Identification of Loci Associated with Drought Resistance Traits in Heterozygous Autotetraploid Alfalfa (Medicago sativa L.) Using Genome-Wide Association Studies with Genotyping by Sequencing.

    PubMed

    Zhang, Tiejun; Yu, Long-Xi; Zheng, Ping; Li, Yajun; Rivera, Martha; Main, Dorrie; Greene, Stephanie L

    2015-01-01

    Drought resistance is an important breeding target for enhancing alfalfa productivity in arid and semi-arid regions. Identification of genes involved in drought tolerance will facilitate breeding for improving drought resistance and water use efficiency in alfalfa. Our objective was to use a diversity panel of alfalfa accessions comprised of 198 cultivars and landraces to identify genes involved in drought tolerance. The panel was selected from the USDA-ARS National Plant Germplasm System alfalfa collection and genotyped using genotyping by sequencing. A greenhouse procedure was used for phenotyping two important traits associated with drought tolerance: drought resistance index (DRI) and relative leaf water content (RWC). Marker-trait association identified nineteen and fifteen loci associated with DRI and RWC, respectively. Alignments of target sequences flanking to the resistance loci against the reference genome of M. truncatula revealed multiple chromosomal locations. Markers associated with DRI are located on all chromosomes while markers associated with RWC are located on chromosomes 1, 2, 3, 4, 5, 6 and 7. Co-localizations of significant markers between DRI and RWC were found on chromosomes 3, 5 and 7. Most loci associated with DRI in this work overlap with the reported QTLs associated with biomass under drought in alfalfa. Additional significant markers were targeted to several contigs with unknown chromosomal locations. BLAST search using their flanking sequences revealed homology to several annotated genes with functions in stress tolerance. With further validation, these markers may be used for marker-assisted breeding new alfalfa varieties with drought resistance and enhanced water use efficiency.

  6. Cloning and Identification of Recombinant Argonaute-Bound Small RNAs Using Next-Generation Sequencing.

    PubMed

    Gangras, Pooja; Dayeh, Daniel M; Mabin, Justin W; Nakanishi, Kotaro; Singh, Guramrit

    2018-01-01

    Argonaute proteins (AGOs) are loaded with small RNAs as guides to recognize target mRNAs. Since the target specificity heavily depends on the base complementarity between two strands, it is important to identify small guide and long target RNAs bound to AGOs. For this purpose, next-generation sequencing (NGS) technologies have extended our appreciation truly to the nucleotide level. However, the identification of RNAs via NGS from scarce RNA samples remains a challenge. Further, most commercial and published methods are compatible with either small RNAs or long RNAs, but are not equally applicable to both. Therefore, a single method that yields quantitative, bias-free NGS libraries to identify small and long RNAs from low levels of input will be of wide interest. Here, we introduce such a procedure that is based on several modifications of two published protocols and allows robust, sensitive, and reproducible cloning and sequencing of small amounts of RNAs of variable lengths. The method was applied to the identification of small RNAs bound to a purified eukaryotic AGO. Following ligation of a DNA adapter to the RNA 3'-end, the key feature of this method is to use the adapter for priming reverse transcription (RT), wherein biotinylated deoxyribonucleotides are specifically incorporated into the extended complementary DNA. Such RT products are enriched on streptavidin beads, circularized while immobilized on beads and directly used for PCR amplification. We provide a stepwise guide to generate RNA-Seq libraries, their purification, quantification, validation, and preparation for next-generation sequencing. We also provide basic steps in post-NGS data analyses using Galaxy, an open-source, web-based platform.

  7. CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

    PubMed Central

    Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

    2015-01-01

    Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
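    The two target requirements stated in this abstract (an adjacent NGG PAM and uniqueness within the genome) can be sketched as follows. This is an illustrative toy, not CRISPRdirect's implementation: it assumes the common SpCas9 layout in which NGG lies immediately 3′ of a 20-nt target on the scanned strand, searches the forward strand only, and treats "specific" as exactly one perfect match; the function names are hypothetical.

```python
# Hypothetical sketch: list 20-nt candidate targets followed by an NGG PAM,
# then keep only those that occur exactly once in a reference sequence.
import re

# Lookahead so that overlapping candidates are all reported.
PAM_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def candidate_targets(seq: str):
    """Return (start, target, pam) for every 20-mer followed by NGG."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in PAM_SITE.finditer(seq)]


def count_occurrences(pattern: str, genome: str) -> int:
    """Count (possibly overlapping) exact occurrences of pattern in genome."""
    count, pos = 0, genome.find(pattern)
    while pos != -1:
        count += 1
        pos = genome.find(pattern, pos + 1)
    return count


def specific_targets(seq: str, genome: str):
    """Keep candidates whose target+PAM occurs exactly once in the genome
    (forward strand, perfect matches only: a deliberate simplification)."""
    genome = genome.upper()
    return [
        (start, target, pam)
        for start, target, pam in candidate_targets(seq)
        if count_occurrences(target + pam, genome) == 1
    ]
```

    A realistic off-target search would also scan the reverse strand and tolerate mismatches, which this sketch leaves out for brevity.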

  8. Targeted next-generation sequencing identifies a homozygous nonsense mutation in ABHD12, the gene underlying PHARC, in a family clinically diagnosed with Usher syndrome type 3.

    PubMed

    Eisenberger, Tobias; Slim, Rima; Mansour, Ahmad; Nauck, Markus; Nürnberg, Gudrun; Nürnberg, Peter; Decker, Christian; Dafinger, Claudia; Ebermann, Inga; Bergmann, Carsten; Bolz, Hanno Jörn

    2012-09-02

    Usher syndrome (USH) is an autosomal recessive genetically heterogeneous disorder with congenital sensorineural hearing impairment and retinitis pigmentosa (RP). We have identified a consanguineous Lebanese family with two affected members displaying progressive hearing loss, RP and cataracts, therefore clinically diagnosed as USH type 3 (USH3). Our study was aimed at the identification of the causative mutation in this USH3-like family. Candidate loci were identified using genomewide SNP-array-based homozygosity mapping followed by targeted enrichment and next-generation sequencing. Using a capture array targeting the three identified homozygosity-by-descent regions on chromosomes 1q43-q44, 20p13-p12.2 and 20p11.23-q12, we identified a homozygous nonsense mutation, p.Arg65X, in ABHD12 segregating with the phenotype. Mutations of ABHD12, an enzyme hydrolyzing an endocannabinoid lipid transmitter, cause PHARC (polyneuropathy, hearing loss, ataxia, retinitis pigmentosa, and early-onset cataract). After the identification of the ABHD12 mutation in this family, one patient underwent neurological examination which revealed ataxia, but no polyneuropathy. ABHD12 is not known to be related to the USH protein interactome. The phenotype of our patient represents a variant of PHARC, an entity that should be taken into account as differential diagnosis for USH3. Our study demonstrates the potential of comprehensive genetic analysis for improving the clinical diagnosis.

  9. Targeted next-generation sequencing identifies a homozygous nonsense mutation in ABHD12, the gene underlying PHARC, in a family clinically diagnosed with Usher syndrome type 3

    PubMed Central

    2012-01-01

    Background: Usher syndrome (USH) is an autosomal recessive genetically heterogeneous disorder with congenital sensorineural hearing impairment and retinitis pigmentosa (RP). We have identified a consanguineous Lebanese family with two affected members displaying progressive hearing loss, RP and cataracts, therefore clinically diagnosed as USH type 3 (USH3). Our study was aimed at the identification of the causative mutation in this USH3-like family. Methods: Candidate loci were identified using genomewide SNP-array-based homozygosity mapping followed by targeted enrichment and next-generation sequencing. Results: Using a capture array targeting the three identified homozygosity-by-descent regions on chromosomes 1q43-q44, 20p13-p12.2 and 20p11.23-q12, we identified a homozygous nonsense mutation, p.Arg65X, in ABHD12 segregating with the phenotype. Conclusion: Mutations of ABHD12, an enzyme hydrolyzing an endocannabinoid lipid transmitter, cause PHARC (polyneuropathy, hearing loss, ataxia, retinitis pigmentosa, and early-onset cataract). After the identification of the ABHD12 mutation in this family, one patient underwent neurological examination which revealed ataxia, but no polyneuropathy. ABHD12 is not known to be related to the USH protein interactome. The phenotype of our patient represents a variant of PHARC, an entity that should be taken into account as differential diagnosis for USH3. Our study demonstrates the potential of comprehensive genetic analysis for improving the clinical diagnosis. PMID:22938382

  10. Target mimics: an embedded layer of microRNA-involved gene regulatory networks in plants.

    PubMed

    Meng, Yijun; Shao, Chaogang; Wang, Huizhong; Jin, Yongfeng

    2012-05-21

    MicroRNAs (miRNAs) play an essential role in gene regulation in plants. At the same time, the expression of miRNA genes is also tightly controlled. Recently, a novel mechanism called "target mimicry" was discovered, providing another layer for modulating miRNA activities. However, except for the artificial target mimics engineered for functional studies on certain miRNA genes, only one example, IPS1 (Induced by Phosphate Starvation 1)-miR399, was experimentally confirmed in planta. To date, few analyses for comprehensive identification of natural target mimics have been performed in plants. Thus, limited evidence is available on whether target mimicry is widespread in planta and implicated in certain biological processes. In this study, genome-wide computational prediction of endogenous miRNA mimics was performed in Arabidopsis and rice, and dozens of target mimics were identified. In contrast to a recent report, the densities of target mimic sites were found to be much higher within the untranslated regions (UTRs) when compared to those within the coding sequences (CDSs) in both plants. Some novel sequence characteristics were observed for the miRNAs that were potentially regulated by the target mimics. GO (Gene Ontology) term enrichment analysis revealed some functional insights into the predicted mimics. After degradome sequencing data-based identification of miRNA targets, the regulatory networks constituted by target mimics, miRNAs and their downstream targets were constructed, and some intriguing subnetworks were further explored. These results together suggest that target mimicry may be widely implicated in regulating miRNA activities in planta, and we hope this study could expand the current understanding of miRNA-involved regulatory networks.

  11. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    DOE PAGES

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; ...

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  12. Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

    PubMed

    Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

    2010-07-01

    We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
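    The barcode-indexing step described above can be illustrated with a minimal sketch. It assumes plain nucleotide reads whose first bases carry the pool barcode and uses exact matching only; the barcode sequences, pool names, and the function name demultiplex are hypothetical, and this is not the SOLiD System workflow used by the authors.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by an exact match of their leading barcode.

    reads:    iterable of read strings (barcode followed by insert sequence)
    barcodes: dict mapping barcode sequence -> pool name (all the same length)
    Returns a dict mapping pool name -> list of insert sequences.
    """
    barcode_length = len(next(iter(barcodes)))
    pools = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_length], read[barcode_length:]
        # Reads whose tag matches no known barcode are kept apart.
        pools[barcodes.get(tag, "unassigned")].append(insert)
    return dict(pools)


# Hypothetical barcodes and reads, for illustration only.
example_barcodes = {"ACGT": "pool_01", "TGCA": "pool_02"}
example_reads = ["ACGTGGGA", "TGCATTTC", "NNNNAAAA"]
print(demultiplex(example_reads, example_barcodes))
```

    A production pipeline would normally tolerate sequencing errors in the barcode, which this sketch does not attempt.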

  13. Draft Genome Sequences of Two Species of "Difficult-to-Identify" Human-Pathogenic Corynebacteria: Implications for Better Identification Tests.

    PubMed

    Pacheco, Luis G C; Mattos-Guaraldi, Ana L; Santos, Carolina S; Veras, Adonney A O; Guimarães, Luis C; Abreu, Vinícius; Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Carvalho, Alex F; Leal, Carlos G; Figueiredo, Henrique C P; Ramos, Juliana N; Vieira, Veronica V; Farfour, Eric; Guiso, Nicole; Hirata, Raphael; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

    2015-01-01

    Non-diphtheriae Corynebacterium species have been increasingly recognized as the causative agents of infections in humans. Differential identification of these bacteria in the clinical microbiology laboratory by the most commonly used biochemical tests is challenging, and normally requires additional molecular methods. Herein, we present the annotated draft genome sequences of two isolates of "difficult-to-identify" human-pathogenic corynebacterial species: C. xerosis and C. minutissimum. The genome sequences of ca. 2.7 Mbp, with a mean number of 2,580 protein encoding genes, were also compared with the publicly available genome sequences of strains of C. amycolatum and C. striatum. These results will aid the exploration of novel biochemical reactions to improve existing identification tests as well as the development of more accurate molecular identification methods through detection of species-specific target genes for isolate's identification or drug susceptibility profiling.

  14. Multiplex PCR identification of Taenia spp. in rodents and carnivores.

    PubMed

    Al-Sabi, Mohammad N S; Kapel, Christian M O

    2011-11-01

    The genus Taenia includes several species of veterinary and public health importance, but diagnosis of the etiological agent in definitive and intermediate hosts often relies on labor-intensive and poorly specific morphometric criteria, especially in immature worms and underdeveloped metacestodes. In the present study, a multiplex PCR, based on five primers targeting the 18S rDNA and ITS2 sequences, produced species-specific banding patterns for a range of Taenia spp. Species typing by the multiplex PCR was compared to morphological identification and sequencing of cox1 and/or 12S rDNA genes. As compared to sequencing, the multiplex PCR identified 31 of 32 Taenia metacestodes from rodents, whereas only 14 cysts were specifically identified by morphology. Likewise, the multiplex PCR identified 108 of 130 adult worms, while only 57 were identified to species by morphology. The tested multiplex PCR system may potentially be used for studies of Taenia spp. transmitted between rodents and carnivores.

  15. ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing.

    PubMed

    Lopez-Doriga, Adriana; Feliubadaló, Lídia; Menéndez, Mireia; Lopez-Doriga, Sergio; Morón-Duran, Francisco D; del Valle, Jesús; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Campos, Olga; Gómez, Carolina; Pineda, Marta; González, Sara; Moreno, Victor; Capellá, Gabriel; Lázaro, Conxi

    2014-03-01

    Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.

  16. Tandem mass spectrometry for the detection of plant pathogenic fungi and the effects of database composition on protein inferences.

    PubMed

    Padliya, Neerav D; Garrett, Wesley M; Campbell, Kimberly B; Tabb, David L; Cooper, Bret

    2007-11-01

    LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences to related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.

  17. ALCHEMIST: Adjuvant Lung Cancer Enrichment Marker Identification and Sequencing Trials

    Cancer.gov

    ALCHEMIST represents three integrated, precision medicine trials that are designed to identify people with early-stage lung cancer who have tumors that harbor certain uncommon genetic changes and evaluate whether drug treatments targeted against those mol

  18. Proteomic identification of aldolase A as an autoantibody target in patients with atypical movement disorders.

    PubMed

    Privitera, Daniela; Corti, Valeria; Alessio, Massimo; Volontè, Maria Antonietta; Volontè, Antonietta; Lampasona, Vito; Comi, Giancarlo; Martino, Gianvito; Franciotta, Diego; Furlan, Roberto; Fazio, Raffaella

    2013-03-01

    We tried to identify the target/s of autoantibodies to basal ganglia neurons found in a patient with hyperkinetic movement disorders (HMD) characterized by rapid, rhythmic involuntary movements or spasms in both face and neck. Patient and control sera were used in Western blot to probe mouse brain homogenates. Two-dimensional gel electrophoresis (2-DE) SDS-PAGE protein spots recognized by the patient's antibodies were excised and sequenced by mass spectrometry analysis, and the glycolytic enzyme aldolase A was identified as the antigen recognized by the patient's autoantibodies. To assess relevance and specificity of these antibodies to the identified targets as biomarkers of autoimmunity in movement disorders, autoantibody responses to the identified target were then measured by ELISA in various diseases of the central nervous system. Anti-aldolase A autoantibodies were associated mainly with HMD (7/17, 41%) and Parkinson's disease (4/30, 13%) patients, and undetectable in subjects with other inflammatory and non-inflammatory central nervous system diseases. We, thus, identified aldolase A as an autoantigen in a sub-group of patients with HMD, a clinically ill-defined syndrome. Anti-aldolase A antibodies may represent a useful biomarker of autoimmunity in HMD patients.

  19. Landscape of somatic mutations and clonal evolution in mantle cell lymphoma.

    PubMed

    Beà, Sílvia; Valdés-Mas, Rafael; Navarro, Alba; Salaverria, Itziar; Martín-Garcia, David; Jares, Pedro; Giné, Eva; Pinyol, Magda; Royo, Cristina; Nadeu, Ferran; Conde, Laura; Juan, Manel; Clot, Guillem; Vizán, Pedro; Di Croce, Luciano; Puente, Diana A; López-Guerra, Mónica; Moros, Alexandra; Roue, Gael; Aymerich, Marta; Villamor, Neus; Colomo, Lluís; Martínez, Antonio; Valera, Alexandra; Martín-Subero, José I; Amador, Virginia; Hernández, Luis; Rozman, Maria; Enjuanes, Anna; Forcada, Pilar; Muntañola, Ana; Hartmann, Elena M; Calasanz, María J; Rosenwald, Andreas; Ott, German; Hernández-Rivas, Jesús M; Klapper, Wolfram; Siebert, Reiner; Wiestner, Adrian; Wilson, Wyndham H; Colomer, Dolors; López-Guillermo, Armando; López-Otín, Carlos; Puente, Xose S; Campo, Elías

    2013-11-05

    Mantle cell lymphoma (MCL) is an aggressive tumor, but a subset of patients may follow an indolent clinical course. To understand the mechanisms underlying this biological heterogeneity, we performed whole-genome and/or whole-exome sequencing on 29 MCL cases and their respective matched normal DNA, as well as 6 MCL cell lines. Recurrently mutated genes were investigated by targeted sequencing in an independent cohort of 172 MCL patients. We identified 25 significantly mutated genes, including known drivers such as ataxia-telangiectasia mutated (ATM), cyclin D1 (CCND1), and the tumor suppressor TP53; mutated genes encoding the anti-apoptotic protein BIRC3 and Toll-like receptor 2 (TLR2); and the chromatin modifiers WHSC1, MLL2, and MEF2B. We also found NOTCH2 mutations as an alternative phenomenon to NOTCH1 mutations in aggressive tumors with a dismal prognosis. Analysis of two simultaneous or subsequent MCL samples by whole-genome/whole-exome (n = 8) or targeted (n = 19) sequencing revealed subclonal heterogeneity at diagnosis in samples from different topographic sites and modulation of the initial mutational profile at the progression of the disease. Some mutations were predominantly clonal or subclonal, indicating an early or late event in tumor evolution, respectively. Our study identifies molecular mechanisms contributing to MCL pathogenesis and offers potential targets for therapeutic intervention.

  20. The role of the RAS pathway in iAMP21-ALL

    PubMed Central

    Ryan, S L; Matheson, E; Grossmann, V; Sinclair, P; Bashton, M; Schwab, C; Towers, W; Partington, M; Elliott, A; Minto, L; Richardson, S; Rahman, T; Keavney, B; Skinner, R; Bown, N; Haferlach, T; Vandenberghe, P; Haferlach, C; Santibanez-Koref, M; Moorman, A V; Kohlmann, A; Irving, J A E; Harrison, C J

    2016-01-01

    Intrachromosomal amplification of chromosome 21 (iAMP21) identifies a high-risk subtype of acute lymphoblastic leukaemia (ALL), requiring intensive treatment to reduce their relapse risk. Improved understanding of the genomic landscape of iAMP21-ALL will ascertain whether these patients may benefit from targeted therapy. We performed whole-exome sequencing of eight iAMP21-ALL samples. The mutation rate was dramatically disparate between cases (average 24.9, range 5–51) and a large number of novel variants were identified, including frequent mutation of the RAS/MEK/ERK pathway. Targeted sequencing of a larger cohort revealed that 60% (25/42) of diagnostic iAMP21-ALL samples harboured 42 distinct RAS pathway mutations. High sequencing coverage demonstrated heterogeneity in the form of multiple RAS pathway mutations within the same sample and diverse variant allele frequencies (VAFs) (2–52%), similar to other subtypes of ALL. Constitutive RAS pathway activation was observed in iAMP21 samples that harboured mutations in the predominant clone (⩾35% VAF). Viable iAMP21 cells from primary xenografts showed reduced viability in response to the MEK1/2 inhibitor, selumetinib, in vitro. As clonal (⩾35% VAF) mutations were detected in 26% (11/42) of iAMP21-ALL, this evidence of response to RAS pathway inhibitors may offer the possibility to introduce targeted therapy to improve therapeutic efficacy in these high-risk patients. PMID:27168466
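    The variant allele frequency (VAF) figures quoted above follow from simple read-count arithmetic. The sketch below assumes the usual definition (reads supporting the variant divided by total reads at the site) and takes the 35% clonality cut-off from the abstract; the function names and the read counts in the example are hypothetical.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def is_clonal(vaf, threshold=0.35):
    """Apply the >= 35% VAF cut-off the abstract uses for the predominant clone."""
    return vaf >= threshold


# Hypothetical read counts for two variants in one sample.
for alt, total in [(210, 500), (12, 480)]:
    vaf = variant_allele_frequency(alt, total)
    print(f"VAF = {vaf:.1%}, clonal = {is_clonal(vaf)}")

# Cohort-level proportions reported in the abstract.
print(f"{25 / 42:.0%} of samples with RAS pathway mutations")
print(f"{11 / 42:.0%} of samples with clonal mutations")
```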

  1. Cistrome of the aldosterone-activated mineralocorticoid receptor in human renal cells.

    PubMed

    Le Billan, Florian; Khan, Junaid A; Lamribet, Khadija; Viengchareun, Say; Bouligand, Jérôme; Fagart, Jérôme; Lombès, Marc

    2015-09-01

    Aldosterone exerts its effects mainly by activating the mineralocorticoid receptor (MR), a transcription factor that regulates gene expression through complex and dynamic interactions with coregulators and transcriptional machinery, leading to fine-tuned control of vectorial ionic transport in the distal nephron. To identify genome-wide aldosterone-regulated MR targets in human renal cells, we set up a chromatin immunoprecipitation (ChIP) assay by using a specific anti-MR antibody in a differentiated human renal cell line expressing green fluorescent protein (GFP)-MR. This approach, coupled with high-throughput sequencing, allowed identification of 974 genomic MR targets. Computational analysis identified an MR response element (MRE) including single or multiple half-sites and palindromic motifs in which the AGtACAgxatGTtCt sequence was the most prevalent motif. Most genomic MR-binding sites (MBSs) are located >10 kb from the transcriptional start sites of target genes (84%). Specific aldosterone-induced recruitment of MR on the first most relevant genomic sequences was further validated by ChIP-quantitative (q)PCR and correlated with concomitant and positive aldosterone-activated transcriptional regulation of the corresponding gene, as assayed by RT-qPCR. It was notable that most MBSs lacked MREs but harbored DNA recognition motifs for other transcription factors (FOX, EGR1, AP1, PAX5) suggesting functional interaction. This work provides new insights into aldosterone MR-mediated renal signaling and opens relevant perspectives for mineralocorticoid-related pathophysiology. © FASEB.

  2. Site-targeted mutagenesis for stabilization of recombinant monoclonal antibody expressed in tobacco (Nicotiana tabacum) plants

    PubMed Central

    Hehle, Verena K.; Paul, Matthew J.; Roberts, Victoria A.; van Dolleweerd, Craig J.; Ma, Julian K.-C.

    2016-01-01

    This study examined the degradation pattern of a murine IgG1κ monoclonal antibody expressed in and extracted from transformed Nicotiana tabacum. Gel electrophoresis of leaf extracts revealed a consistent pattern of recombinant immunoglobulin bands, including intact and full-length antibody, as well as smaller antibody fragments. N-terminal sequencing revealed these smaller fragments to be proteolytic cleavage products and identified a limited number of protease-sensitive sites in the antibody light and heavy chain sequences. No strictly conserved target sequence was evident, although the peptide bonds that were susceptible to proteolysis were predominantly and consistently located within or near to the interdomain or solvent-exposed regions in the antibody structure. Amino acids surrounding identified cleavage sites were mutated in an attempt to increase resistance. Different Guy’s 13 antibody heavy and light chain mutant combinations were expressed transiently in N. tabacum and demonstrated intensity shifts in the fragmentation pattern, resulting in alterations to the full-length antibody-to-fragment ratio. The work strengthens the understanding of proteolytic cleavage of antibodies expressed in plants and presents a novel approach to stabilize full-length antibody by site-directed mutagenesis. PMID:26712217

  3. Targeted next generation sequencing of well-differentiated/dedifferentiated liposarcoma reveals novel gene amplifications and mutations.

    PubMed

    Somaiah, Neeta; Beird, Hannah C; Barbo, Andrea; Song, Juhee; Mills Shaw, Kenna R; Wang, Wei-Lien; Eterovic, Karina; Chen, Ken; Lazar, Alexander; Conley, Anthony P; Ravi, Vinod; Hwu, Patrick; Futreal, Andrew; Simon, George; Meric-Bernstam, Funda; Hong, David

    2018-04-13

    Well-differentiated/dedifferentiated liposarcoma is a common soft tissue sarcoma with approximately 1500 new cases per year. Surgery is the mainstay of treatment but recurrences are frequent and systemic options are limited. 'Tumor genotyping' is becoming more common in clinical practice as it offers the hope of personalized targeted therapy. We wanted to evaluate the results and the clinical utility of available next-generation sequencing panels in WD/DD liposarcoma. Patients who had their tumor sequenced by either FoundationOne (n = 13) or the institutional T200/T200.1 panels (n = 7) were included in this study. Significant copy number alterations were identified, but mutations were infrequent. Out of the 27 mutations detected in 7 samples, 8 (CTNNB1, MECOM, ZNF536, EGFR, EML4, CSMD3, PBRM1, PPP1R3A) were identified as deleterious (on Condel, PolyPhen and SIFT) and a truncating mutation was found in NF2. Of these, EGFR and NF2 are potential driver mutations and have not been reported previously in liposarcoma. MDM2 and CDK4 amplification was universally present in all the tested samples and multiple other recurrent genes with high amplification or high deletion were detected. Many of these targets are potentially actionable. Eight patients went on to receive an MDM2 inhibitor with a median time to progression of 23 months (95% CI: 10-83 months).

  4. Targeted cancer exome sequencing reveals recurrent mutations in myeloproliferative neoplasms

    PubMed Central

    Tenedini, E; Bernardis, I; Artusi, V; Artuso, L; Roncaglia, E; Guglielmelli, P; Pieri, L; Bogani, C; Biamonte, F; Rotunno, G; Mannarelli, C; Bianchi, E; Pancrazzi, A; Fanelli, T; Malagoli Tagliazucchi, G; Ferrari, S; Manfredini, R; Vannucchi, A M; Tagliafico, E

    2014-01-01

    With the intent of dissecting the molecular complexity of Philadelphia-negative myeloproliferative neoplasms (MPN), we designed a target enrichment panel to explore, using next-generation sequencing (NGS), the mutational status of an extensive list of 2000 cancer-associated genes and microRNAs. The genomic DNA of granulocytes and in vitro-expanded CD3+T-lymphocytes, as a germline control, was target-enriched and sequenced in a learning cohort of 20 MPN patients using Roche 454 technology. We identified 141 genuine somatic mutations, most of which were not previously described. To test the frequency of the identified variants, a larger validation cohort of 189 MPN patients was additionally screened for these mutations using Ion Torrent AmpliSeq NGS. Excluding the genes already described in MPN, for 8 genes (SCRIB, MIR662, BARD1, TCF12, FAT4, DAP3, POLG and NRAS), we demonstrated a mutation frequency between 3 and 8%. We also found that mutations at codon 12 of NRAS (NRASG12V and NRASG12D) were significantly associated, for primary myelofibrosis (PMF), with highest dynamic international prognostic scoring system (DIPSS)-plus score categories. This association was then confirmed in 66 additional PMF patients composing a final dataset of 168 PMF showing a NRAS mutation frequency of 4.7%, which was associated with a worse outcome, as defined by the DIPSS plus score. PMID:24150215

  5. Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing.

    PubMed

    Ryland, Georgina L; Jones, Kate; Chin, Melody; Markham, John; Aydogan, Elle; Kankanige, Yamuna; Caruso, Marisa; Guinto, Jerick; Dickinson, Michael; Prince, H Miles; Yong, Kwee; Blombery, Piers

    2018-05-14

    Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using a bioinformatics pipeline developed in-house. At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  6. Species identification in mixed tuna samples with next-generation sequencing targeting two short cytochrome b gene fragments.

    PubMed

    Kappel, Kristina; Haase, Ilka; Käppel, Christine; Sotelo, Carmen G; Schröder, Ute

    2017-11-01

    Conventional Sanger sequencing of PCR products is the gold standard for species authentication of seafood products. However, this method is inappropriate for the analysis of products that might contain mixtures of species, such as tinned tuna. The purpose of this study was to test whether next-generation sequencing (NGS) can be a solution for the authentication of mixed products. Nine tuna samples containing mixtures of up to four species were prepared and subjected to an NGS approach targeting two short cytochrome b gene (cytb) fragments on the Illumina MiSeq platform. Sequence recovery was precise and admixtures of as low as 1% could be identified, depending on the species composition of the mixtures. Duplicate samples as well as two individual NGS runs produced very similar results. A first test of three commercial tinned tuna samples indicated the presence of different species in the same tin, although this is forbidden by EU law. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Genome-wide identification and characterization of Notch transcription complex-binding sequence paired sites in leukemia cells

    PubMed Central

    Severson, Eric; Arnett, Kelly L.; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S.; Liu, X. Shirley; Blacklow, Stephen C.; Aster, Jon C.

    2018-01-01

    Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and are linked to the Notch-responsiveness of a few genes, but their overall contribution to Notch-dependent gene regulation is unknown. To address this issue, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay, and applied insights from these in vitro studies to Notch-“addicted” leukemia cells. We find that SPSs contribute to the regulation of approximately a third of direct Notch target genes. While originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5. Our work provides a general method for identifying sequence-paired sites in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. PMID:28465412

  8. Combining functional genomics and chemical biology to identify targets of bioactive compounds.

    PubMed

    Ho, Cheuk Hei; Piotrowski, Jeff; Dixon, Scott J; Baryshnikova, Anastasia; Costanzo, Michael; Boone, Charles

    2011-02-01

    Genome sequencing projects have revealed thousands of suspected genes, challenging researchers to develop efficient large-scale functional analysis methodologies. Determining the function of a gene product generally requires a means to alter its function. Genetically tractable model organisms have been widely exploited for the isolation and characterization of activating and inactivating mutations in genes encoding proteins of interest. Chemical genetics represents a complementary approach involving the use of small molecules capable of either inactivating or activating their targets. Saccharomyces cerevisiae has been an important test bed for the development and application of chemical genomic assays aimed at identifying targets and modes of action of known and uncharacterized compounds. Here we review yeast chemical genomic assay strategies for drug target identification. Copyright © 2010 Elsevier Ltd. All rights reserved.

  9. Identify mutation in amyotrophic lateral sclerosis cases using HaloPlex target enrichment system.

    PubMed

    Liu, Zhi-Jun; Li, Hong-Fu; Tan, Guo-He; Tao, Qing-Qing; Ni, Wang; Cheng, Xue-Wen; Xiong, Zhi-Qi; Wu, Zhi-Ying

    2014-12-01

    To date, at least 18 causative genes have been identified in amyotrophic lateral sclerosis (ALS). Because of the clinical and genetic heterogeneity, molecular diagnosis for ALS faces great challenges. HaloPlex target enrichment system is a new targeted sequencing approach, which can detect already known mutations or candidate genes. We applied this approach to screen 18 causative genes of ALS, including SOD1, SETX, FUS, ANG, TARDBP, ALS2, FIG4, VAPB, OPTN, DAO, VCP, UBQLN2, SPG11, SIGMAR1, DCTN1, SQSTM1, PFN1, and CHMP2B in 8 ALS probands. Using this approach, we obtained an average of 9.5 synonymous or missense mutations per sample. After validation by Sanger sequencing, we identified 3 documented SOD1 mutations (p.F21C, p.G148D, and p.C147R) and 1 novel DCTN1 p.G59R mutation in 4 probands. The novel DCTN1 mutation appeared to segregate with the disease in the pedigree and was absent in 200 control subjects. The high throughput and efficiency of this approach indicated that it could be applied to diagnose ALS and other inherited diseases with multiple causative genes in clinical practice. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Next-generation sequencing to solve complex inherited retinal dystrophy: A case series of multiple genes contributing to disease in extended families.

    PubMed

    Jones, Kaylie D; Wheaton, Dianna K; Bowne, Sara J; Sullivan, Lori S; Birch, David G; Chen, Rui; Daiger, Stephen P

    2017-01-01

    With recent availability of next-generation sequencing (NGS), it is becoming more common to pursue disease-targeted panel testing rather than traditional sequential gene-by-gene dideoxy sequencing. In this report, we describe using NGS to identify multiple disease-causing mutations that contribute concurrently or independently to retinal dystrophy in three relatively small families. Family members underwent comprehensive visual function evaluations, and genetic counseling including a detailed family history. A preliminary genetic inheritance pattern was assigned and updated as additional family members were tested. Family 1 (FAM1) and Family 2 (FAM2) were clinically diagnosed with retinitis pigmentosa (RP) and had a suspected autosomal dominant pedigree with non-penetrance (n.p.). Family 3 (FAM3) consisted of a large family with a diagnosis of RP and an overall dominant pedigree, but the proband had phenotypically cone-rod dystrophy. Initial genetic analysis was performed on one family member with traditional Sanger single gene sequencing and/or panel-based testing, and ultimately, retinal gene-targeted NGS was required to identify the underlying cause of disease for individuals within the three families. Results obtained in these families necessitated further genetic and clinical testing of additional family members to determine the complex genetic and phenotypic etiology of each family. Genetic testing of FAM1 (n = 4 affected; 1 n.p.) identified a dominant mutation in RP1 (p.Arg677Ter) that was present for two of the four affected individuals but absent in the proband and the presumed non-penetrant individual. Retinal gene-targeted NGS in the fourth affected family member revealed compound heterozygous mutations in USH2A (p.Cys419Phe, p.Glu767Serfs*21). Genetic testing of FAM2 (n = 3 affected; 1 n.p.) identified three retinal dystrophy genes (PRPH2, PRPF8, and USH2A) with disease-causing mutations in varying combinations among the affected family members. Genetic testing of FAM3 (n = 7 affected) identified a mutation in PRPH2 (p.Pro216Leu) tracking with disease in six of the seven affected individuals. Additional retinal gene-targeted NGS testing determined that the proband also harbored a multiple exon deletion in the CRX gene likely accounting for her cone-rod phenotype; her son harbored only the mutation in CRX, not the familial mutation in PRPH2. Multiple genes contributing to the retinal dystrophy genotypes within a family were discovered using retinal gene-targeted NGS. Families with noted examples of phenotypic variation or apparent non-penetrant individuals may offer a clue to suspect complex inheritance. Furthermore, this finding underscores that caution should be taken when attributing a single gene disease-causing mutation (or inheritance pattern) to a family as a whole. Identification of a disease-causing mutation in a proband, even with a clear inheritance pattern in hand, may not be sufficient for targeted, known mutation analysis in other family members.

  11. Kangaroo – A pattern-matching program for biological sequences

    PubMed Central

    2002-01-01

    Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results: Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion: A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
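
    The kind of query described in this record, a regular expression search for mononucleotide repeats in a coding sequence, can be sketched in a few lines of Python. This is an illustration only and not Kangaroo's own code; the function name, the minimum run length of 8, and the toy sequence are assumptions.

```python
import re


def find_mononucleotide_repeats(sequence, min_length=8):
    """Return (start, base, length) for every run of a single base
    that is at least min_length long (min_length is an assumed cutoff)."""
    # One captured base followed by at least (min_length - 1) copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


# Toy coding-region fragment (hypothetical, not taken from the paper).
toy_sequence = "ATGGC" + "A" * 9 + "GC" + "T" * 10 + "CGA"
for start, base, length in find_mononucleotide_repeats(toy_sequence):
    print(start, base, length)
```

    Runs shorter than the cutoff are ignored, so lowering min_length widens the search at the cost of more incidental hits.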

  12. Whole-Exome Sequencing to Identify Novel Biological Pathways Associated With Infertility After Pelvic Inflammatory Disease.

    PubMed

    Taylor, Brandie D; Zheng, Xiaojing; Darville, Toni; Zhong, Wujuan; Konganti, Kranti; Abiodun-Ojo, Olayinka; Ness, Roberta B; O'Connell, Catherine M; Haggerty, Catherine L

    2017-01-01

    Ideal management of sexually transmitted infections (STI) may require risk markers for pathology or vaccine development. Previously, we identified common genetic variants associated with chlamydial pelvic inflammatory disease (PID) and reduced fecundity. As this explains only a proportion of the long-term morbidity risk, we used whole-exome sequencing to identify biological pathways that may be associated with STI-related infertility. We obtained stored DNA from 43 non-Hispanic black women with PID from the PID Evaluation and Clinical Health Study. Infertility was assessed at a mean of 84 months. Principal component analysis revealed no population stratification. Potential covariates did not significantly differ between groups. Sequencing kernel association test was used to examine associations between aggregates of variants on a single gene and infertility. The results from the sequencing kernel association test were used to choose "focus genes" (P < 0.01; n = 150) for subsequent Ingenuity Pathway Analysis to identify "gene sets" that are enriched in biologically relevant pathways. Pathway analysis revealed that focus genes were enriched in canonical pathways, including IL-1 signaling, P2Y purinergic receptor signaling, and bone morphogenic protein signaling. Focus genes were enriched in pathways that impact innate and adaptive immunity, protein kinase A activity, cellular growth, and DNA repair. These may alter host resistance or immunopathology after infection. Targeted sequencing of biological pathways identified in this study may provide insight into STI-related infertility.

  13. An Outbreak of Streptococcus pyogenes in a Mental Health Facility: Advantage of Well-Timed Whole-Genome Sequencing Over emm Typing.

    PubMed

    Bergin, Sarah M; Periaswamy, Balamurugan; Barkham, Timothy; Chua, Hong Choon; Mok, Yee Ming; Fung, Daniel Shuen Sheng; Su, Alex Hsin Chuan; Lee, Yen Ling; Chua, Ming Lai Ivan; Ng, Poh Yong; Soon, Wei Jia Wendy; Chu, Collins Wenhan; Tan, Siyun Lucinda; Meehan, Mary; Ang, Brenda Sze Peng; Leo, Yee Sin; Holden, Matthew T G; De, Partha; Hsu, Li Yang; Chen, Swaine L; de Sessions, Paola Florez; Marimuthu, Kalisvar

    2018-05-09

    OBJECTIVE: We report the utility of whole-genome sequencing (WGS) conducted in a clinically relevant time frame (ie, sufficient for guiding management decision), in managing a Streptococcus pyogenes outbreak, and present a comparison of its performance with emm typing. SETTING: A 2,000-bed tertiary-care psychiatric hospital. METHODS: Active surveillance was conducted to identify new cases of S. pyogenes. WGS guided targeted epidemiological investigations, and infection control measures were implemented. Single-nucleotide polymorphism (SNP)-based genome phylogeny, emm typing, and multilocus sequence typing (MLST) were performed. We compared the ability of WGS and emm typing to correctly identify person-to-person transmission and to guide the management of the outbreak. RESULTS: The study included 204 patients and 152 staff. We identified 35 patients and 2 staff members with S. pyogenes. WGS revealed polyclonal S. pyogenes infections with 3 genetically distinct phylogenetic clusters (C1-C3). Cluster C1 isolates were all emm type 4, sequence type 915 and had pairwise SNP differences of 0-5, which suggested recent person-to-person transmissions. Epidemiological investigation revealed that cluster C1 was mediated by dermal colonization and transmission of S. pyogenes in a male residential ward. Clusters C2 and C3 were genomically diverse, with pairwise SNP differences of 21-45 and 26-58, and emm 11 and mostly emm 120, respectively. Clusters C2 and C3, which may have been considered person-to-person transmissions by emm typing, were shown by WGS to be unlikely by integrating pairwise SNP differences with epidemiology. CONCLUSIONS: WGS had higher resolution than emm typing in identifying clusters with recent and ongoing person-to-person transmissions, which allowed implementation of targeted intervention to control the outbreak. Infect Control Hosp Epidemiol 2018;1-9.
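
    The pairwise SNP differences reported in this record can be illustrated with a minimal Python sketch that counts differing positions between aligned sequences. It assumes the isolates have already been aligned to a common reference and trimmed to equal length; the isolate names and toy sequences are hypothetical, and the study's actual pipeline is not described in the abstract.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.
    Positions with gaps or ambiguous calls (anything outside ACGT) are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in "ACGT" and b in "ACGT"
    )


def pairwise_snp_distances(alignment):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {
        (name_a, name_b): snp_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; real inputs would be core-genome alignments.
toy_alignment = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "TCGAACCTAG",
}
for pair, distance in pairwise_snp_distances(toy_alignment).items():
    print(pair, distance)
```

    A small distance between two isolates is only suggestive of recent transmission; as the abstract notes, the distances were interpreted together with the epidemiological investigation.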

  14. Approaching the taxonomic affiliation of unidentified sequences in public databases--an example from the mycorrhizal fungi.

    PubMed

    Nilsson, R Henrik; Kristiansson, Erik; Ryberg, Martin; Larsson, Karl-Henrik

    2005-07-18

    During the last few years, DNA sequence analysis has become one of the primary means of taxonomic identification of species, particularly so for species that are minute or otherwise lack distinct, readily obtainable morphological characters. Although the number of sequences available for comparison in public databases such as GenBank increases exponentially, only a minuscule fraction of all organisms have been sequenced, leaving taxon sampling a momentous problem for sequence-based taxonomic identification. When querying GenBank with a set of unidentified sequences, a considerable proportion typically lack fully identified matches, forming an ever-mounting pile of sequences that the researcher will have to monitor manually in the hope that new, clarifying sequences have been submitted by other researchers. To alleviate these concerns, a project to automatically monitor select unidentified sequences in GenBank for taxonomic progress through repeated local BLAST searches was initiated. Mycorrhizal fungi--a field where species identification often is prohibitively complex--and the much used ITS locus were chosen as test bed. A Perl script package called emerencia is presented. On a regular basis, it downloads select sequences from GenBank, separates the identified sequences from those insufficiently identified, and performs BLAST searches between these two datasets, storing all results in an SQL database. On the accompanying web-service http://emerencia.math.chalmers.se, users can monitor the taxonomic progress of insufficiently identified sequences over time, either through active searches or by signing up for e-mail notification upon disclosure of better matches. Other search categories, such as listing all insufficiently identified sequences (and their present best fully identified matches) publication-wise, are also available. 
The ever-increasing use of DNA sequences for identification purposes largely falls back on the assumption that public sequence databases contain a thorough sampling of taxonomically well-annotated sequences. Taxonomy, held by some to be an old-fashioned trade, has accordingly never been more important. emerencia does not automate the taxonomic process, but it does allow researchers to focus their efforts elsewhere than countless manual BLAST runs and arduous sieving of BLAST hit lists. The emerencia system is available on an open source basis for local installation with any organism and gene group as targets.
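
    The separation step described in this record, splitting downloaded sequences into identified and insufficiently identified sets, can be sketched as follows. emerencia itself is a Perl package and its actual rules are not given in the abstract; the marker strings, function names, and example records below are assumptions chosen only for illustration.

```python
# Assumed markers of an incomplete species-level annotation (hypothetical list).
UNRESOLVED_MARKERS = (
    "uncultured",
    "unidentified",
    "environmental sample",
    " sp.",
    " cf.",
    " aff.",
)


def is_fully_identified(organism):
    """Treat a name as fully identified if it looks like a binomial
    and contains none of the assumed unresolved markers."""
    name = organism.strip().lower()
    if len(name.split()) < 2:
        return False
    return not any(marker in name for marker in UNRESOLVED_MARKERS)


def partition_records(records):
    """Split (accession, organism) pairs into identified and insufficiently identified lists."""
    identified, insufficient = [], []
    for accession, organism in records:
        target = identified if is_fully_identified(organism) else insufficient
        target.append((accession, organism))
    return identified, insufficient


# Hypothetical records; the accessions are placeholders, not real GenBank entries.
toy_records = [
    ("ACC0001", "Amanita muscaria"),
    ("ACC0002", "uncultured ectomycorrhizal fungus"),
    ("ACC0003", "Tomentella sp."),
]
identified, insufficient = partition_records(toy_records)
print(len(identified), len(insufficient))
```

    In the workflow the abstract describes, the insufficiently identified set would then be searched against the identified set with BLAST and the results stored for later comparison.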

  15. Method to amplify variable sequences without imposing primer sequences

    DOEpatents

    Bradbury, Andrew M.; Zeytun, Ahmet

    2006-11-14

    The present invention provides methods of amplifying target sequences without including regions flanking the target sequence in the amplified product or imposing amplification primer sequences on the amplified product. Also provided are methods of preparing a library from such amplified target sequences.

  16. It’s More Than Stamp Collecting: How Genome Sequencing Can Unify Biological Research

    PubMed Central

    Richards, Stephen

    2015-01-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, whilst the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to “Big Science” survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. PMID:26003218

  17. It's more than stamp collecting: how genome sequencing can unify biological research.

    PubMed

    Richards, Stephen

    2015-07-01

    The availability of reference genome sequences, especially the human reference, has revolutionized the study of biology. However, while the genomes of some species have been fully sequenced, a wide range of biological problems still cannot be effectively studied for lack of genome sequence information. Here, I identify neglected areas of biology and describe how both targeted species sequencing and more broad taxonomic surveys of the tree of life can address important biological questions. I enumerate the significant benefits that would accrue from sequencing a broader range of taxa, as well as discuss the technical advances in sequencing and assembly methods that would allow for wide-ranging application of whole-genome analysis. Finally, I suggest that in addition to 'big science' survey initiatives to sequence the tree of life, a modified infrastructure-funding paradigm would better support reference genome sequence generation for research communities most in need. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Lewis Y Antigen as a Target for Breast Cancer Therapy

    DTIC Science & Technology

    1996-09-01

    have shown that a synthetic peptide can mimic the capsular polysaccharide of N. meningitidis serogroup C (MCP) in that it induces an anti-MCP immune...intervening residue. All these sequences resemble the peptide we have identified as a mimic of the group C meningococcal polysaccharide. The immunological...Group C Polysaccharide α(2-9)sialic acid The sequence similarities among the putative motifs suggest that antibodies raised to this peptide set might

  19. Genome-wide identification of microRNA targets in the neglected disease pathogens of the genus Echinococcus.

    PubMed

    Macchiaroli, Natalia; Maldonado, Lucas L; Zarowiecki, Magdalena; Cucher, Marcela; Gismondi, María Inés; Kamenetzky, Laura; Rosenzvit, Mara Cecilia

    2017-06-01

    MicroRNAs (miRNAs), a class of small non-coding RNAs, are key regulators of gene expression at post-transcriptional level and play essential roles in biological processes such as development. MiRNAs silence target mRNAs by binding to complementary sequences in the 3'untranslated regions (3'UTRs). The parasitic helminths of the genus Echinococcus are the causative agents of echinococcosis, a zoonotic neglected disease. In previous work, we performed a comprehensive identification and characterization of Echinococcus miRNAs. However, current knowledge about their targets is limited. Since target prediction algorithms rely on complementarity between 3'UTRs and miRNA sequences, a major limitation is the lack of accurate sequence information of 3'UTR for most species including parasitic helminths. We performed RNA-seq and developed a pipeline that integrates the transcriptomic data with available genomic data of this parasite in order to identify 3'UTRs of Echinococcus canadensis. The high confidence set of 3'UTRs obtained allowed the prediction of miRNA targets in Echinococcus through a bioinformatic approach. We performed for the first time a comparative analysis of miRNA targets in Echinococcus and Taenia. We found that many evolutionarily conserved target sites in Echinococcus and Taenia may be functional and under selective pressure. Signaling pathways such as MAPK and Wnt were among the most represented pathways indicating miRNA roles in parasite growth and development. Genome-wide identification and characterization of miRNA target genes in Echinococcus provide valuable information to guide experimental studies in order to understand miRNA functions in the parasites biology. miRNAs involved in essential functions, especially those being absent in the host or showing sequence divergence with respect to host orthologs, might be considered as novel therapeutic targets for echinococcosis control. Copyright © 2017 Elsevier B.V. All rights reserved.
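
    The dependence of target prediction on complementarity between miRNA and 3'UTR sequences can be illustrated with a simplified seed-match search. This is a sketch under stated assumptions: it looks only for perfect matches to the reverse complement of miRNA nucleotides 2-8, which is a common simplification and not necessarily the algorithm used in this study, and the sequences are toy inputs.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(sequence):
    """Reverse complement of an RNA string made of A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def seed_match_sites(mirna, utr):
    """Return 0-based start positions in the 3'UTR that pair perfectly
    with the miRNA seed (nucleotides 2-8, an assumed seed definition)."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:8]                      # nucleotides 2-8 of the miRNA
    site = reverse_complement_rna(seed)    # what the seed pairs with on the mRNA
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # continue after the last hit
    return positions


# Toy inputs, used only to exercise the function.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAGG"
print(seed_match_sites(toy_mirna, toy_utr))
```

    The sketch makes the limitation noted in the abstract concrete: without an accurate 3'UTR sequence there is nothing reliable to scan, whatever the matching rule.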

  20. A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor. | Office of Cancer Genomics

    Cancer.gov

    We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A.

  1. Identification and characterization of circular RNAs in zebrafish.

    PubMed

    Shen, Yudong; Guo, Xianwu; Wang, Weimin

    2017-01-01

    Circular RNA (circRNA), a class of RNAs with circular structure, has received little attention until recently, when some new features and functions were discovered. In the present study, we sequenced circRNAs in zebrafish (Danio rerio) and identified 3868 circRNAs using three algorithms (find_circ, CIRI, segemehl). The analysis of microRNA target sites on circRNAs shows that some circRNAs may function as miRNA sponges. Furthermore, we identified the existence of reverse complementary sequences in the flanking regions of only 25 (2.64%) exonic circRNAs, indicating that the mechanism of zebrafish exonic circRNA biogenesis might be different from that in mammals. Moreover, 1122 (29%) zebrafish circRNA sequences showed homology with human, mouse and coelacanth circRNAs. © 2016 Federation of European Biochemical Societies.

  2. Inforna 2.0: A Platform for the Sequence-Based Design of Small Molecules Targeting Structured RNAs.

    PubMed

    Disney, Matthew D; Winkelsas, Audrey M; Velagapudi, Sai Pradeep; Southern, Mark; Fallahi, Mohammad; Childs-Disney, Jessica L

    2016-06-17

    The development of small molecules that target RNA is challenging yet, if successful, could advance the development of chemical probes to study RNA function or precision therapeutics to treat RNA-mediated disease. Previously, we described Inforna, an approach that can mine motifs (secondary structures) within target RNAs, which is deduced from the RNA sequence, and compare them to a database of known RNA motif-small molecule binding partners. Output generated by Inforna includes the motif found in both the database and the desired RNA target, lead small molecules for that target, and other related meta-data. Lead small molecules can then be tested for binding and affecting cellular (dys)function. Herein, we describe Inforna 2.0, which incorporates all known RNA motif-small molecule binding partners reported in the scientific literature, a chemical similarity searching feature, and an improved user interface and is freely available via an online web server. By incorporation of interactions identified by other laboratories, the database has been doubled, containing 1936 RNA motif-small molecule interactions, including 244 unique small molecules and 1331 motifs. Interestingly, chemotype analysis of the compounds that bind RNA in the database reveals features in small molecule chemotypes that are privileged for binding. Further, this updated database expanded the number of cellular RNAs to which lead compounds can be identified.

  3. Selection, Characterization and Interaction Studies of a DNA Aptamer for the Detection of Bifidobacterium bifidum

    PubMed Central

    Hu, Lujun; Wang, Linlin; Lu, Wenwei; Zhao, Jianxin; Zhang, Hao; Chen, Wei

    2017-01-01

    A whole-bacterium-based SELEX (Systematic Evolution of Ligands by Exponential Enrichment) procedure was adopted in this study for the selection of an ssDNA aptamer that binds to Bifidobacterium bifidum. After 12 rounds of selection targeted against B. bifidum, 30 sequences were obtained and divided into seven families according to primary sequence homology and similarity of secondary structure. Four FAM (fluorescein amidite) labeled aptamer sequences from different families were selected for further characterization by flow cytometric analysis. The results reveal that the aptamer sequence CCFM641-5 demonstrated high affinity and specificity for B. bifidum compared with the other sequences tested, and the estimated Kd value was 10.69 ± 0.89 nM. Additionally, sequence truncation experiments of the aptamer CCFM641-5 led to the conclusion that the 5′-primer and 3′-primer binding sites were essential for aptamer-target binding. In addition, the possible component of the target B. bifidum, bound by the aptamer CCFM641-5, was identified as a membrane protein by treatment with proteinase. Furthermore, to prove the potential application of the aptamer CCFM641-5, a colorimetric bioassay of the sandwich-type structure was used to detect B. bifidum. The assay had a linear range of 10^4 to 10^7 cfu/mL (R^2 = 0.9834). Therefore, the colorimetric bioassay appears to be a promising method for the detection of B. bifidum based on the aptamer CCFM641-5. PMID:28441340

  4. Sequencing of the variable region of rpsB to discriminate between Streptococcus pneumoniae and other streptococcal species.

    PubMed

    Wyllie, Anne L; Pannekoek, Yvonne; Bovenkerk, Sandra; van Engelsdorp Gastelaars, Jody; Ferwerda, Bart; van de Beek, Diederik; Sanders, Elisabeth A M; Trzciński, Krzysztof; van der Ende, Arie

    2017-09-01

    The vast majority of streptococci colonizing the human upper respiratory tract are commensals, only sporadically implicated in disease. Of these, the most pathogenic is Mitis group member, Streptococcus pneumoniae. Phenotypic and genetic similarities between streptococci can cause difficulties in species identification. Using ribosomal S2-gene sequences extracted from whole-genome sequences published from 501 streptococci, we developed a method to identify streptococcal species. We validated this method on non-pneumococcal isolates cultured from cases of severe streptococcal disease (n = 101) and from carriage (n = 103), and on non-typeable pneumococci from asymptomatic individuals (n = 17) and on whole-genome sequences of 1157 pneumococcal isolates from meningitis in the Netherlands. Following this, we tested 221 streptococcal isolates in molecular assays originally assumed specific for S. pneumoniae, targeting cpsA, lytA, piaB, ply, Spn9802, zmpC and capsule-type-specific genes. Cluster analysis of S2-sequences showed grouping according to species in line with published phylogenies of streptococcal core genomes. S2-typing convincingly distinguished pneumococci from non-pneumococcal species (99.2% sensitivity, 100% specificity). Molecular assays targeting regions of lytA and piaB were 100% specific for S. pneumoniae, whereas assays targeting cpsA, ply, Spn9802, zmpC and selected serotype-specific assays (but not capsular sequence typing) showed a lack of specificity. False positive results were over-represented in species associated with carriage, although no particular confounding signal was unique for carriage isolates. © 2017 The Authors.

  5. Sequencing of the variable region of rpsB to discriminate between Streptococcus pneumoniae and other streptococcal species

    PubMed Central

    Pannekoek, Yvonne; Bovenkerk, Sandra; van Engelsdorp Gastelaars, Jody; Ferwerda, Bart; van de Beek, Diederik; Sanders, Elisabeth A. M.; Trzciński, Krzysztof; van der Ende, Arie

    2017-01-01

    The vast majority of streptococci colonizing the human upper respiratory tract are commensals, only sporadically implicated in disease. Of these, the most pathogenic is Mitis group member, Streptococcus pneumoniae. Phenotypic and genetic similarities between streptococci can cause difficulties in species identification. Using ribosomal S2-gene sequences extracted from whole-genome sequences published from 501 streptococci, we developed a method to identify streptococcal species. We validated this method on non-pneumococcal isolates cultured from cases of severe streptococcal disease (n = 101) and from carriage (n = 103), and on non-typeable pneumococci from asymptomatic individuals (n = 17) and on whole-genome sequences of 1157 pneumococcal isolates from meningitis in the Netherlands. Following this, we tested 221 streptococcal isolates in molecular assays originally assumed specific for S. pneumoniae, targeting cpsA, lytA, piaB, ply, Spn9802, zmpC and capsule-type-specific genes. Cluster analysis of S2-sequences showed grouping according to species in line with published phylogenies of streptococcal core genomes. S2-typing convincingly distinguished pneumococci from non-pneumococcal species (99.2% sensitivity, 100% specificity). Molecular assays targeting regions of lytA and piaB were 100% specific for S. pneumoniae, whereas assays targeting cpsA, ply, Spn9802, zmpC and selected serotype-specific assays (but not capsular sequence typing) showed a lack of specificity. False positive results were over-represented in species associated with carriage, although no particular confounding signal was unique for carriage isolates. PMID:28931649

  6. Identification of estrogen-responsive genes using a genome-wide analysis of promoter elements for transcription factor binding sites.

    PubMed

    Kamalakaran, Sitharthan; Radhakrishnan, Senthil K; Beck, William T

    2005-06-03

    We developed a pipeline to identify novel genes regulated by the steroid hormone-dependent transcription factor, estrogen receptor, through a systematic analysis of upstream regions of all human and mouse genes. We built a data base of putative promoter regions for 23,077 human and 19,984 mouse transcripts from National Center for Biotechnology Information annotation and 8793 human and 6785 mouse promoters from the Data Base of Transcriptional Start Sites. We used this data base of putative promoters to identify potential targets of estrogen receptor by identifying estrogen response elements (EREs) in their promoters. Our program correctly identified EREs in genes known to be regulated by estrogen in addition to several new genes whose putative promoters contained EREs. We validated six genes (KIAA1243, NRIP1, MADH9, NME3, TPD52L, and ABCG2) to be estrogen-responsive in MCF7 cells using reverse transcription PCR. To allow for extensibility of our program in identifying targets of other transcription factors, we have built a Web interface to access our data base and programs. Our Web-based program for Promoter Analysis of Genome, PAGen@UIC, allows a user to identify putative target genes for vertebrate transcription factors through the analysis of their upstream sequences. The interface allows the user to search the human and mouse promoter data bases for potential target genes containing one or more listed transcription factor binding sites (TFBSs) in their upstream elements, using either regular expression-based consensus or position weight matrices. The data base can also be searched for promoters harboring user-defined TFBSs given as a consensus or a position weight matrix. Furthermore, the user can retrieve putative promoter sequences for any given gene together with identified TFBSs located on its promoter. Orthologous promoters are also analyzed to determine conserved elements.
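
    The regular expression-based consensus search mentioned in this record can be sketched by expanding IUPAC nucleotide codes into a regex and scanning a promoter sequence. This is not the PAGen@UIC code; the function names and toy promoter are assumptions, and the pattern GGTCANNNTGACC is the commonly cited palindromic ERE consensus, used here only as an example input.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[code] for code in consensus.upper())


def find_sites(consensus, promoter):
    """Return 0-based start positions of all (possibly overlapping) matches
    of the consensus on the forward strand of the promoter."""
    # A lookahead lets re.finditer report overlapping occurrences.
    pattern = re.compile("(?=%s)" % consensus_to_regex(consensus))
    return [match.start() for match in pattern.finditer(promoter.upper())]


# Hypothetical promoter fragment containing one consensus ERE.
toy_promoter = "ttaggtcacagtgaccTT"
print(find_sites("GGTCANNNTGACC", toy_promoter))
```

    Only the forward strand is scanned, which is sufficient for a palindromic consensus; a non-palindromic site would also need the reverse complement searched. A position weight matrix, the other option the abstract names, scores each window instead of requiring an exact consensus match.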

  7. Messenger RNA biomarker signatures for forensic body fluid identification revealed by targeted RNA sequencing.

    PubMed

    Hanson, E; Ingold, S; Haas, C; Ballantyne, J

    2018-05-01

    The recovery of a DNA profile from the perpetrator or victim in criminal investigations can provide valuable 'source level' information for investigators. However, a DNA profile does not reveal the circumstances by which biological material was transferred. Some contextual information can be obtained by a determination of the tissue or fluid source of origin of the biological material as it is potentially indicative of some behavioral activity on behalf of the individual that resulted in its transfer from the body. Here, we sought to improve upon established RNA based methods for body fluid identification by developing a targeted multiplexed next generation mRNA sequencing assay comprising a panel of approximately equal sized gene amplicons. The multiplexed biomarker panel includes several highly specific gene targets with the necessary specificity to definitively identify most forensically relevant biological fluids and tissues (blood, semen, saliva, vaginal secretions, menstrual blood and skin). In developing the biomarker panel we evaluated 66 gene targets, with a progressive iteration of testing target combinations that exhibited optimal sensitivity and specificity using a training set of forensically relevant body fluid samples. The current assay comprises 33 targets: 6 blood, 6 semen, 6 saliva, 4 vaginal secretions, 5 menstrual blood and 6 skin markers. We demonstrate the sensitivity and specificity of the assay and the ability to identify body fluids in single source and admixed stains. A 16 sample blind test was carried out by one lab with samples provided by the other participating lab. The blinded lab correctly identified the body fluids present in 15 of the samples with the major component identified in the 16th. Various classification methods are being investigated to permit inference of the body fluid/tissue in dried physiological stains. 
These include the percentage of reads in a sample that are due to each of the 6 tissues/body fluids tested and inter-sample differential gene expression revealed by agglomerative hierarchical clustering. Copyright © 2018 Elsevier B.V. All rights reserved.
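
    One of the classification inputs named in this record, the percentage of reads in a sample attributable to each tissue or body fluid, can be computed with a short Python sketch. The marker names, the marker-to-tissue mapping, and the read counts below are hypothetical placeholders; the actual 33-target panel is not listed in the abstract.

```python
def tissue_read_percentages(read_counts, marker_to_tissue):
    """Sum reads per tissue class and express each as a percentage
    of all reads assigned to panel markers in the sample."""
    totals = {}
    for marker, count in read_counts.items():
        tissue = marker_to_tissue[marker]
        totals[tissue] = totals.get(tissue, 0) + count
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No reads mapped to any marker: report zero for every class seen.
        return {tissue: 0.0 for tissue in totals}
    return {tissue: 100.0 * count / grand_total for tissue, count in totals.items()}


# Hypothetical mapping and counts for a single sample.
toy_mapping = {"blood_1": "blood", "blood_2": "blood", "saliva_1": "saliva"}
toy_counts = {"blood_1": 300, "blood_2": 100, "saliva_1": 100}
print(tissue_read_percentages(toy_counts, toy_mapping))
```

    How such percentages would be turned into a call for single-source or admixed stains is left open here, as the abstract states that classification methods are still being investigated.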

  8. Identification of genes associated with reproduction in the Mud Crab (Scylla olivacea) and their differential expression following serotonin stimulation.

    PubMed

    Kornthong, Napamanee; Cummins, Scott F; Chotwiwatthanakun, Charoonroj; Khornchatri, Kanjana; Engsusophon, Attakorn; Hanna, Peter J; Sobhon, Prasert

    2014-01-01

    The central nervous system (CNS) is often intimately involved in reproduction control and is therefore a target organ for transcriptomic investigations to identify reproduction-associated genes. In this study, 454 transcriptome sequencing was performed on pooled brain and ventral nerve cord of the female mud crab (Scylla olivacea) following serotonin injection (5 µg/g BW). A total of 197,468 sequence reads were obtained with an average length of 828 bp. Approximately 38.7% of 2,183 isotigs matched with significant similarity (E value < 1e-4) to sequences within the GenBank non-redundant (nr) database, with most significant matches being to crustacean and insect sequences. Approximately 32 putative neuropeptide genes were identified from non-matching BLAST sequences. In addition, we identified full-length transcripts for crustacean reproductive-related genes, namely farnesoic acid o-methyltransferase (FAMeT), estrogen sulfotransferase (ESULT) and prostaglandin F synthase (PGFS). Following serotonin injection, which would normally initiate reproductive processes, we found up-regulation of FAMeT, ESULT and PGFS expression in the female CNS and ovary. Our data provide an invaluable new resource for understanding the molecular role of the CNS on reproduction in S. olivacea.

  9. Whole-exome sequencing reveals the spectrum of gene mutations and the clonal evolution patterns in paediatric acute myeloid leukaemia.

    PubMed

    Shiba, Norio; Yoshida, Kenichi; Shiraishi, Yuichi; Okuno, Yusuke; Yamato, Genki; Hara, Yusuke; Nagata, Yasunobu; Chiba, Kenichi; Tanaka, Hiroko; Terui, Kiminori; Kato, Motohiro; Park, Myoung-Ja; Ohki, Kentaro; Shimada, Akira; Takita, Junko; Tomizawa, Daisuke; Kudo, Kazuko; Arakawa, Hirokazu; Adachi, Souichi; Taga, Takashi; Tawa, Akio; Ito, Etsuro; Horibe, Keizo; Sanada, Masashi; Miyano, Satoru; Ogawa, Seishi; Hayashi, Yasuhide

    2016-11-01

    Acute myeloid leukaemia (AML) is a molecularly and clinically heterogeneous disease. Targeted sequencing efforts have identified several mutations with diagnostic and prognostic values in KIT, NPM1, CEBPA and FLT3 in both adult and paediatric AML. In addition, massively parallel sequencing enabled the discovery of recurrent mutations (i.e. IDH1/2 and DNMT3A) in adult AML. In this study, whole-exome sequencing (WES) of 22 paediatric AML patients revealed mutations in components of the cohesin complex (RAD21 and SMC3), BCORL1 and ASXL2 in addition to previously known gene mutations. We also revealed intratumoural heterogeneities in many patients, implicating multiple clonal evolution events in the development of AML. Furthermore, targeted deep sequencing in 182 paediatric AML patients identified three major categories of recurrently mutated genes: cohesin complex genes [STAG2, RAD21 and SMC3 in 17 patients (8·3%)], epigenetic regulators [ASXL1/ASXL2 in 17 patients (8·3%), BCOR/BCORL1 in 7 patients (3·4%)] and signalling molecules. We also performed WES in four patients with relapsed AML. Relapsed AML evolved from one of the subclones at the initial phase and was accompanied by many additional mutations, including common driver mutations that were absent or existed only with lower allele frequency in the diagnostic samples, indicating a multistep process causing leukaemia recurrence. © 2016 John Wiley & Sons Ltd.

  10. Next generation sequencing of extraskeletal myxoid chondrosarcoma.

    PubMed

    Davis, Elizabeth J; Wu, Yi-Mi; Robinson, Dan; Schuetze, Scott M; Baker, Laurence H; Athanikar, Jyoti; Cao, Xuhong; Kunju, Lakshmi P; Chinnaiyan, Arul M; Chugh, Rashmi

    2017-03-28

    Extraskeletal myxoid chondrosarcoma (EMC) is an indolent translocation-associated soft tissue sarcoma with a high propensity for metastases. Using a clinical sequencing approach, we genomically profiled patients with metastatic EMC to elucidate the molecular biology and identify potentially actionable mutations. We also evaluated potential predictive factors of benefit to sunitinib, a multi-targeted tyrosine kinase inhibitor with reported activity in a subset of EMC patients. Between January 31, 2012 and April 15, 2016, six patients with EMC participated in the clinical sequencing research study. High quality DNA and RNA was isolated and matched normal samples underwent comprehensive next generation sequencing (whole or OncoSeq capture exome of tumor and normal, tumor PolyA+ and capture transcriptome). The expression levels of sunitinib targeted-kinases were measured by transcriptome sequencing for KDR, PDGFRA/B, KIT, RET, FLT1, and FLT4. The previously reported EWSR1-NR4A3 translocation was identified in all patient tumors; however, other recurring genomic abnormalities were not detected. RET expression was significantly greater in patients with EMC relative to other types of sarcomas except for liposarcoma (p<0.0002). The folate receptor was overexpressed in two patients. Our study demonstrated that similar to other translocation-associated sarcomas, the mutational profile of metastatic EMC is limited beyond the pathognomonic translocation. The clinical significance of RET expression in EMC should be explored. Additional pre-clinical investigations of EMC may help elucidate molecular mechanisms contributing to EMC tumorigenesis that could be translated to the clinical setting.

  11. Implementation of Amplicon Parallel Sequencing Leads to Improvement of Diagnosis and Therapy of Lung Cancer Patients.

    PubMed

    König, Katharina; Peifer, Martin; Fassunke, Jana; Ihle, Michaela A; Künstlinger, Helen; Heydt, Carina; Stamm, Katrin; Ueckeroth, Frank; Vollbrecht, Claudia; Bos, Marc; Gardizi, Masyar; Scheffler, Matthias; Nogova, Lucia; Leenders, Frauke; Albus, Kerstin; Meder, Lydia; Becker, Kerstin; Florin, Alexandra; Rommerscheidt-Fuss, Ursula; Altmüller, Janine; Kloth, Michael; Nürnberg, Peter; Henkel, Thomas; Bikár, Sven-Ernö; Sos, Martin L; Geese, William J; Strauss, Lewis; Ko, Yon-Dschun; Gerigk, Ulrich; Odenthal, Margarete; Zander, Thomas; Wolf, Jürgen; Merkelbach-Bruse, Sabine; Buettner, Reinhard; Heukamp, Lukas C

    2015-07-01

    The Network Genomic Medicine Lung Cancer was set up to rapidly translate scientific advances into early clinical trials of targeted therapies in lung cancer performing molecular analyses of more than 3500 patients annually. Because sequential analysis of the relevant driver mutations on fixated samples is challenging in terms of workload, tissue availability, and cost, we established multiplex parallel sequencing in routine diagnostics. The aim was to analyze all therapeutically relevant mutations in lung cancer samples in a high-throughput fashion while significantly reducing turnaround time and amount of input DNA compared with conventional dideoxy sequencing of single polymerase chain reaction amplicons. In this study, we demonstrate the feasibility of a 102 amplicon multiplex polymerase chain reaction followed by sequencing on an Illumina sequencer on formalin-fixed paraffin-embedded tissue in routine diagnostics. Analysis of a validation cohort of 180 samples showed this approach to require significantly less input material and to be more reliable, robust, and cost-effective than conventional dideoxy sequencing. Subsequently, 2657 lung cancer patients were analyzed. We observed that comprehensive biomarker testing provided novel information in addition to histological diagnosis and clinical staging. In 2657 consecutively analyzed lung cancer samples, we identified driver mutations at the expected prevalence. Furthermore we found potentially targetable DDR2 mutations at a frequency of 3% in both adenocarcinomas and squamous cell carcinomas. Overall, our data demonstrate the utility of systematic sequencing analysis in a clinical routine setting and highlight the dramatic impact of such an approach on the availability of therapeutic strategies for the targeted treatment of individual cancer patients.

  12. Genome-wide association study using high-density single nucleotide polymorphism arrays and whole-genome sequences for clinical mastitis traits in dairy cattle.

    PubMed

    Sahana, G; Guldbrandtsen, B; Thomsen, B; Holm, L-E; Panitz, F; Brøndum, R F; Bendixen, C; Lund, M S

    2014-11-01

    Mastitis is a mammary disease that frequently affects dairy cattle. Despite considerable research on the development of effective prevention and treatment strategies, mastitis continues to be a significant issue in bovine veterinary medicine. To identify major genes that affect mastitis in dairy cattle, 6 chromosomal regions on Bos taurus autosome (BTA) 6, 13, 16, 19, and 20 were selected from a genome scan for 9 mastitis phenotypes using imputed high-density single nucleotide polymorphism arrays. Association analyses using sequence-level variants for the 6 targeted regions were carried out to map causal variants using whole-genome sequence data from 3 breeds. The quantitative trait loci (QTL) discovery population comprised 4,992 progeny-tested Holstein bulls, and QTL were confirmed in 4,442 Nordic Red and 1,126 Jersey cattle. The targeted regions were imputed to the sequence level. The highest association signal for clinical mastitis was observed on BTA 6 at 88.97 Mb in Holstein cattle and was confirmed in Nordic Red cattle. The peak association region on BTA 6 contained 2 genes: vitamin D-binding protein precursor (GC) and neuropeptide FF receptor 2 (NPFFR2), which, based on known biological functions, are good candidates for affecting mastitis. However, strong linkage disequilibrium in this region prevented conclusive determination of the causal gene. A different QTL on BTA 6 located at 88.32 Mb in Holstein cattle affected mastitis. In addition, QTL on BTA 13 and 19 were confirmed to segregate in Nordic Red cattle and QTL on BTA 16 and 20 were confirmed in Jersey cattle. Although several candidate genes were identified in these targeted regions, it was not possible to identify a gene or polymorphism as the causal factor for any of these regions. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  13. Draft sequencing and comparative genomics of Xylella fastidiosa strains reveal novel biological insights.

    PubMed

    Bhattacharyya, Anamitra; Stilwagen, Stephanie; Reznik, Gary; Feil, Helene; Feil, William S; Anderson, Iain; Bernal, Axel; D'Souza, Mark; Ivanova, Natalia; Kapatral, Vinayak; Larsen, Niels; Los, Tamara; Lykidis, Athanasios; Selkov, Eugene; Walunas, Theresa L; Purcell, Alexander; Edwards, Rob A; Hawkins, Trevor; Haselkorn, Robert; Overbeek, Ross; Kyrpides, Nikos C; Predki, Paul F

    2002-10-01

    Draft sequencing is a rapid and efficient method for determining the near-complete sequence of microbial genomes. Here we report a comparative analysis of one complete and two draft genome sequences of the phytopathogenic bacterium, Xylella fastidiosa, which causes serious disease in plants, including citrus, almond, and oleander. We present highlights of an in silico analysis based on a comparison of reconstructions of core biological subsystems. Cellular pathway reconstructions have been used to identify a small number of genes, which are likely to reside within the draft genomes but are not captured in the draft assembly. These represented only a small fraction of all genes and were predominantly large and small ribosomal subunit protein components. By using this approach, some of the inherent limitations of draft sequence can be significantly reduced. Despite the incomplete nature of the draft genomes, it is possible to identify several phage-related genes, which appear to be absent from the draft genomes and not the result of insufficient sequence sampling. This region may therefore identify potential host-specific functions. Based on this first functional reconstruction of a phytopathogenic microbe, we spotlight an unusual respiration machinery as a potential target for biological control. We also predicted and developed a new defined growth medium for Xylella.

  14. Utilizing the Dog Genome in the Search for Novel Candidate Genes Involved in Glioma Development—Genome Wide Association Mapping followed by Targeted Massive Parallel Sequencing Identifies a Strongly Associated Locus

    PubMed Central

    Dickinson, Peter; Xiong, Anqi; York, Daniel; Jayashankar, Kartika; Pielberg, Gerli; Koltookian, Michele; Murén, Eva; Fuxelius, Hans-Henrik; Weishaupt, Holger; Andersson, Göran; Hedhammar, Åke; Bongcam-Rudloff, Erik; Forsberg-Nilsson, Karin

    2016-01-01

    Gliomas are the most common form of malignant primary brain tumors in humans and second most common in dogs, occurring with similar frequencies in both species. Dogs are valuable spontaneous models of human complex diseases including cancers and may provide insight into disease susceptibility and oncogenesis. Several brachycephalic breeds such as Boxer, Bulldog and Boston Terrier have an elevated risk of developing glioma, but others, including Pug and Pekingese, are not at higher risk. To identify glioma-associated genetic susceptibility factors, an across-breed genome-wide association study (GWAS) was performed on 39 dog glioma cases and 141 controls from 25 dog breeds, identifying a genome-wide significant locus on canine chromosome (CFA) 26 (p = 2.8 × 10^−8). Targeted re-sequencing of the 3.4 Mb candidate region was performed, followed by genotyping of the 56 SNVs that best fit the association pattern between the re-sequenced cases and controls. We identified three candidate genes that were highly associated with glioma susceptibility: CAMKK2, P2RX7 and DENR. CAMKK2 showed reduced expression in both canine and human brain tumors, and a non-synonymous variant in P2RX7, previously demonstrated to have a 50% decrease in receptor function, was also associated with disease. Thus, one or more of these genes appear to affect glioma susceptibility. PMID:27171399

  15. OP17 MICRORNA PROFILING USING SMALL RNA-SEQ IN PAEDIATRIC LOW GRADE GLIOMAS

    PubMed Central

    Jeyapalan, Jennie N.; Jones, Tania A.; Tatevossian, Ruth G.; Qaddoumi, Ibrahim; Ellison, David W.; Sheer, Denise

    2014-01-01

    INTRODUCTION: MicroRNAs regulate gene expression by targeting mRNAs for translational repression or degradation at the post-transcriptional level. In paediatric low-grade gliomas a few key genetic mutations have been identified, including BRAF fusions, FGFR1 duplications and MYB rearrangements. Our aim in the current study is to profile aberrant microRNA expression in paediatric low-grade gliomas and determine the role of epigenetic changes in the aetiology and behaviour of these tumours. METHOD: MicroRNA profiling of tumour samples (6 pilocytic, 2 diffuse, 2 pilomyxoid astrocytomas) and normal brain controls (4 adult normal brain samples and a primary glial progenitor cell-line) was performed using small RNA sequencing. Bioinformatic analysis included sequence alignment, analysis of the number of reads (CPM, counts per million) and differential expression. RESULTS: Sequence alignment identified 695 microRNAs, whose expression was compared in tumours v. normal brain. PCA and hierarchical clustering showed separate groups for tumours and normal brain. Computational analysis identified approximately 400 differentially expressed microRNAs in the tumours compared to matched location controls. Our findings will then be validated and integrated with extensive genetic and epigenetic information we have previously obtained for the full tumour cohort. CONCLUSION: We have identified microRNAs that are differentially expressed in paediatric low-grade gliomas. As microRNAs are known to target genes involved in the initiation and progression of cancer, they provide critical information on tumour pathogenesis and are an important class of biomarkers.
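
    The abstract above reports read counts as CPM (counts per million) without describing the pipeline used. As a minimal sketch of that normalization only, assuming raw per-microRNA read counts for a single sample are already available, the following scales each count by the sample's total reads. The function name and the example microRNA labels are hypothetical and are not taken from the study.

```python
def counts_per_million(raw_counts):
    """Scale raw read counts for one sample to counts per million (CPM).

    raw_counts: dict mapping a feature name (e.g. a microRNA) to its read count.
    Returns a dict with the same keys and CPM values as floats.
    """
    total_reads = sum(raw_counts.values())
    if total_reads == 0:
        raise ValueError("sample has no mapped reads; CPM is undefined")
    return {name: count * 1_000_000 / total_reads
            for name, count in raw_counts.items()}


# Hypothetical sample with 1,000 mapped reads in total.
example = {"miR-a": 50, "miR-b": 150, "miR-c": 800}
cpm = counts_per_million(example)
```

    Because CPM only corrects for sequencing depth, differential expression tools typically apply further normalization; that step is outside this sketch.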

  16. Comprehensive identification and profiling of host miRNAs in response to Singapore grouper iridovirus (SGIV) infection in grouper (Epinephelus coioides).

    PubMed

    Guo, Chuanyu; Cui, Huachun; Ni, Songwei; Yan, Yang; Qin, Qiwei

    2015-10-01

    microRNAs (miRNAs) are an evolutionarily conserved class of non-coding RNA molecules that participate in various biological processes. Employment of high-throughput screening strategies greatly prompts the investigation and profiling of miRNAs in diverse species. In recent years, grouper (Epinephelus spp.) aquaculture was severely affected by iridoviral diseases. However, knowledge regarding the host immune responses to viral infection, especially the miRNA-mediated immune regulatory roles, is rather limited. In this study, by employing Solexa deep sequencing approach, we identified 116 grouper miRNAs from grouper spleen-derived cells (GS). As expected, these miRNAs shared high sequence similarity with miRNAs identified in zebrafish (Danio rerio), pufferfish (Fugu rubripes), and other higher vertebrates. In the process of Singapore grouper iridovirus (SGIV) infection, 45 and 43 miRNAs with altered expression (>1.5-fold) were identified by miRNA microarray assays in grouper spleen tissues and GS cells, respectively. Furthermore, target prediction revealed 189 putative targets of these grouper miRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Computational analysis of ribonomics datasets identifies long non-coding RNA targets of γ-herpesviral miRNAs.

    PubMed

    Sethuraman, Sunantha; Thomas, Merin; Gay, Lauren A; Renne, Rolf

    2018-05-29

    Ribonomics experiments involving crosslinking and immuno-precipitation (CLIP) of Ago proteins have expanded the understanding of the miRNA targetome of several organisms. These techniques, collectively referred to as CLIP-seq, have been applied to identifying the mRNA targets of miRNAs expressed by Kaposi's Sarcoma-associated herpes virus (KSHV) and Epstein-Barr virus (EBV). However, these studies focused on identifying only those RNA targets of KSHV and EBV miRNAs that are known to encode proteins. Recent studies have demonstrated that long non-coding RNAs (lncRNAs) are also targeted by miRNAs. In this study, we performed a systematic re-analysis of published datasets from KSHV- and EBV-driven cancers. We used CLIP-seq data from lymphoma cells or EBV-transformed B cells, and a crosslinking, ligation and sequencing of hybrids dataset from KSHV-infected endothelial cells, to identify novel lncRNA targets of viral miRNAs. Here, we catalog the lncRNA targetome of KSHV and EBV miRNAs, and provide a detailed in silico analysis of lncRNA-miRNA binding interactions. Viral miRNAs target several hundred lncRNAs, including a subset previously shown to be aberrantly expressed in human malignancies. In addition, we identified thousands of lncRNAs to be putative targets of human miRNAs, suggesting that miRNA-lncRNA interactions broadly contribute to the regulation of gene expression.

  18. Analysis of microRNA profile of Anopheles sinensis by deep sequencing and bioinformatic approaches.

    PubMed

    Feng, Xinyu; Zhou, Xiaojian; Zhou, Shuisen; Wang, Jingwen; Hu, Wei

    2018-03-12

    microRNAs (miRNAs) are small non-coding RNAs widely identified in many mosquitoes. They are reported to play important roles in development, differentiation and innate immunity. However, miRNAs in Anopheles sinensis, one of the Chinese malaria mosquitoes, remain largely unknown. We investigated the global miRNA expression profile of An. sinensis using Illumina Hiseq 2000 sequencing. Meanwhile, we applied a bioinformatic approach to identify potential miRNAs in An. sinensis. The identified miRNA profiles were compared and analyzed by two approaches. The selected miRNAs from the sequencing result and the bioinformatic approach were confirmed with qRT-PCR. Moreover, target prediction, GO annotation and pathway analysis were carried out to understand the role of miRNAs in An. sinensis. We identified 49 conserved miRNAs and 12 novel miRNAs by next-generation high-throughput sequencing technology. In contrast, 43 miRNAs were predicted by the bioinformatic approach, of which two were assigned as novel. Comparative analysis of miRNA profiles by two approaches showed that 21 miRNAs were shared between them. Twelve novel miRNAs did not match any known miRNAs of any organism, indicating that they are possibly species-specific. Forty miRNAs were found in many mosquito species, indicating that these miRNAs are evolutionally conserved and may have critical roles in the process of life. Both the selected known and novel miRNAs (asi-miR-281, asi-miR-184, asi-miR-14, asi-miR-nov5, asi-miR-nov4, asi-miR-9383, and asi-miR-2a) could be detected by quantitative real-time PCR (qRT-PCR) in the sequenced sample, and the expression patterns of these miRNAs measured by qRT-PCR were in concordance with the original miRNA sequencing data. The predicted targets for the known and the novel miRNAs covered many important biological roles and pathways indicating the diversity of miRNA functions. We also found 21 conserved miRNAs and eight counterparts of target immune pathway genes in An. sinensis based on the analysis of An. gambiae. Our results provide the first lead to the elucidation of the miRNA profile in An. sinensis. Unveiling the roles of mosquito miRNAs will undoubtedly lead to a better understanding of mosquito biology and mosquito-pathogen interactions. This work lays the foundation for the further functional study of An. sinensis miRNAs and will facilitate their application in vector control.

  19. A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.

    PubMed

    Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A

    1999-12-20

    The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
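
    The triple-A motif described above can be illustrated with a simple sequence scan. This is a sketch, not the authors' method: it assumes the heptamer G/CAAACA/TC reads as [G or C]-A-A-A-C-[A or T]-C, and it reports sites on both strands by also searching for the reverse complement. The function name and the test sequence are hypothetical.

```python
import re

# Assumed reading of the heptamer "G/CAAACA/TC": [GC] A A A C [AT] C.
# Lookaheads are used so that overlapping sites would also be reported.
FORWARD_SITE = re.compile(r"(?=([GC]AAAC[AT]C))")
# Reverse complement of the pattern above, for sites on the opposite strand.
REVERSE_SITE = re.compile(r"(?=(G[AT]GTTT[GC]))")


def find_triple_a_sites(sequence):
    """Return sorted (start, strand, matched_text) tuples for candidate sites.

    start is the 0-based position on the given (forward) sequence;
    strand is "+" for a forward match and "-" for a reverse-complement match.
    """
    seq = sequence.upper()
    hits = [(m.start(), "+", m.group(1)) for m in FORWARD_SITE.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in REVERSE_SITE.finditer(seq)]
    return sorted(hits)


# Made-up sequence with one site on each strand.
sites = find_triple_a_sites("TTGAAACACGGGAGTTTCAA")
```

    A pattern match only flags candidate sites; the abstract's evidence for binding comes from in vitro assays, which a string search cannot replace.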

  20. Targeting neuropilin-1 in human leukemia and lymphoma.

    PubMed

    Karjalainen, Katja; Jaalouk, Diana E; Bueso-Ramos, Carlos E; Zurita, Amado J; Kuniyasu, Akihiko; Eckhardt, Bedrich L; Marini, Frank C; Lichtiger, Benjamin; O'Brien, Susan; Kantarjian, Hagop M; Cortes, Jorge E; Koivunen, Erkki; Arap, Wadih; Pasqualini, Renata

    2011-01-20

    Targeted drug delivery offers an opportunity for the development of safer and more effective therapies for the treatment of cancer. In this study, we sought to identify short, cell-internalizing peptide ligands that could serve as directive agents for specific drug delivery in hematologic malignancies. By screening of human leukemia cells with a combinatorial phage display peptide library, we isolated a peptide motif, sequence Phe-Phe/Tyr-Any-Leu-Arg-Ser (F(F)/(Y)XLRS), which bound to different leukemia cell lines and to patient-derived bone marrow samples. The motif was internalized through a receptor-mediated pathway, and we next identified the corresponding receptor as the transmembrane glycoprotein neuropilin-1 (NRP-1). Moreover, we observed a potent anti-leukemia cell effect when the targeting motif was synthesized in tandem to the pro-apoptotic sequence (D)(KLAKLAK)₂. Finally, our results confirmed increased expression of NRP-1 in representative human leukemia and lymphoma cell lines and in a panel of bone marrow specimens obtained from patients with acute lymphoblastic leukemia or acute myelogenous leukemia compared with normal bone marrow. These results indicate that NRP-1 could potentially be used as a target for ligand-directed therapy in human leukemias and lymphomas and that the prototype CGFYWLRSC-GG-(D)(KLAKLAK)₂ is a promising drug candidate in this setting.
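
    To make the peptide motif above concrete, the following sketch checks a one-letter amino acid sequence for Phe-Phe/Tyr-Any-Leu-Arg-Ser. It assumes "Any" means any single residue, written here as any uppercase letter, and the function name is hypothetical. The example input is the cyclic targeting portion of the prototype peptide named in the abstract.

```python
import re

# Phe, then Phe or Tyr, then any one residue, then Leu-Arg-Ser.
NRP1_MOTIF = re.compile(r"F[FY][A-Z]LRS")


def find_motif(peptide):
    """Return (start, matched_text) for the first motif occurrence, or None.

    peptide: amino acid sequence in one-letter code; start is 0-based.
    """
    match = NRP1_MOTIF.search(peptide.upper())
    if match is None:
        return None
    return match.start(), match.group()


hit = find_motif("CGFYWLRSC")    # targeting portion of the prototype peptide
miss = find_motif("CGAYWLRSC")   # first Phe replaced, so no motif is found
```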

  1. The Alveolate Perkinsus marinus: Biological Insights from EST Gene Discovery

    PubMed Central

    2010-01-01

    Background: Perkinsus marinus, a protozoan parasite of the eastern oyster Crassostrea virginica, has devastated natural and farmed oyster populations along the Atlantic and Gulf coasts of the United States. It is classified as a member of the Perkinsozoa, a recently established phylum considered close to the ancestor of ciliates, dinoflagellates, and apicomplexans, and a key taxon for understanding unique adaptations (e.g. parasitism) within the Alveolata. Despite intense parasite pressure, no disease-resistant oysters have been identified and no effective therapies have been developed to date. Results: To gain insight into the biological basis of the parasite's virulence and pathogenesis mechanisms, and to identify genes encoding potential targets for intervention, we generated >31,000 5' expressed sequence tags (ESTs) derived from four trophozoite libraries generated from two P. marinus strains. Trimming and clustering of the sequence tags yielded 7,863 unique sequences, some of which carry a spliced leader. Similarity searches revealed that 55% of these had hits in protein sequence databases, of which 1,729 had their best hit with proteins from the chromalveolates (E-value ≤ 1e-5). Some sequences are similar to those proven to be targets for effective intervention in other protozoan parasites, and include not only proteases, antioxidant enzymes, and heat shock proteins, but also those associated with relict plastids, such as acetyl-CoA carboxylase and methyl erythritol phosphate pathway components, and those involved in glycan assembly, protein folding/secretion, and parasite-host interactions. Conclusions: Our transcriptome analysis of P. marinus, the first for any member of the Perkinsozoa, contributes new insight into its biology and taxonomic position. It provides a very informative, albeit preliminary, glimpse into the expression of genes encoding functionally relevant proteins as potential targets for chemotherapy, and evidence for the presence of a relict plastid. Further, although P. marinus sequences display significant similarity to those from both apicomplexans and dinoflagellates, the presence of trans-spliced transcripts confirms the previously established affinities with the latter. The EST analysis reported herein, together with the recently completed sequence of the P. marinus genome and the development of transfection methodology, should result in improved intervention strategies against dermo disease. PMID:20374649

  2. Specific and Modular Binding Code for Cytosine Recognition in Pumilio/FBF (PUF) RNA-binding Domains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong, Shuyun; Wang, Yang; Cassidy-Amstutz, Caleb

    2011-10-28

    Pumilio/fem-3 mRNA-binding factor (PUF) proteins possess a recognition code for bases A, U, and G, allowing designed RNA sequence specificity of their modular Pumilio (PUM) repeats. However, recognition side chains in a PUM repeat for cytosine are unknown. Here we report identification of a cytosine-recognition code by screening random amino acid combinations at conserved RNA recognition positions using a yeast three-hybrid system. This C-recognition code is specific and modular as specificity can be transferred to different positions in the RNA recognition sequence. A crystal structure of a modified PUF domain reveals specific contacts between an arginine side chain and the cytosine base. We applied the C-recognition code to design PUF domains that recognize targets with multiple cytosines and to generate engineered splicing factors that modulate alternative splicing. Finally, we identified a divergent yeast PUF protein, Nop9p, that may recognize natural target RNAs with cytosine. This work deepens our understanding of natural PUF protein target recognition and expands the ability to engineer PUF domains to recognize any RNA sequence.

  3. Species-specific identification of Dekkera/Brettanomyces yeasts by fluorescently labeled DNA probes targeting the 26S rRNA.

    PubMed

    Röder, Christoph; König, Helmut; Fröhlich, Jürgen

    2007-09-01

    Sequencing of the complete 26S rRNA genes of all Dekkera/Brettanomyces species colonizing different beverages revealed the potential for a specific primer and probe design to support diagnostic PCR approaches and FISH. By analysis of the complete 26S rRNA genes of all five currently known Dekkera/Brettanomyces species (Dekkera bruxellensis, D. anomala, Brettanomyces custersianus, B. nanus and B. naardenensis), several regions with high nucleotide sequence variability yet distinct from the D1/D2 domains were identified. FISH species-specific probes targeting the 26S rRNA gene's most variable regions were designed. Accessibility of probe targets for hybridization was facilitated by the construction of partially complementary 'side'-labeled probes, based on secondary structure models of the rRNA sequences. The specificity and routine applicability of the FISH-based method for yeast identification were tested by analyzing different wine isolates. Investigation of the prevalence of Dekkera/Brettanomyces yeasts in the German viticultural regions Wonnegau, Nierstein and Bingen (Rhinehesse, Rhineland-Palatinate) resulted in the isolation of 37 D. bruxellensis strains from 291 wine samples.

  4. TARGET Research Goals

    Cancer.gov

    TARGET researchers use various sequencing and array-based methods to examine the genomes, transcriptomes, and for some diseases epigenomes of select childhood cancers. This “multi-omic” approach generates a comprehensive profile of molecular alterations for each cancer type. Alterations are changes in DNA or RNA, such as rearrangements in chromosome structure or variations in gene expression, respectively. Through computational analyses and assays to validate biological function, TARGET researchers predict which alterations disrupt the function of a gene or pathway and promote cancer growth, progression, and/or survival. Researchers identify candidate therapeutic targets and/or prognostic markers from the cancer-associated alterations.

  5. Targeted sequencing identifies novel variants involved in autosomal recessive hereditary hearing loss in Qatari families.

    PubMed

    Alkowari, Moza K; Vozzi, Diego; Bhagat, Shruti; Krishnamoorthy, Navaneethakrishnan; Morgan, Anna; Hayder, Yousra; Logendra, Barathy; Najjar, Nehal; Gandin, Ilaria; Gasparini, Paolo; Badii, Ramin; Girotto, Giorgia; Abdulhadi, Khalid

    2017-08-01

    Hereditary hearing loss (HHL) is characterized by a very high genetic heterogeneity. In the Qatari population the role of GJB2, the worldwide HHL major player, seems to be quite limited compared to Caucasian populations. In this study we analysed 18 Qatari families affected by non-syndromic hearing loss using a targeted sequencing approach that allowed us to analyse 81 genes simultaneously. Thanks to this approach, 50% of these families (9 out of 18) tested positive for the presence of likely causative alleles in 6 different genes: CDH23, MYO6, GJB6, OTOF, TMC1 and OTOA. In particular, 4 novel alleles were detected while the remaining ones were already described to be associated with HHL in other ethnic groups. Molecular modelling has been used to further investigate the role of novel alleles identified in CDH23 and TMC1 genes demonstrating their crucial role in Ca2+ binding and therefore possible functional role in proteins. The present study showed that an accurate molecular diagnosis based on next generation sequencing technologies might largely improve molecular diagnostics outcome leading to benefits for both genetic counseling and definition of recurrence risk. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Spectrum of mutations in leiomyosarcomas identified by clinical targeted next-generation sequencing.

    PubMed

    Lee, Paul J; Yoo, Naomi S; Hagemann, Ian S; Pfeifer, John D; Cottrell, Catherine E; Abel, Haley J; Duncavage, Eric J

    2017-02-01

    Recurrent genomic mutations in uterine and non-uterine leiomyosarcomas have not been well established. Using a next generation sequencing (NGS) panel of common cancer-associated genes, 25 leiomyosarcomas arising from multiple sites were examined to explore genetic alterations, including single nucleotide variants (SNV), small insertions/deletions (indels), and copy number alterations (CNA). Sequencing showed 86 non-synonymous, coding region somatic variants within 151 gene targets in 21 cases, with a mean of 4.1 variants per case; 4 cases had no putative mutations in the panel of genes assayed. The most frequently altered genes were TP53 (36%), ATM and ATRX (16%), and EGFR and RB1 (12%). CNA were identified in 85% of cases, with the most frequent copy number losses observed in chromosomes 10 and 13 including PTEN and RB1; the most frequent gains were seen in chromosomes 7 and 17. Our data show that deletions in canonical cancer-related genes are common in leiomyosarcomas. Further, the spectrum of gene mutations observed shows that defects in DNA repair and chromosomal maintenance are central to the biology of leiomyosarcomas, and that activating mutations observed in other common cancer types are rare in leiomyosarcomas. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. A branch-migration based fluorescent probe for straightforward, sensitive and specific discrimination of DNA mutations

    PubMed Central

    Xiao, Xianjin; Wu, Tongbo; Xu, Lei; Chen, Wei

    2017-01-01

    Genetic mutations are important biomarkers for cancer diagnostics and surveillance. Preferably, the methods for mutation detection should be straightforward, highly specific and sensitive to low-level mutations within various sequence contexts, fast and applicable at room-temperature. Though some of the currently available methods have shown very encouraging results, their discrimination efficiency is still very low. Herein, we demonstrate a branch-migration based fluorescent probe (BM probe) which is able to identify the presence of known or unknown single-base variations at abundances down to 0.3%-1% within 5 min, even in highly GC-rich sequence regions. The discrimination factors between the perfect-match target and single-base mismatched target are determined to be 89–311 by measurement of their respective branch-migration products via polymerase elongation reactions. The BM probe not only enabled sensitive detection of two types of EGFR-associated point mutations located in GC-rich regions, but also successfully identified the BRAF V600E mutation in the serum from a thyroid cancer patient which could not be detected by the conventional sequencing method. The new method would be an ideal choice for high-throughput in vitro diagnostics and precise clinical treatment. PMID:28201758

  8. In silico re-identification of properties of drug target proteins.

    PubMed

    Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju

    2017-05-31

    Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets and human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. We first reassessed the widely used properties of drug target proteins. We confirmed and extended the findings that drug target proteins (1) are likely to be more hydrophobic and less polar, with fewer PEST sequences and more signal peptide sequences, and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained. When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions.
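
    The F-score reported above combines precision and recall into a single number. The sketch below shows the standard definition; the function name, the beta parameter, and the example counts are illustrative assumptions, since the abstract does not state which F-score variant or which confusion-matrix counts were used.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from confusion-matrix counts.

    tp, fp, fn are true positives, false positives and false negatives.
    beta=1.0 gives the usual F1 score (harmonic mean of precision and recall).
    """
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, not taken from the study.
example = f_score(tp=80, fp=20, fn=20)
```

    With equal precision and recall, the F1 score equals both of them, which is why the hypothetical counts above give 0.8.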

  9. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.

  10. UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

    PubMed

    Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

    2016-01-04

    The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. The chaperonin-60 universal target is a barcode for bacteria that enables de novo assembly of metagenomic sequence data.

    PubMed

    Links, Matthew G; Dumonceaux, Tim J; Hemmingsen, Sean M; Hill, Janet E

    2012-01-01

    Barcoding with molecular sequences is widely used to catalogue eukaryotic biodiversity. Studies investigating the community dynamics of microbes have relied heavily on gene-centric metagenomic profiling using two genes (16S rRNA and cpn60) to identify and track Bacteria. While there have been criteria formalized for barcoding of eukaryotes, these criteria have not been used to evaluate gene targets for other domains of life. Using the framework of the International Barcode of Life we evaluated DNA barcodes for Bacteria. Candidates from the 16S rRNA gene and the protein coding cpn60 gene were evaluated. Within complete bacterial genomes in the public domain representing 983 species from 21 phyla, the largest difference between median pairwise inter- and intra-specific distances ("barcode gap") was found from cpn60. Distribution of sequence diversity along the ∼555 bp cpn60 target region was remarkably uniform. The barcode gap of the cpn60 universal target facilitated the faithful de novo assembly of full-length operational taxonomic units from pyrosequencing data from a synthetic microbial community. Analysis supported the recognition of both 16S rRNA and cpn60 as DNA barcodes for Bacteria. The cpn60 universal target was found to have a much larger barcode gap than 16S rRNA suggesting cpn60 as a preferred barcode for Bacteria. A large barcode gap for cpn60 provided a robust target for species-level characterization of data. The assembly of consensus sequences for barcodes was shown to be a reliable method for the identification and tracking of novel microbes in metagenomic studies.
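
    The "barcode gap" described above is the difference between the median pairwise inter-specific and intra-specific distances. A minimal sketch of that calculation follows. It assumes pre-aligned sequences of equal length and uses a simple proportion-of-mismatches distance; the abstract does not specify the distance measure, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import median


def p_distance(a: str, b: str) -> float:
    """Proportion of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records: list[tuple[str, str]]) -> float:
    """Median inter-specific distance minus median intra-specific distance.

    records is a list of (species_label, aligned_sequence) pairs.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        # Same label -> within-species pair; different label -> between-species pair.
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need at least one intra- and one inter-specific pair")
    return median(inter) - median(intra)


# Toy data (hypothetical): two species, two sequences each.
toy = [("A", "ACGTACGT"), ("A", "ACGTACGA"),
       ("B", "TCGTTCGT"), ("B", "TCGTTCGA")]
gap = barcode_gap(toy)
```

    A larger positive gap means sequences from different species are, at the median, much further apart than sequences from the same species, which is the property the abstract reports for cpn60.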

  12. Identifying molecular drivers of gastric cancer through next-generation sequencing.

    PubMed

    Liang, Han; Kim, Yon Hui

    2013-11-01

    Gastric cancer is the second most common cause of cancer-related death in the world, representing a major global health issue. The high mortality rate is largely due to the lack of effective medical treatment for advanced stages of this disease. Recently next-generation sequencing (NGS) technology has become a revolutionary tool for cancer research, and several NGS studies in gastric cancer have been published. Here we review the insights gained from these studies regarding how to use NGS to elucidate the molecular basis of gastric cancer and identify potential therapeutic targets. We also discuss the challenges and future directions of such efforts. Published by Elsevier Ireland Ltd.

  13. CCTop: An Intuitive, Flexible and Reliable CRISPR/Cas9 Target Prediction Tool

    PubMed Central

    del Sol Keyer, Maria; Wittbrodt, Joachim; Mateo, Juan L.

    2015-01-01

    Engineering of the CRISPR/Cas9 system has opened a plethora of new opportunities for site-directed mutagenesis and targeted genome modification. Fundamental to this is a stretch of twenty nucleotides at the 5’ end of a guide RNA that provides specificity to the bound Cas9 endonuclease. Since a sequence of twenty nucleotides can occur multiple times in a given genome and some mismatches seem to be accepted by the CRISPR/Cas9 complex, an efficient and reliable in silico selection and evaluation of the targeting site is a key prerequisite for experimental success. Here we present the CRISPR/Cas9 target online predictor (CCTop, http://crispr.cos.uni-heidelberg.de) to overcome limitations of already available tools. CCTop provides an intuitive user interface with reasonable default parameters that can easily be tuned by the user. From a given query sequence, CCTop identifies and ranks all candidate sgRNA target sites according to their off-target quality and displays full documentation. CCTop was experimentally validated for gene inactivation, non-homologous end-joining as well as homology directed repair. Thus, CCTop provides the bench biologist with a tool for the rapid and efficient identification of high quality target sites. PMID:25909470
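
    The first step described above, listing candidate twenty-nucleotide target sites in a query sequence, can be sketched as below. This is not CCTop's implementation: it assumes the NGG protospacer-adjacent motif of the commonly used Streptococcus pyogenes Cas9 (the abstract does not mention a PAM), it performs no off-target search or ranking, and all function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_sites(seq: str, protospacer_len: int = 20):
    """List (strand, offset, protospacer) for every site followed by an NGG PAM.

    Offsets are 0-based and relative to the strand that was scanned, so
    "-" strand offsets refer to the reverse-complemented query.
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % protospacer_len)
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1)))
    return sites


# Hypothetical query: one protospacer followed by a TGG PAM.
sites = find_candidate_sites("A" * 20 + "TGG")
```

    A real tool would then score each candidate by searching the genome for near matches, which is the part the abstract describes as ranking by off-target quality.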

  14. TargetM6A: Identifying N6-Methyladenosine Sites From RNA Sequences via Position-Specific Nucleotide Propensities and a Support Vector Machine.

    PubMed

    Li, Guang-Qing; Liu, Zi; Shen, Hong-Bin; Yu, Dong-Jun

    2016-10-01

    As one of the most ubiquitous post-transcriptional modifications of RNA, N6-methyladenosine (m6A) plays an essential role in many vital biological processes. The identification of m6A sites in RNAs is important for both basic biomedical research and practical drug development. In this study, we designed a computational method, called TargetM6A, to rapidly and accurately target m6A sites solely from the primary RNA sequences. Two new features, i.e., position-specific nucleotide/dinucleotide propensities (PSNP/PSDP), are introduced and combined with the traditional nucleotide composition (NC) feature to formulate RNA sequences. The extracted features are further optimized to obtain a much more compact and discriminative feature subset by applying an incremental feature selection (IFS) procedure. Based on the optimized feature subset, we trained TargetM6A on the training dataset with a support vector machine (SVM) as the prediction engine. We compared the proposed TargetM6A method with existing methods for predicting m6A sites by performing stringent jackknife tests and independent validation tests on benchmark datasets. The experimental results show that the proposed TargetM6A method outperformed the existing methods for predicting m6A sites and remarkably improved the prediction performances, with MCC = 0.526 and AUC = 0.818. We also provided a user-friendly web server for TargetM6A, which is publicly accessible for academic use at http://csbio.njust.edu.cn/bioinf/TargetM6A.
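
    The MCC quoted above is the Matthews correlation coefficient, which summarizes a binary confusion matrix as a value between -1 and 1. A minimal sketch of the standard formula follows; the function name and the example counts are illustrative and are not taken from the study.

```python
from math import sqrt


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / sqrt(denom)


# Hypothetical counts: a perfect classifier and a chance-level one.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
```

    Unlike accuracy, the coefficient stays near zero for a classifier that guesses at random even when the two classes are unbalanced, which is one reason it is often reported for site-prediction benchmarks.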

  15. Combined hairpin-antisense compositions and methods for modulating expression

    DOEpatents

    Shanklin, John; Nguyen, Tam

    2014-08-05

    A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.

  16. Combined hairpin-antisense compositions and methods for modulating expression

    DOEpatents

    Shanklin, John; Nguyen, Tam Huu

    2015-11-24

    A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.

  17. Exome and deep sequencing of clinically aggressive neuroblastoma reveal somatic mutations that affect key pathways involved in cancer progression

    PubMed Central

    Lasorsa, Vito Alessandro; Formicola, Daniela; Pignataro, Piero; Cimmino, Flora; Calabrese, Francesco Maria; Mora, Jaume; Esposito, Maria Rosaria; Pantile, Marcella; Zanon, Carlo; De Mariano, Marilena; Longo, Luca; Hogarty, Michael D.; de Torres, Carmen; Tonini, Gian Paolo; Iolascon, Achille; Capasso, Mario

    2016-01-01

    The spectrum of somatic mutation of the most aggressive forms of neuroblastoma is not completely determined. We sought to identify potential cancer drivers in clinically aggressive neuroblastoma. Whole exome sequencing was conducted on 17 germline and tumor DNA samples from high-risk patients with adverse events within 36 months from diagnosis (HR-Event3) to identify somatic mutations, and deep targeted sequencing of 134 genes selected from the initial screening was performed in an additional 48 germline and tumor pairs (62.5% HR-Event3 and high-risk patients), 17 HR-Event3 tumors and 17 human-derived neuroblastoma cell lines. We revealed 22 significantly mutated genes, many of which are implicated in cancer progression. Fifteen genes (68.2%) were highly expressed in neuroblastoma, supporting their involvement in the disease. CHD9, a cancer driver gene, was the most significantly altered (4.0% of cases) after ALK. Other genes (PTK2, NAV3, NAV1, FZD1 and ATRX), expressed in neuroblastoma and involved in cell invasion and migration, were mutated at frequencies ranging from 2% to 4%. Focal adhesion and regulation of actin cytoskeleton pathways were frequently disrupted (14.1% of cases), suggesting potential novel therapeutic strategies to prevent disease progression. Notably, BARD1, CHEK2 and AXIN2 were enriched in rare, potentially pathogenic, germline variants. In summary, whole exome and deep targeted sequencing identified novel cancer genes of clinically aggressive neuroblastoma. Our analyses show pathway-level implications of infrequently mutated genes in leading neuroblastoma progression. PMID:27009842

  18. Exome and deep sequencing of clinically aggressive neuroblastoma reveal somatic mutations that affect key pathways involved in cancer progression.

    PubMed

    Lasorsa, Vito Alessandro; Formicola, Daniela; Pignataro, Piero; Cimmino, Flora; Calabrese, Francesco Maria; Mora, Jaume; Esposito, Maria Rosaria; Pantile, Marcella; Zanon, Carlo; De Mariano, Marilena; Longo, Luca; Hogarty, Michael D; de Torres, Carmen; Tonini, Gian Paolo; Iolascon, Achille; Capasso, Mario

    2016-04-19

    The spectrum of somatic mutation of the most aggressive forms of neuroblastoma is not completely determined. We sought to identify potential cancer drivers in clinically aggressive neuroblastoma. Whole exome sequencing was conducted on 17 germline and tumor DNA samples from high-risk patients with adverse events within 36 months from diagnosis (HR-Event3) to identify somatic mutations, and deep targeted sequencing of 134 genes selected from the initial screening was performed in an additional 48 germline and tumor pairs (62.5% HR-Event3 and high-risk patients), 17 HR-Event3 tumors and 17 human-derived neuroblastoma cell lines. We revealed 22 significantly mutated genes, many of which are implicated in cancer progression. Fifteen genes (68.2%) were highly expressed in neuroblastoma, supporting their involvement in the disease. CHD9, a cancer driver gene, was the most significantly altered (4.0% of cases) after ALK. Other genes (PTK2, NAV3, NAV1, FZD1 and ATRX), expressed in neuroblastoma and involved in cell invasion and migration, were mutated at frequencies ranging from 2% to 4%. Focal adhesion and regulation of actin cytoskeleton pathways were frequently disrupted (14.1% of cases), suggesting potential novel therapeutic strategies to prevent disease progression. Notably, BARD1, CHEK2 and AXIN2 were enriched in rare, potentially pathogenic, germline variants. In summary, whole exome and deep targeted sequencing identified novel cancer genes of clinically aggressive neuroblastoma. Our analyses show pathway-level implications of infrequently mutated genes in leading neuroblastoma progression.

  19. Differential signatures of bacterial and mammalian IMP dehydrogenase enzymes.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, R.; Evans, G.; Rotella, F.

    1999-06-01

    IMP dehydrogenase (IMPDH) is an essential enzyme of de novo guanine nucleotide synthesis. IMPDH inhibitors have clinical utility as antiviral, anticancer or immunosuppressive agents. The essential nature of this enzyme suggests its therapeutic applications may be extended to the development of antimicrobial agents. Bacterial IMPDH enzymes show biochemical and kinetic characteristics that are different from those of the mammalian IMPDH enzymes, suggesting IMPDH may be an attractive target for the development of antimicrobial agents. We suggest that the biochemical and kinetic differences between bacterial and mammalian enzymes are a consequence of the variance of specific, identifiable amino acid residues. Identification of these residues or combination of residues that impart this mammalian or bacterial enzyme signature is a prerequisite for the rational identification of agents that specifically target the bacterial enzyme. We used sequence alignments of IMPDH proteins to identify sequence signatures associated with bacterial or eukaryotic IMPDH enzymes. These selections were further refined to discern those likely to have a role in catalysis using information derived from the bacterial and mammalian IMPDH crystal structures and site-specific mutagenesis. Candidate bacterial sequence signatures identified by this process include regions involved in subunit interactions, the active site flap and the NAD binding region. Analysis of sequence alignments in these regions indicates a pattern of catalytic residues conserved in all enzymes and a secondary pattern of amino acid conservation associated with the major phylogenetic groups. Elucidation of the basis for this mammalian/bacterial IMPDH signature will provide insight into the catalytic mechanism of this enzyme and the foundation for the development of highly specific inhibitors.

  20. Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ecale Zhou, C L; Zemla, A T; Roe, D

    2005-01-29

    Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides, aptamers, or other small molecules require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context, and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics, and vaccines.

  1. Characteristics of HIV-infected U.S. Army soldiers linked in molecular transmission clusters, 2001-2012

    PubMed Central

    Jagodzinski, Linda L.; Liu, Ying; Pham, Peter T.; Kijak, Gustavo H.; Tovanabutra, Sodsai; McCutchan, Francine E.; Scoville, Stephanie L.; Cersovsky, Steven B.; Michael, Nelson L.; Scott, Paul T.; Peel, Sheila A.

    2017-01-01

    Objective: Recent surveillance data suggest the United States (U.S.) Army HIV epidemic is concentrated among men who have sex with men. To identify potential targets for HIV prevention strategies, the relationship between demographic and clinical factors and membership within transmission clusters based on baseline pol sequences of HIV-infected Soldiers from 2001 through 2012 was analyzed. Methods: We conducted a retrospective analysis of baseline partial pol sequences, demographic and clinical characteristics available for all Soldiers in active service and newly diagnosed with HIV-1 infection from January 1, 2001 through December 31, 2012. HIV-1 subtype designations and transmission clusters were identified from phylogenetic analysis of sequences. Univariate and multivariate logistic regression models were used to evaluate and adjust for the association between characteristics and cluster membership. Results: Among 518 of 995 HIV-infected Soldiers with available partial pol sequences, 29% were members of a transmission cluster. Assignment to a southern U.S. region at diagnosis and year of diagnosis were independently associated with cluster membership after adjustment for other significant characteristics (p<0.10) of age, race, year of diagnosis, region of duty assignment, sexually transmitted infections, last negative HIV test, antiretroviral therapy, and transmitted drug resistance. Subtyping of the pol fragment indicated HIV-1 subtype B infection predominated (94%) among HIV-infected Soldiers. Conclusion: These findings identify areas to explore as HIV prevention targets in the U.S. Army. An increased frequency of current force testing may be justified, especially among Soldiers assigned to duty in installations with high local HIV prevalence such as southern U.S. states. PMID:28759645

  2. Muscle RAS oncogene homolog (MRAS) recurrent mutation in Borrmann type IV gastric cancer.

    PubMed

    Yasumoto, Makiko; Sakamoto, Etsuko; Ogasawara, Sachiko; Isobe, Taro; Kizaki, Junya; Sumi, Akiko; Kusano, Hironori; Akiba, Jun; Torimura, Takuji; Akagi, Yoshito; Itadani, Hiraku; Kobayashi, Tsutomu; Hasako, Shinichi; Kumazaki, Masafumi; Mizuarai, Shinji; Oie, Shinji; Yano, Hirohisa

    2017-01-01

    The prognosis of patients with Borrmann type IV gastric cancer (Type IV) is extremely poor. Thus, there is an urgent need to elucidate the molecular mechanisms underlying the oncogenesis of Type IV and to identify new therapeutic targets. Although previous studies using whole-exome and whole-genome sequencing have elucidated genomic alterations in gastric cancer, none has focused on comprehensive genetic analysis of Type IV. To discover cancer-relevant genes in Type IV, we performed whole-exome sequencing and genome-wide copy number analysis on 13 patients with Type IV. Exome sequencing identified 178 somatic mutations in protein-coding sequences or at splice sites. Among the mutations, we found a mutation in muscle RAS oncogene homolog (MRAS), which is predicted to cause molecular dysfunction. MRAS belongs to the Ras subgroup of small G proteins, which includes the prototypic RAS oncogenes. We analyzed an additional 46 Type IV samples to investigate the frequency of MRAS mutation. There were eight nonsynonymous mutations (mutation frequency, 17%), showing that MRAS is recurrently mutated in Type IV. Copy number analysis identified six focal amplifications and one homozygous deletion, including insulin-like growth factor 1 receptor (IGF1R) amplification. The samples with IGF1R amplification had remarkably higher IGF1R mRNA and protein expression levels compared with the other samples. This is the first report of MRAS recurrent mutation in human tumor samples. Our results suggest that MRAS mutation and IGF1R amplification could drive tumorigenesis of Type IV and could be new therapeutic targets. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

  3. The Genome Sequence of the Rumen Methanogen Methanobrevibacter ruminantium Reveals New Possibilities for Controlling Ruminant Methane Emissions

    PubMed Central

    Leahy, Sinead C.; Kelly, William J.; Altermann, Eric; Ronimus, Ron S.; Yeoman, Carl J.; Pacheco, Diana M.; Li, Dong; Kong, Zhanhao; McTavish, Sharla; Sang, Carrie; Lambie, Suzanne C.; Janssen, Peter H.; Dey, Debjit; Attwood, Graeme T.

    2010-01-01

    Background: Methane (CH4) is a potent greenhouse gas (GHG), having a global warming potential 21 times that of carbon dioxide (CO2). Methane emissions from agriculture represent around 40% of the emissions produced by human-related activities, the single largest source being enteric fermentation, mainly in ruminant livestock. Technologies to reduce these emissions are lacking. Ruminant methane is formed by the action of methanogenic archaea typified by Methanobrevibacter ruminantium, which is present in ruminants fed a wide variety of diets worldwide. To gain more insight into the lifestyle of a rumen methanogen, and to identify genes and proteins that can be targeted to reduce methane production, we have sequenced the 2.93 Mb genome of M. ruminantium M1, the first rumen methanogen genome to be completed. Methodology/Principal Findings: The M1 genome was sequenced, annotated and subjected to comparative genomic and metabolic pathway analyses. Conserved and methanogen-specific gene sets suitable as targets for vaccine development or chemogenomic-based inhibition of rumen methanogens were identified. The feasibility of using a synthetic peptide-directed vaccinology approach to target epitopes of methanogen surface proteins was demonstrated. A prophage genome was described and its lytic enzyme, endoisopeptidase PeiR, was shown to lyse M1 cells in pure culture. A predicted stimulation of M1 growth by alcohols was demonstrated and microarray analyses indicated up-regulation of methanogenesis genes during co-culture with a hydrogen (H2) producing rumen bacterium. We also report the discovery of non-ribosomal peptide synthetases in M. ruminantium M1, the first reported in archaeal species. Conclusions/Significance: The M1 genome sequence provides new insights into the lifestyle and cellular processes of this important rumen methanogen. It also defines vaccine and chemogenomic targets for broad inhibition of rumen methanogens and represents a significant contribution to worldwide efforts to mitigate ruminant methane emissions and reduce production of anthropogenic greenhouse gases. PMID:20126622

  4. In silico genome wide mining of conserved and novel miRNAs in the brain and pineal gland of Danio rerio using small RNA sequencing data.

    PubMed

    Agarwal, Suyash; Nagpure, Naresh Sahebrao; Srivastava, Prachi; Kushwaha, Basdeo; Kumar, Ravindra; Pandey, Manmohan; Srivastava, Shreya

    2016-03-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that bind to the mRNA of the target genes and regulate the expression of the gene at the post-transcriptional level. Zebrafish is an economically important freshwater fish species globally considered as a good predictive model for studying human diseases and development. The present study focused on uncovering known as well as novel miRNAs, target prediction of the novel miRNAs and the differential expression of the known miRNAs using the small RNA sequencing data of the brain and pineal gland (dark and light treatments) obtained from NCBI SRA. A total of 165, 151 and 145 known zebrafish miRNAs were found in the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Chromosomes 4 and 5 of zebrafish reference assembly GRCz10 were found to contain the maximum number of miR genes. The miR-181a and miR-182 were found to be highly expressed in terms of number of reads in the brain and pineal gland, respectively. Other ncRNAs, such as tRNA, rRNA and snoRNA, were curated against Rfam. Using GRCz10 as reference, the subsequent bioinformatic analyses identified 25, 19 and 9 novel miRNAs from the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Targets of the novel miRNAs were identified, based on sequence complementarity between miRNAs and mRNA, by searching for antisense hits in the 3'-UTR of reference RNA sequences of the zebrafish. The discovery of novel miRNAs and their targets in the zebrafish genome can be a valuable scientific resource for further functional studies not only in zebrafish but also in other economically important fishes.
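
    Searching for antisense hits in a 3'-UTR, as described above, amounts to looking for the reverse complement of part of the miRNA in the UTR sequence. The sketch below is an illustration only: restricting the match to miRNA positions 2-8 (often called the seed) is an assumption not stated in the abstract, exact matching ignores wobble pairs and mismatches, and the function names and sequences are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA string (T is read as U)."""
    return seq.upper().replace("T", "U").translate(_RNA_COMPLEMENT)[::-1]


def antisense_hits(mirna: str, utr: str, start: int = 1, end: int = 8) -> list[int]:
    """0-based UTR offsets where the reverse complement of mirna[start:end] occurs.

    The default slice covers miRNA positions 2-8 (1-based), an assumed seed region.
    """
    site = reverse_complement_rna(mirna[start:end])
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # step by one so overlapping hits are kept
    return hits


# Toy example: a 22-nt miRNA and a made-up UTR containing two matching sites.
hits = antisense_hits("UGAGGUAGUAGGUUGUAUAGUU", "AAACUACCUCAAACUACCUCGG")
```

    Dedicated target-prediction tools add scoring for site context and pairing energy, so exact-match hits such as these are best treated as candidates.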

  5. Whole-Exome Sequencing in Two Extreme Phenotypes of Response to VEGF-Targeted Therapies in Patients With Metastatic Clear Cell Renal Cell Carcinoma.

    PubMed

    Fay, Andre P; de Velasco, Guillermo; Ho, Thai H; Van Allen, Eliezer M; Murray, Bradley; Albiges, Laurence; Signoretti, Sabina; Hakimi, A Ari; Stanton, Melissa L; Bellmunt, Joaquim; McDermott, David F; Atkins, Michael B; Garraway, Levi A; Kwiatkowski, David J; Choueiri, Toni K

    2016-07-01

    Advances in next-generation sequencing have provided a unique opportunity to understand the biology of disease and mechanisms of sensitivity or resistance to specific agents. Renal cell carcinoma (RCC) is a heterogeneous disease and highly variable clinical responses have been observed with vascular endothelial growth factor (VEGF)-targeted therapy (VEGF-TT). We hypothesized that whole-exome sequencing analysis might identify genotypes associated with extreme response or resistance to VEGF-TT in metastatic RCC (mRCC). Patients with mRCC who had received first-line sunitinib or pazopanib and were in 2 extreme phenotypes of response were identified. Extreme responders (ERs) were defined as those with partial response or complete response for 3 or more years (n=13) and primary refractory patients (PRPs) were defined as those with progressive disease within the first 3 months of therapy (n=14). International Metastatic RCC Database Consortium prognostic scores were not significantly different between the groups (P=.67). Considering the genes known to be mutated in RCC at significant frequency, PBRM1 mutations were identified in 7 ERs (54%) versus 1 PRP (7%) (P=.01). In addition, mutations in TP53 (n=4) were found only in PRPs (P=.09). Our data suggest that mutations in some genes in RCC may impact response to VEGF-TT. Copyright © 2016 by the National Comprehensive Cancer Network.
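
    A comparison such as 7 of 13 ERs versus 1 of 14 PRPs is a 2x2 contingency table, and one common way to test it is Fisher's exact test. The abstract does not name the test that was used, so the sketch below is an assumption, and the function name is hypothetical. It computes a two-sided P value by summing the hypergeometric probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k "successes" in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# PBRM1 counts from the abstract: mutated / not mutated in ERs and PRPs.
p_pbrm1 = fisher_exact_two_sided(7, 6, 1, 13)
```

    For the PBRM1 counts this gives a P value of about 0.013, in line with the reported P=.01, although agreement on one comparison does not confirm which test the authors applied.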

  6. Discriminative prediction of mammalian enhancers from DNA sequence

    PubMed Central

    Lee, Dongwon; Karchin, Rachel; Beer, Michael A.

    2011-01-01

    Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers. PMID:21875935

  7. Identification of a novel MYO7A mutation in Usher syndrome type 1.

    PubMed

    Cheng, Ling; Yu, Hongsong; Jiang, Yan; He, Juan; Pu, Sisi; Li, Xin; Zhang, Li

    2018-01-05

    Usher syndrome (USH) is an autosomal recessive disease characterized by deafness and retinitis pigmentosa. In view of the high phenotypic and genetic heterogeneity in USH, performing genetic screening with traditional methods is impractical. In the present study, we carried out targeted next-generation sequencing (NGS) to uncover the underlying gene in an USH family (2 USH patients and 15 unaffected relatives). One hundred and thirty-five genes associated with inherited retinal degeneration were selected for deep exome sequencing. Subsequently, variant analysis, Sanger validation and segregation tests were utilized to identify the disease-causing mutations in this family. All affected individuals had a classic USH type I (USH1) phenotype which included deafness, vestibular dysfunction and retinitis pigmentosa. Targeted NGS and Sanger sequencing validation suggested that USH1 patients carried an unreported splice site mutation, c.5168+1G>A, as a compound heterozygous mutation with c.6070C>T (p.R2024X) in the MYO7A gene. A functional study revealed decreased expression of the MYO7A gene in the individuals carrying heterozygous mutations. In conclusion, targeted next-generation sequencing provided a comprehensive and efficient diagnosis for USH1. This study revealed the genetic defects in the MYO7A gene and expanded the spectrum of clinical phenotypes associated with USH1 mutations.

  8. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease.

    PubMed

    Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M

    2016-05-01

    To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Case series. A total of 562 patients diagnosed with IRD. We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Diagnostic yield of genomic testing. Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  9. Identification of the Quorum-Sensing Target DNA Sequence and N-Acyl Homoserine Lactone Responsiveness of the Brucella abortus virB promoter

    PubMed Central

    Arocena, Gastón M.; Sieira, Rodrigo; Comerci, Diego J.; Ugalde, Rodolfo A.

    2010-01-01

    VjbR is a LuxR-type quorum-sensing (QS) regulator that plays an essential role in the virulence of the intracellular facultative pathogen Brucella, the causative agent of brucellosis. It was previously described that VjbR regulates a diverse group of genes, including the virB operon. The latter codes for a type IV secretion system (T4SS) that is central for the pathogenesis of Brucella. Although the regulatory role of VjbR on the virB promoter (PvirB) was extensively studied by different groups, the VjbR-binding site had not been identified so far. Here, we identified the target DNA sequence of VjbR in PvirB by DNase I footprinting analyses. Surprisingly, we observed that VjbR specifically recognizes a sequence that is identical to a half-binding site of the QS-related regulator MrtR of Mesorhizobium tianshanense. As shown by DNase I footprinting and electrophoretic mobility shift assays, generation of a palindromic MrtR-like-binding site in PvirB increased both the affinity and the stability of the VjbR-DNA complex, which confirmed that the QS regulator of Brucella is highly related to that of M. tianshanense. The addition of N-dodecanoyl homoserine lactone dissociated VjbR from the promoter, which confirmed previous reports that indicated a negative effect of this signal on the VjbR-mediated activation of PvirB. Our results provide new molecular evidence for the structure of the virB promoter and reveal unusual features of the QS target DNA sequence of the main regulator of virulence in Brucella. PMID:20400542

  10. Whole-genome sequencing of an aggressive BRAF wild-type papillary thyroid cancer identified EML4-ALK translocation as a therapeutic target.

    PubMed

    Demeure, Michael J; Aziz, Meraj; Rosenberg, Richard; Gurley, Steven D; Bussey, Kimberly J; Carpten, John D

    2014-06-01

    Recent advances in the treatment of cancer have focused on targeting genomic aberrations with selective therapeutic agents. In radioiodine resistant aggressive papillary thyroid cancers, there remain few effective therapeutic options. A 62-year-old man who underwent multiple operations for papillary thyroid cancer and whose metastases progressed despite standard treatments provided tumor tissue. We analyzed tumor and whole blood DNA by whole genome sequencing, achieving 80× or greater coverage over 94 % of the exome and 90 % of the genome. We determined somatic mutations and structural alterations. We found a total of 57 somatic mutations in 55 genes of the cancer genome. There was notably a lack of mutations in NRAS and BRAF, and no RET/PTC rearrangement. There was a mutation in the TRAPP oncogene and a loss of heterozygosity of the p16, p18, and RB1 tumor suppressor genes. The oncogenic driver for this tumor is a translocation involving the genes for anaplastic lymphoma receptor tyrosine kinase (ALK) and echinoderm microtubule associated protein like 4 (EML4). The EML4-ALK translocation has been reported in approximately 5 % of lung cancers, as well as in pediatric neuroblastoma, and is a therapeutic target for crizotinib. This is the first report of the whole genomic sequencing of a papillary thyroid cancer in which we identified an EML4-ALK translocation of a TRAPP oncogene mutation. These findings suggest that this tumor has a more distinct oncogenesis than BRAF mutant papillary thyroid cancer. Whole genome sequencing can elucidate an oncogenic context and expose potential therapeutic vulnerabilities in rare cancers.

  11. Single-cell RNA-sequencing reveals a distinct population of proglucagon-expressing cells specific to the mouse upper small intestine.

    PubMed

    Glass, Leslie L; Calero-Nieto, Fernando J; Jawaid, Wajid; Larraufie, Pierre; Kay, Richard G; Göttgens, Berthold; Reimann, Frank; Gribble, Fiona M

    2017-10-01

    To identify sub-populations of intestinal preproglucagon-expressing (PPG) cells producing Glucagon-like Peptide-1, and their associated expression profiles of sensory receptors, thereby enabling the discovery of therapeutic strategies that target these cell populations for the treatment of diabetes and obesity. We performed single cell RNA sequencing of PPG-cells purified by flow cytometry from the upper small intestine of 3 GLU-Venus mice. Cells from 2 mice were sequenced at low depth, and from the third mouse at high depth. High quality sequencing data from 234 PPG-cells were used to identify clusters by tSNE analysis. qPCR was performed to compare the longitudinal and crypt/villus locations of cluster-specific genes. Immunofluorescence and mass spectrometry were used to confirm protein expression. PPG-cells formed 3 major clusters: a group with typical characteristics of classical L-cells, including high expression of Gcg and Pyy (comprising 51% of all PPG-cells); a cell type overlapping with Gip-expressing K-cells (14%); and a unique cluster expressing Tph1 and Pzp that was predominantly located in proximal small intestine villi and co-produced 5-HT (35%). Expression of G-protein coupled receptors differed between clusters, suggesting the cell types are differentially regulated and would be differentially targetable. Our findings support the emerging concept that many enteroendocrine cell populations are highly overlapping, with individual cells producing a range of peptides previously assigned to distinct cell types. Different receptor expression profiles across the clusters highlight potential drug targets to increase gut hormone secretion for the treatment of diabetes and obesity. Copyright © 2017 The Authors. Published by Elsevier GmbH. All rights reserved.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andersen, G.L.; He, Z.; DeSantis, T.Z.

    Microarrays have proven to be a useful and high-throughput method to provide targeted DNA sequence information for up to many thousands of specific genetic regions in a single test. A microarray consists of multiple DNA oligonucleotide probes that, under high stringency conditions, hybridize only to specific complementary nucleic acid sequences (targets). A fluorescent signal indicates the presence and, in many cases, the abundance of genetic regions of interest. In this chapter we will look at how microarrays are used in microbial ecology, especially with the recent increase in microbial community DNA sequence data. Of particular interest to microbial ecologists, phylogenetic microarrays are used for the analysis of phylotypes in a community and functional gene arrays are used for the analysis of functional genes, and, by inference, phylotypes in environmental samples. A phylogenetic microarray that has been developed by the Andersen laboratory, the PhyloChip, will be discussed as an example of a microarray that targets the known diversity within the 16S rRNA gene to determine microbial community composition. Using multiple, confirmatory probes to increase the confidence of detection and a mismatch probe for every perfect match probe to minimize the effect of cross-hybridization by non-target regions, the PhyloChip is able to simultaneously identify any of thousands of taxa present in an environmental sample. The PhyloChip is shown to reveal greater diversity within a community than rRNA gene sequencing due to the placement of the entire gene product on the microarray compared with the analysis of up to thousands of individual molecules by traditional sequencing methods. A functional gene array that has been developed by the Zhou laboratory, the GeoChip, will be discussed as an example of a microarray that dynamically identifies functional activities of multiple members within a community. The recent version of GeoChip contains more than 24,000 50mer oligonucleotide probes and covers more than 10,000 gene sequences in 150 gene categories involved in carbon, nitrogen, sulfur, and phosphorus cycling, metal resistance and reduction, and organic contaminant degradation. GeoChip can be used as a generic tool for microbial community analysis, and also link microbial community structure to ecosystem functioning. Examples of the application of both arrays in different environmental samples will be described in the two subsequent sections.

  13. Sensitive and Specific Target Sequences Selected from Retrotransposons of Schistosoma japonicum for the Diagnosis of Schistosomiasis

    PubMed Central

    Xu, Jing; Zhu, Xing-Quan; Wang, Sheng-Yue; Xia, Chao-Ming

    2012-01-01

    Background: Schistosomiasis japonica is a serious debilitating and sometimes fatal disease. Accurate diagnostic tests play a key role in patient management and control of the disease. However, currently available diagnostic methods are not ideal, and the detection of the parasite DNA in blood samples has turned out to be one of the most promising tools for the diagnosis of schistosomiasis. In our previous investigations, a 230-bp sequence from the highly repetitive retrotransposon SjR2 was identified and it showed high sensitivity and specificity for detecting Schistosoma japonicum DNA in the sera of rabbit model and patients. Recently, 29 retrotransposons were found in S. japonicum genome by our group. The present study highlighted the key factors for selecting a new perspective sensitive target DNA sequence for the diagnosis of schistosomiasis, which can serve as example for other parasitic pathogens. Methodology/Principal Findings: In this study, we demonstrated that the key factors based on the bioinformatic analysis for selecting target sequence are the higher genome proportion, repetitive complete copies and partial copies, and active ESTs than the others in the chromosome genome. New primers based on 25 novel retrotransposons and SjR2 were designed and their sensitivity and specificity for detecting S. japonicum DNA were compared. The results showed that a new 303-bp sequence from non-long terminal repeat (LTR) retrotransposon (SjCHGCS19) had high sensitivity and specificity. The 303-bp target sequence was amplified from the sera of rabbit model at 3 d post-infection by nested-PCR and it became negative at 17 weeks post-treatment. Furthermore, the percentage sensitivity of the nested-PCR was 97.67% in 43 serum samples of S. japonicum-infected patients. Conclusions/Significance: Our findings highlighted the key factors based on the bioinformatic analysis for selecting target sequence from S. japonicum genome, which provide basis for establishing powerful molecular diagnostic techniques that can be used for monitoring early infection and therapy efficacy to support schistosomiasis control programs. PMID:22479661

  14. Identification of functional features of synthetic SINEUPs, antisense lncRNAs that specifically enhance protein translation

    PubMed Central

    Kozhuharova, Ana; Sharma, Harshita; Ohyama, Takako; Fasolo, Francesca; Yamazaki, Toshio; Cotella, Diego; Santoro, Claudio; Zucchelli, Silvia; Gustincich, Stefano; Carninci, Piero

    2018-01-01

    SINEUPs are antisense long noncoding RNAs, in which an embedded SINE B2 element UP-regulates translation of partially overlapping target sense mRNAs. SINEUPs contain two functional domains. First, the binding domain (BD) is located in the region antisense to the target, providing specific targeting to the overlapping mRNA. Second, the inverted SINE B2 represents the effector domain (ED) and enhances translation. To adapt SINEUP technology to a broader number of targets, we took advantage of a high-throughput, semi-automated imaging system to optimize synthetic SINEUP BD and ED design in HEK293T cell lines. Using SINEUP-GFP as a model SINEUP, we extensively screened variants of the BD to map features needed for optimal design. We found that most active SINEUPs overlap an AUG-Kozak sequence. Moreover, we report our screening of the inverted SINE B2 sequence to identify active sub-domains and map the length of the minimal active ED. Our synthetic SINEUP-GFP screening of both BDs and EDs constitutes a broad test with flexible applications to any target gene of interest. PMID:29414979

  15. Programmable RNA Cleavage and Recognition by a Natural CRISPR-Cas9 System from Neisseria meningitidis.

    PubMed

    Rousseau, Beth A; Hou, Zhonggang; Gramelspacher, Max J; Zhang, Yan

    2018-03-01

    The microbial CRISPR systems enable adaptive defense against mobile elements and also provide formidable tools for genome engineering. The Cas9 proteins are type II CRISPR-associated, RNA-guided DNA endonucleases that identify double-stranded DNA targets by sequence complementarity and protospacer adjacent motif (PAM) recognition. Here we report that the type II-C CRISPR-Cas9 from Neisseria meningitidis (Nme) is capable of programmable, RNA-guided, site-specific cleavage and recognition of single-stranded RNA targets and that this ribonuclease activity is independent of the PAM sequence. We define the mechanistic feature and specificity constraint for RNA cleavage by NmeCas9 and also show that nuclease null dNmeCas9 binds to RNA target complementary to CRISPR RNA. Finally, we demonstrate that NmeCas9-catalyzed RNA cleavage can be blocked by three families of type II-C anti-CRISPR proteins. These results fundamentally expand the targeting capacities of CRISPR-Cas9 and highlight the potential utility of NmeCas9 as a single platform to target both RNA and DNA. Copyright © 2018 Elsevier Inc. All rights reserved.

  16. The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers

    PubMed Central

    Sahl, Jason W.; Vazquez, Adam J.; Hall, Carina M.; Busch, Joseph D.; Tuanyok, Apichai; Mayo, Mark; Schupp, James M.; Lummis, Madeline; Pearson, Talima; Shippy, Kenzie; Allender, Christopher J.; Theobald, Vanessa; Hutcheson, Alex; Korlach, Jonas; LiPuma, John J.; Ladner, Jason; Lovett, Sean; Koroleva, Galina; Palacios, Gustavo; Limmathurotsakul, Direk; Wuthiekanun, Vanaporn; Wongsuwan, Gumphol; Currie, Bart J.

    2016-01-01

    Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. PMID:27651357

  17. Strain/species identification in metagenomes using genome-specific markers

    PubMed Central

    Tu, Qichao; He, Zhili; Zhou, Jizhong

    2014-01-01

    Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing. PMID:24523352
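
The idea of genome-specific k-mer markers described in this abstract can be sketched on toy data. The example below is not GSMer itself: it assumes tiny made-up sequences, uses k=4 instead of 50, ignores reverse complements, and skips the additional filtering a real tool would need. All names and sequences are hypothetical.

```python
# Toy sketch: k-mers that occur in exactly one genome act as its markers.
# Assumptions: forward strand only, exact matching, made-up sequences.
from collections import defaultdict


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genome_specific_markers(genomes: dict, k: int) -> dict:
    """Map each genome name to the k-mers found in that genome only."""
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)
    markers = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:                  # unique to a single genome
            (only,) = names
            markers[only].add(kmer)
    return markers


def detected_fraction(sample_kmers: set, marker_set: set) -> float:
    """Fraction of a genome's markers that are present in a sample."""
    if not marker_set:
        return 0.0
    return len(sample_kmers & marker_set) / len(marker_set)


toy_genomes = {"strainA": "ACGTACGGA", "strainB": "ACGTACCTT"}
toy_markers = genome_specific_markers(toy_genomes, k=4)
toy_fraction = detected_fraction(kmers("TACGG", 4), toy_markers["strainA"])
```

A detection rule in the spirit of the abstract would call a strain present when `detected_fraction` exceeds some threshold (the abstract reports 10% of selected markers as a guideline for confident positive calls); the threshold is left to the caller here.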

  18. Conserved sequences in the current strains of HIV-1 subtype A in Russia are effectively targeted by artificial RNAi in vitro.

    PubMed

    Tchurikov, Nickolai A; Fedoseeva, Daria M; Gashnikova, Natalya M; Sosin, Dmitri V; Gorbacheva, Maria A; Alembekov, Ildar R; Chechetkin, Vladimir R; Kravatsky, Yuri V; Kretova, Olga V

    2016-05-25

    Highly active antiretroviral therapy has greatly reduced the morbidity and mortality of AIDS. However, many of the antiretroviral drugs are toxic with long-term use, and all currently used anti-HIV agents generate drug-resistant mutants. Therefore, there is a great need for new approaches to AIDS therapy. RNAi is a powerful means of inhibiting HIV-1 production in human cells. We propose to use RNAi for gene therapy of HIV/AIDS. Previously we identified a number of new biologically active siRNAs targeting several moderately conserved regions in HIV-1 transcripts. Here we analyze the heterogeneity of nucleotide sequences in three RNAi targets in sequences encoding the reverse transcriptase and integrase domains of current isolates of HIV-1 subtype A in Russia. These data were used to generate genetic constructs expressing short hairpin RNAs 28-30-bp in length that could be processed in cells into siRNAs. After transfection of the constructs we observed siRNAs that efficiently attacked the selected targets. We expect that targeting several viral genes important for HIV-1 reproduction will help overcome the problem of viral adaptation and will prevent the appearance of RNAi escape mutants in current virus strains, an important feature of gene therapy of HIV/AIDS. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. CREBBP mutations in relapsed acute lymphoblastic leukaemia

    PubMed Central

    Mullighan, Charles G.; Zhang, Jinghui; Kasper, Lawryn H.; Lerach, Stephanie; Payne-Turner, Debbie; Phillips, Letha A.; Heatley, Sue L.; Holmfeldt, Linda; Collins-Underwood, J. Racquel; Ma, Jing; Buetow, Kenneth H.; Pui, Ching-Hon; Baker, Sharyn D.; Brindle, Paul K.; Downing, James R.

    2010-01-01

    Relapsed acute lymphoblastic leukaemia (ALL) is a leading cause of death due to disease in young people, but the biologic determinants of treatment failure remain poorly understood. Recent genome-wide profiling of structural DNA alterations in ALL has identified multiple submicroscopic somatic mutations targeting key cellular pathways [1,2], and has demonstrated substantial evolution in genetic alterations from diagnosis to relapse [3]. However, detailed analysis of sequence mutations in ALL has not been performed. To identify novel mutations in relapsed ALL, we resequenced 300 genes in matched diagnosis and relapse samples from 23 patients with ALL. This identified 52 somatic non-synonymous mutations in 32 genes, many of which were novel, including the transcriptional coactivators CREBBP and NCOR1, the transcription factors ERG, SPI1, TCF4 and TCF7L2, components of the Ras signalling pathway, histone genes, genes involved in histone modification (CREBBP and CTCF), and genes previously shown to be targets of recurring DNA copy number alteration in ALL. Analysis of an extended cohort of 71 diagnosis-relapse cases and 270 acute leukaemia cases that did not relapse found that 18.3% of relapse cases had sequence or deletion mutations of CREBBP, which encodes the transcriptional coactivator and histone acetyltransferase (HAT) CREB-binding protein (CBP) [4]. The mutations were either present at diagnosis or acquired at relapse, and resulted in truncated alleles or deleterious substitutions in conserved residues of the HAT domain. Functionally, the mutations impaired histone acetylation and transcriptional regulation of CREBBP targets, including glucocorticoid responsive genes. Several mutations acquired at relapse were detected in subclones at diagnosis, suggesting that the mutations may confer resistance to therapy. These results extend the landscape of genetic alterations in leukaemia, and identify mutations targeting transcriptional and epigenetic regulation as a mechanism of resistance in ALL. PMID:21390130

  20. Computational exploration of microRNAs from expressed sequence tags of Humulus lupulus, target predictions and expression analysis.

    PubMed

    Mishra, Ajay Kumar; Duraisamy, Ganesh Selvaraj; Týcová, Anna; Matoušek, Jaroslav

    2015-12-01

    Among computationally predicted and experimentally validated plant miRNAs, several are conserved across species boundaries in the plant kingdom. In this study, a combined experimental and in silico approach was adopted for the identification and characterization of miRNAs in Humulus lupulus (hop), which is widely cultivated for use by the brewing industry and is also used as a medicinal herb. A total of 22 miRNAs belonging to 17 miRNA families were identified in hop using a comparative computational approach and an EST-based homology search according to a series of filtering criteria. Selected miRNAs were validated by end-point PCR and quantitative reverse transcription-polymerase chain reaction (qRT-PCR), which confirmed the existence of conserved miRNAs in hop. Based on the characteristic that miRNAs exhibit perfect or nearly perfect complementarity with their targeted mRNA sequences, a total of 47 potential miRNA targets were identified in hop. Strikingly, the majority of predicted targets were transcription factors that could regulate hop growth and development, including leaf, root and even cone development. Moreover, the identified miRNAs may also be involved in other cellular and metabolic processes, such as stress response, signal transduction, and other physiological processes. Cis-regulatory elements relevant to biotic and abiotic stress, plant hormone response and flavonoid biosynthesis were identified in the promoter regions of those miRNA genes. Overall, the findings from this study will pave the way for further research on miRNAs and their functions in hop, and show a path for the prediction and analysis of miRNAs in species whose genomes are not available. Copyright © 2015 Elsevier Ltd. All rights reserved.

  1. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    PubMed

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  2. An integrated structure- and system-based framework to identify new targets of metabolites and known drugs

    PubMed Central

    Naveed, Hammad; Hameed, Umar S.; Harrus, Deborah; Bourguet, William; Arold, Stefan T.; Gao, Xin

    2015-01-01

    Motivation: The inherent promiscuity of small molecules towards protein targets impedes our understanding of healthy versus diseased metabolism. This promiscuity also poses a challenge for the pharmaceutical industry as identifying all protein targets is important to assess (side) effects and repositioning opportunities for a drug. Results: Here, we present a novel integrated structure- and system-based approach of drug-target prediction (iDTP) to enable the large-scale discovery of new targets for small molecules, such as pharmaceutical drugs, co-factors and metabolites (collectively called ‘drugs’). For a given drug, our method uses sequence order–independent structure alignment, hierarchical clustering and probabilistic sequence similarity to construct a probabilistic pocket ensemble (PPE) that captures promiscuous structural features of different binding sites on known targets. A drug’s PPE is combined with an approximation of its delivery profile to reduce false positives. In our cross-validation study, we use iDTP to predict the known targets of 11 drugs, with 63% sensitivity and 81% specificity. We then predicted novel targets for these drugs—two that are of high pharmacological interest, the peroxisome proliferator-activated receptor gamma and the oncogene B-cell lymphoma 2, were successfully validated through in vitro binding experiments. Our method is broadly applicable for the prediction of protein-small molecule interactions with several novel applications to biological research and drug development. Availability and implementation: The program, datasets and results are freely available to academic users at http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa and stefan.arold@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26286808

  3. Mapping of transcription factor binding regions in mammalian cells by ChIP: Comparison of array- and sequencing-based technologies

    PubMed Central

    Euskirchen, Ghia M.; Rozowsky, Joel S.; Wei, Chia-Lin; Lee, Wah Heng; Zhang, Zhengdong D.; Hartman, Stephen; Emanuelsson, Olof; Stolc, Viktor; Weissman, Sherman; Gerstein, Mark B.; Ruan, Yijun; Snyder, Michael

    2007-01-01

    Recent progress in mapping transcription factor (TF) binding regions can largely be credited to chromatin immunoprecipitation (ChIP) technologies. We compared strategies for mapping TF binding regions in mammalian cells using two different ChIP schemes: ChIP with DNA microarray analysis (ChIP-chip) and ChIP with DNA sequencing (ChIP-PET). We first investigated parameters central to obtaining robust ChIP-chip data sets by analyzing STAT1 targets in the ENCODE regions of the human genome, and then compared ChIP-chip to ChIP-PET. We devised methods for scoring and comparing results among various tiling arrays and examined parameters such as DNA microarray format, oligonucleotide length, hybridization conditions, and the use of competitor Cot-1 DNA. The best performance was achieved with high-density oligonucleotide arrays, oligonucleotides ≥50 bases (b), the presence of competitor Cot-1 DNA and hybridizations conducted in microfluidics stations. When target identification was evaluated as a function of array number, 80%–86% of targets were identified with three or more arrays. Comparison of ChIP-chip with ChIP-PET revealed strong agreement for the highest ranked targets with less overlap for the low ranked targets. With advantages and disadvantages unique to each approach, we found that ChIP-chip and ChIP-PET are frequently complementary in their relative abilities to detect STAT1 targets for the lower ranked targets; each method detected validated targets that were missed by the other method. The most comprehensive list of STAT1 binding regions is obtained by merging results from ChIP-chip and ChIP-sequencing. Overall, this study provides information for robust identification, scoring, and validation of TF targets using ChIP-based technologies. PMID:17568005

  4. Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

    PubMed

    Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

    2016-12-01

    High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  5. Small RNA-mediated responses to low- and high-temperature stresses in cotton

    PubMed Central

    Wang, Qiongshan; Liu, Nian; Yang, Xiyan; Tu, Lili; Zhang, Xianlong

    2016-01-01

    MicroRNAs (miRNAs) are one class of endogenous non-coding RNAs modulating the expression of target genes involved in plant development and stress tolerance, by degrading mRNA or repressing translation. In this study, small RNA and mRNA degradome sequencing were used to identify low- and high-temperature stress-responsive miRNAs and their targets in cotton (Gossypium hirsutum). Cotton seedlings were treated under different temperature conditions (4, 12, 25, 35, and 42 °C) and then the effects were investigated. In total, 319 known miRNAs and 800 novel miRNAs were identified, and 168 miRNAs were differentially expressed between different treatments. The targets of these miRNAs were further analysed by degradome sequencing. Based on studies from Gene Ontology and Kyoto Encyclopedia of Genes and Genomes, the majority of the miRNAs are from genes that are likely involved in response to hormone stimulus, oxidation-reduction reaction, photosynthesis, plant–pathogen interaction and plant hormone signal transduction pathways. This study provides new insight into the molecular mechanisms of plant response to extreme temperature stresses, and especially the roles of miRNAs under extreme temperatures. PMID:27752116

  6. Advances in sarcoma diagnostics and treatment

    PubMed Central

    Dancsok, Amanda R; Asleh-Aburaya, Karama; Nielsen, Torsten O

    2017-01-01

    The heterogeneity of sarcomas with regard to molecular genesis, histology, clinical characteristics, and response to treatment makes management of these rare yet diverse neoplasms particularly challenging. This review encompasses recent developments in sarcoma diagnostics and treatment, including cytotoxic, targeted, epigenetic, and immune therapy agents. In the past year, groups internationally explored the impact of adding mandatory molecular testing to histological diagnosis, reporting some changes in diagnosis and/or management; however, the impact on outcomes could not be adequately assessed. Transcriptome sequencing techniques have brought forward new diagnostic tools for identifying fusions and/or characterizing unclassified entities. Next-generation sequencing and advanced molecular techniques were also applied to identify potential targets for directed and epigenetic therapy, where preclinical studies reported results for agents active within the receptor tyrosine kinase, mTOR, Notch, Wnt, Hedgehog, Hsp90, and MDM2 signaling networks. At the level of clinical practice, modest developments were seen for some sarcoma subtypes in conventional chemotherapy and in therapies targeting the pathways activated by various receptor tyrosine kinases. In the burgeoning field of immune therapy, sarcoma work is in its infancy; however, elaborate protocols for immune stimulation are being explored, and checkpoint blockade agents advance from preclinical models to clinical studies. PMID:27732970

  7. Endogenous System Microbes as Treatment Process ...

    EPA Pesticide Factsheets

    Monitoring the efficacy of treatment strategies to remove pathogens in decentralized systems remains a challenge. Evaluating log reduction targets by measuring pathogen levels is hampered by their sporadic and low occurrence rates. Fecal indicator bacteria are used in centralized systems to indicate the presence of fecal pathogens, but are ineffective decentralized treatment process indicators as they generally occur at levels too low to assess log reduction targets. System challenge testing by spiking with high loads of fecal indicator organisms, like MS2 coliphage, has limitations, especially for large systems. Microbes that are endogenous to the decentralized system, occur in high abundances and mimic removal rates of bacterial, viral and/or parasitic protozoan pathogens during treatment could serve as alternative treatment process indicators to verify log reduction targets. To identify abundant microbes in wastewater, the bacterial and viral communities were examined using deep sequencing. Building infrastructure-associated bacteria, like Zoogloea, were observed as dominant members of the bacterial community in graywater. In blackwater, bacteriophage of the order Caudovirales constituted the majority of contiguous sequences from the viral community. This study identifies candidate treatment process indicators in decentralized systems that could be used to verify log removal during treatment. The association of the presence of treatment process indic

  8. INCENP Centromere and Spindle Targeting: Identification of Essential Conserved Motifs and Involvement of Heterochromatin Protein HP1

    PubMed Central

    Ainsztein, Alexandra M.; Kandels-Lewis, Stefanie E.; Mackay, Alastair M.; Earnshaw, William C.

    1998-01-01

    The inner centromere protein (INCENP) has a modular organization, with domains required for chromosomal and cytoskeletal functions concentrated near the amino and carboxyl termini, respectively. In this study we have identified an autonomous centromere- and midbody-targeting module in the amino-terminal 68 amino acids of INCENP. Within this module, we have identified two evolutionarily conserved amino acid sequence motifs: a 13–amino acid motif that is required for targeting to centromeres and transfer to the spindle, and an 11–amino acid motif that is required for transfer to the spindle by molecules that have targeted previously to the centromere. To begin to understand the mechanisms of INCENP function in mitosis, we have performed a yeast two-hybrid screen for interacting proteins. These and subsequent in vitro binding experiments identify a physical interaction between INCENP and heterochromatin protein HP1Hsα. Surprisingly, this interaction does not appear to be involved in targeting INCENP to the centromeric heterochromatin, but may instead have a role in its transfer from the chromosomes to the anaphase spindle. PMID:9864353

  9. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways.

    PubMed

    Cirulli, Elizabeth T; Lasseigne, Brittany N; Petrovski, Slavé; Sapp, Peter C; Dion, Patrick A; Leblond, Claire S; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E; Boone, Braden E; Wimbish, Jack R; Waite, Lindsay L; Jones, Angela L; Carulli, John P; Day-Williams, Aaron G; Staropoli, John F; Xin, Winnie W; Chesi, Alessandra; Raphael, Alya R; McKenna-Yasek, Diane; Cady, Janet; Vianney de Jong, J M B; Kenna, Kevin P; Smith, Bradley N; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E; Baloh, Robert H; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M; Gibson, Summer; Trojanowski, John Q; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A; Chung, Wendy K; Ravits, John M; Glass, Jonathan D; Sims, Katherine B; Van Deerlin, Vivianna M; Maniatis, Tom; Hayes, Sebastian D; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S; Bedlack, Richard S; Harper, J Wade; Gitler, Aaron D; Rouleau, Guy A; Brown, Robert; Harms, Matthew B; Cooper, Gregory M; Harris, Tim; Myers, Richard M; Goldstein, David B

    2015-03-27

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS patients and 6405 controls. Several known ALS genes were found to be associated, and TBK1 (the gene encoding TANK-binding kinase 1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. Copyright © 2015, American Association for the Advancement of Science.

  10. Potential functions of microRNAs in starch metabolism and development revealed by miRNA transcriptome profiling of cassava cultivars and their wild progenitor.

    PubMed

    Chen, Xin; Xia, Jing; Xia, Zhiqiang; Zhang, Hefang; Zeng, Changying; Lu, Cheng; Zhang, Weixiong; Wang, Wenquan

    2015-02-04

    MicroRNAs (miRNAs) are small (approximately 21 nucleotide) non-coding RNAs that are key post-transcriptional gene regulators in eukaryotic organisms. More than 100 cassava miRNAs have been identified in a conservation analysis and a repertoire of cassava miRNAs has also been characterised by next-generation sequencing (NGS) in recent studies. Here, using NGS, we profiled small non-coding RNAs and mRNA genes in two cassava cultivars and their wild progenitor to identify and characterise miRNAs that are potentially involved in plant growth and starch biosynthesis. Six small RNA and six mRNA libraries from leaves and roots of the two cultivars, KU50 and Arg7, and their wild progenitor, W14, were subjected to NGS. Analysis of the sequencing data revealed 29 conserved miRNA families and 33 new miRNA families. Together, these miRNAs potentially targeted a total of 360 putative target genes. Whereas 16 miRNA families were highly expressed in cultivar leaves, another 13 miRNA families were highly expressed in storage roots of cultivars. Co-expression analysis revealed that the expression level of some targets had a negative relationship with their corresponding miRNAs in storage roots and leaves; these targets included MYB33, ARF10, GRF1, RD19, APL2, NF-YA3 and SPL2, which are known to be involved in plant development, starch biosynthesis and response to environmental stimuli. The identified miRNAs, target mRNAs and target gene ontology annotation all shed light on the possible functions of miRNAs in Manihot species. The differential expression of miRNAs between cultivars and their wild progenitor, together with our analysis of GO annotation and confirmation of miRNA: target pairs, might provide insight into the differences between the wild progenitor and cultivated cassava.

  11. miRNAs involved in the development and differentiation of fertile and sterile flowers in Viburnum macrocephalum f. keteleeri.

    PubMed

    Li, Weixing; He, Zhichong; Zhang, Li; Lu, Zhaogeng; Xu, Jing; Cui, Jiawen; Wang, Li; Jin, Biao

    2017-10-13

    Sterile and fertile flowers are important evolutionary developmental phenotypes in angiosperm flowers. The development of floral organs, critical in angiosperm reproduction, is regulated by microRNAs (miRNAs). However, the mechanisms underpinning the miRNA regulation of the differentiation and development of sterile and fertile flowers remain unclear. Here, based on investigations of the morphological differences between fertile and sterile flowers, we used high-throughput sequencing to characterize the miRNAs in the differentiated floral organs of Viburnum macrocephalum f. keteleeri. We identified 49 known miRNAs and 67 novel miRNAs by small RNA (sRNA) sequencing and bioinformatics analysis, and 17 of these known and novel miRNA precursors were validated by polymerase chain reaction (PCR) and Sanger sequencing. Furthermore, by comparing the sequencing results of two sRNA libraries, we found that 30 known and 39 novel miRNA sequences were differentially expressed, and 35 were upregulated and 34 downregulated in sterile compared with fertile flowers. Combined with their predicted targets, the potential roles of miRNAs in V. macrocephalum f. keteleeri flowers include involvement in floral organogenesis, cell proliferation, hormonal pathways, and stress responses. miRNA precursors and targets were further validated by quantitative real-time PCR (qRT-PCR). Specifically, miR156a-5p, miR156g, and miR156j expression levels were significantly higher in fertile flowers than in sterile flowers, while SPL genes displayed the opposite expression pattern. Considering that the targets of miR156 are predicted to be SPL genes, we propose that miR156 may be involved in the regulation of stamen development in V. macrocephalum f. keteleeri. We identified miRNAs differentially expressed between fertile and sterile flowers in V. macrocephalum f. keteleeri and provided new insights into the important regulatory roles of miRNAs in the differentiation and development of fertile and sterile flowers.

  12. Passive IFF: Autonomous Nonintrusive Rapid Identification of Friendly Assets

    NASA Technical Reports Server (NTRS)

    Moynihan, Philip; Steenburg, Robert Van; Chao, Tien-Hsin

    2004-01-01

    A proposed optoelectronic instrument would identify targets rapidly, without need to radiate an interrogating signal, apply identifying marks to the targets, or equip the targets with transponders. The instrument was conceived as an identification, friend or foe (IFF) system in a battlefield setting, where it would be part of a targeting system for weapons, by providing rapid identification for aimed weapons to help in deciding whether and when to trigger them. The instrument could also be adapted to law-enforcement and industrial applications in which it is necessary to rapidly identify objects in view. The instrument would comprise mainly an optical correlator and a neural processor (see figure). The inherent parallel-processing speed and capability of the optical correlator would be exploited to obtain rapid identification of a set of probable targets within a scene of interest and to define regions within the scene for the neural processor to analyze. The neural processor would then concentrate on each region selected by the optical correlator in an effort to identify the target. Depending on whether or not a target was recognized by comparison of its image data with data in an internal database on which the neural processor was trained, the processor would generate an identifying signal (typically, friend or foe). The time taken for this identification process would be less than the time needed by a human or robotic gunner to acquire a view of, and aim at, a target. An optical correlator that has been under development for several years and that has been demonstrated to be capable of tracking a cruise missile might be considered a prototype of the optical correlator in the proposed IFF instrument. This optical correlator features a 512-by-512-pixel input image frame and operates at an input frame rate of 60 Hz. It includes a spatial light modulator (SLM) for video-to-optical image conversion, a pair of precise lenses to effect Fourier transforms, a filter SLM for digital-to-optical correlation-filter data conversion, and a charge-coupled device (CCD) for detection of correlation peaks. In operation, the input scene grabbed by a video sensor is streamed into the input SLM. Precomputed correlation-filter data files representative of known targets are then downloaded and sequenced into the filter SLM at a rate of 1,000 Hz. When there occurs a match between the input target data and one of the known-target data files, the CCD detects a correlation peak at the location of the target. Distortion-invariant correlation filters from a bank of such filters are then sequenced through the optical correlator for each input frame. The net result is the rapid preliminary recognition of one or a few targets.

  13. Real-time observation of DNA target interrogation and product release by the RNA-guided endonuclease CRISPR Cpf1 (Cas12a).

    PubMed

    Singh, Digvijay; Mallon, John; Poddar, Anustup; Wang, Yanbo; Tippana, Ramreddy; Yang, Olivia; Bailey, Scott; Ha, Taekjip

    2018-05-22

    CRISPR-Cas9, which imparts adaptive immunity against foreign genomic invaders in certain prokaryotes, has been repurposed for genome-engineering applications. More recently, another RNA-guided CRISPR endonuclease called Cpf1 (also known as Cas12a) was identified and is also being repurposed. Little is known about the kinetics and mechanism of Cpf1 DNA interaction and how sequence mismatches between the DNA target and guide-RNA influence this interaction. We used single-molecule fluorescence analysis and biochemical assays to characterize DNA interrogation, cleavage, and product release by three Cpf1 orthologs. Our Cpf1 data are consistent with the DNA interrogation mechanism proposed for Cas9. They both bind any DNA in search of protospacer-adjacent motif (PAM) sequences, verify the target sequence directionally from the PAM-proximal end, and rapidly reject any targets that lack a PAM or that are poorly matched with the guide-RNA. Unlike Cas9, which requires 9 bp for stable binding and ∼16 bp for cleavage, Cpf1 requires an ∼17-bp sequence match for both stable binding and cleavage. Unlike Cas9, which does not release the DNA cleavage products, Cpf1 rapidly releases the PAM-distal cleavage product, but not the PAM-proximal product. Solution pH, reducing conditions, and 5' guanine in guide-RNA differentially affected different Cpf1 orthologs. Our findings have important implications on Cpf1-based genome engineering and manipulation applications.

  14. Many Routes to an Antibody Heavy-Chain CDR3: Necessary, Yet Insufficient, for Specific Binding

    DOE PAGES

    D'Angelo, Sara; Ferrara, Fortunato; Naranjo, Leslie; ...

    2018-03-08

    Because of its great potential for diversity, the immunoglobulin heavy-chain complementarity-determining region 3 (HCDR3) is taken as an antibody molecule’s most important component in conferring binding activity and specificity. For this reason, HCDR3s have been used as unique identifiers to investigate adaptive immune responses in vivo and to characterize in vitro selection outputs where display systems were employed. Here, we show that many different HCDR3s can be identified within a target-specific antibody population after in vitro selection. For each identified HCDR3, a number of different antibodies bearing differences elsewhere can be found. In such selected populations, all antibodies with the same HCDR3 recognize the target, albeit at different affinities. In contrast, within unselected populations, the majority of antibodies with the same HCDR3 sequence do not bind the target. In one HCDR3 examined in depth, all target-specific antibodies were derived from the same VDJ rearrangement, while non-binding antibodies with the same HCDR3 were derived from many different V and D gene rearrangements. Careful examination of previously published in vivo datasets reveals that HCDR3s shared between, and within, different individuals can also originate from rearrangements of different V and D genes, with up to 26 different rearrangements yielding the same identical HCDR3 sequence. On the basis of these observations, we conclude that the same HCDR3 can be generated by many different rearrangements, but that specific target binding is an outcome of unique rearrangements and VL pairing: the HCDR3 is necessary, albeit insufficient, for specific antibody binding.

  15. Many Routes to an Antibody Heavy-Chain CDR3: Necessary, Yet Insufficient, for Specific Binding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D'Angelo, Sara; Ferrara, Fortunato; Naranjo, Leslie

    Because of its great potential for diversity, the immunoglobulin heavy-chain complementarity-determining region 3 (HCDR3) is taken as an antibody molecule’s most important component in conferring binding activity and specificity. For this reason, HCDR3s have been used as unique identifiers to investigate adaptive immune responses in vivo and to characterize in vitro selection outputs where display systems were employed. Here, we show that many different HCDR3s can be identified within a target-specific antibody population after in vitro selection. For each identified HCDR3, a number of different antibodies bearing differences elsewhere can be found. In such selected populations, all antibodies with the same HCDR3 recognize the target, albeit at different affinities. In contrast, within unselected populations, the majority of antibodies with the same HCDR3 sequence do not bind the target. In one HCDR3 examined in depth, all target-specific antibodies were derived from the same VDJ rearrangement, while non-binding antibodies with the same HCDR3 were derived from many different V and D gene rearrangements. Careful examination of previously published in vivo datasets reveals that HCDR3s shared between, and within, different individuals can also originate from rearrangements of different V and D genes, with up to 26 different rearrangements yielding the same identical HCDR3 sequence. On the basis of these observations, we conclude that the same HCDR3 can be generated by many different rearrangements, but that specific target binding is an outcome of unique rearrangements and VL pairing: the HCDR3 is necessary, albeit insufficient, for specific antibody binding.

  16. Chemokine Receptor Signatures in Allogeneic Stem Cell Transplantation

    DTIC Science & Technology

    2014-08-01

    versus-host disease (GVHD). We use T-cell receptor deep sequencing to characterize the repertoire of effector T-cells in allogeneic hematopoietic stem ... cell transplant (HSCT) recipients and identify the role of chemokine receptors in effector cell infiltration of target organs. In the recent funding

  17. Assessment of Equine Fecal Contamination: The Search for Alternative Bacterial Source-tracking Targets

    EPA Science Inventory

    16S rDNA clone libraries were evaluated for detection of fecal source-identifying bacteria from a collapsed equine manure pile. Libraries were constructed using universal eubacterial primers and Bacteroides-Prevotella group-specific primers. Eubacterial sequences indicat...

  18. Defining RNA-Small Molecule Affinity Landscapes Enables Design of a Small Molecule Inhibitor of an Oncogenic Noncoding RNA.

    PubMed

    Velagapudi, Sai Pradeep; Luo, Yiling; Tran, Tuan; Haniff, Hafeez S; Nakai, Yoshio; Fallahi, Mohammad; Martinez, Gustavo J; Childs-Disney, Jessica L; Disney, Matthew D

    2017-03-22

    RNA drug targets are pervasive in cells, but methods to design small molecules that target them are sparse. Herein, we report a general approach to score the affinity and selectivity of RNA motif-small molecule interactions identified via selection. Named High Throughput Structure-Activity Relationships Through Sequencing (HiT-StARTS), HiT-StARTS is statistical in nature and compares input nucleic acid sequences to selected library members that bind a ligand via high throughput sequencing. The approach allowed facile definition of the fitness landscape of hundreds of thousands of RNA motif-small molecule binding partners. These results were mined against folded RNAs in the human transcriptome and identified an avid interaction between a small molecule and the Dicer nuclease-processing site in the oncogenic microRNA (miR)-18a hairpin precursor, which is a member of the miR-17-92 cluster. Application of the small molecule, Targapremir-18a, to prostate cancer cells inhibited production of miR-18a from the cluster, de-repressed serine/threonine protein kinase 4 protein (STK4), and triggered apoptosis. Profiling the cellular targets of Targapremir-18a via Chemical Cross-Linking and Isolation by Pull Down (Chem-CLIP), a covalent small molecule-RNA cellular profiling approach, and other studies showed specific binding of the compound to the miR-18a precursor, revealing broadly applicable factors that govern small molecule drugging of noncoding RNAs.

  19. Defining RNA–Small Molecule Affinity Landscapes Enables Design of a Small Molecule Inhibitor of an Oncogenic Noncoding RNA

    PubMed Central

    2017-01-01

    RNA drug targets are pervasive in cells, but methods to design small molecules that target them are sparse. Herein, we report a general approach to score the affinity and selectivity of RNA motif–small molecule interactions identified via selection. Named High Throughput Structure–Activity Relationships Through Sequencing (HiT-StARTS), HiT-StARTS is statistical in nature and compares input nucleic acid sequences to selected library members that bind a ligand via high throughput sequencing. The approach allowed facile definition of the fitness landscape of hundreds of thousands of RNA motif–small molecule binding partners. These results were mined against folded RNAs in the human transcriptome and identified an avid interaction between a small molecule and the Dicer nuclease-processing site in the oncogenic microRNA (miR)-18a hairpin precursor, which is a member of the miR-17-92 cluster. Application of the small molecule, Targapremir-18a, to prostate cancer cells inhibited production of miR-18a from the cluster, de-repressed serine/threonine protein kinase 4 protein (STK4), and triggered apoptosis. Profiling the cellular targets of Targapremir-18a via Chemical Cross-Linking and Isolation by Pull Down (Chem-CLIP), a covalent small molecule–RNA cellular profiling approach, and other studies showed specific binding of the compound to the miR-18a precursor, revealing broadly applicable factors that govern small molecule drugging of noncoding RNAs. PMID:28386598

  20. Targeted next generation sequencing of well-differentiated/dedifferentiated liposarcoma reveals novel gene amplifications and mutations

    PubMed Central

    Somaiah, Neeta; Beird, Hannah C; Barbo, Andrea; Song, Juhee; Mills Shaw, Kenna R.; Wang, Wei-Lien; Eterovic, Karina; Chen, Ken; Lazar, Alexander; Conley, Anthony P.; Ravi, Vinod; Hwu, Patrick; Futreal, Andrew; Simon, George; Meric-Bernstam, Funda; Hong, David

    2018-01-01

    Well-differentiated/dedifferentiated liposarcoma is a common soft tissue sarcoma with approximately 1500 new cases per year. Surgery is the mainstay of treatment but recurrences are frequent and systemic options are limited. ‘Tumor genotyping’ is becoming more common in clinical practice as it offers the hope of personalized targeted therapy. We wanted to evaluate the results and the clinical utility of available next-generation sequencing panels in WD/DD liposarcoma. Patients who had their tumor sequenced by either FoundationOne (n = 13) or the institutional T200/T200.1 panels (n = 7) were included in this study. Significant copy number alterations were identified, but mutations were infrequent. Out of the 27 mutations detected in 7 samples, 8 (CTNNB1, MECOM, ZNF536, EGFR, EML4, CSMD3, PBRM1, PPP1R3A) were identified as deleterious (on Condel, PolyPhen and SIFT) and a truncating mutation was found in NF2. Of these, EGFR and NF2 are potential driver mutations and have not been reported previously in liposarcoma. MDM2 and CDK4 amplification was universally present in all the tested samples and multiple other recurrent genes with high amplification or high deletion were detected. Many of these targets are potentially actionable. Eight patients went on to receive an MDM2 inhibitor with a median time to progression of 23 months (95% CI: 10-83 months). PMID:29731991
