Sample records for targeted deep sequencing

  1. Insights into Deep-Sea Sediment Fungal Communities from the East Indian Ocean Using Targeted Environmental Sequencing Combined with Traditional Cultivation

    PubMed Central

    Zhang, Xiao-yong; Tang, Gui-ling; Xu, Xin-ya; Nong, Xu-hua; Qi, Shu-Hua

    2014-01-01

    The fungal diversity in deep-sea environments has recently gained an increasing amount of attention. Our knowledge and understanding of the true fungal diversity and the role it plays in deep-sea environments, however, are still limited. We investigated the fungal community structure in five sediments from a depth of ∼4000 m in the East Indian Ocean using a combination of targeted environmental sequencing and traditional cultivation. This approach resulted in the recovery of a total of 45 fungal operational taxonomic units (OTUs) and 20 culturable fungal phylotypes. This finding indicates that there is a great amount of fungal diversity in the deep-sea sediments collected in the East Indian Ocean. Three fungal OTUs and one culturable phylotype demonstrated high divergence (89%–97%) from the existing sequences in GenBank. Moreover, 44.4% of the fungal OTUs and 30% of the culturable fungal phylotypes are new reports for deep-sea sediments. These results suggest that the deep-sea sediments from the East Indian Ocean can serve as habitats for new fungal communities compared with other deep-sea environments. In addition, different fungal communities were detected by targeted environmental sequencing and by traditional cultivation in this study, which suggests that combining the two approaches will recover a more diverse fungal community in deep-sea environments than using either alone. This study is the first to report new insights into the fungal communities in deep-sea sediments from the East Indian Ocean, which increases our knowledge and understanding of the fungal diversity in deep-sea environments. PMID:25272044

  2. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    PubMed

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
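    The abstract mentions a statistical threshold for reliable mutation detection that is robust to sequencing errors, but does not spell the criteria out. The following is only a minimal sketch of one common way to set such a threshold, not the VirMut method itself: it assumes a uniform per-base error rate (`error_rate`, a hypothetical parameter) and asks how many variant reads at a given depth would be unlikely to arise from errors alone under a binomial model.

```python
from math import exp, lgamma, log


def binom_log_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, computed in log space."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1))


def min_reliable_count(depth, error_rate, alpha=1e-3):
    """Smallest variant-read count whose probability under errors alone is < alpha.

    Assumption: errors are independent and occur at a single rate per base.
    Returns None if no count up to `depth` satisfies the criterion.
    """
    for k in range(depth + 1):
        if binom_sf(k, depth, error_rate) < alpha:
            return k
    return None


if __name__ == "__main__":
    # Illustrative values only: 1000 reads covering a site, 1% error rate.
    print(min_reliable_count(1000, 0.01))
```

    A real pipeline would typically also correct for testing many sites at once; that step is omitted here.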

  3. Analysis of Variability in HIV-1 Subtype A Strains in Russia Suggests a Combination of Deep Sequencing and Multitarget RNA Interference for Silencing of the Virus.

    PubMed

    Kretova, Olga V; Chechetkin, Vladimir R; Fedoseeva, Daria M; Kravatsky, Yuri V; Sosin, Dmitri V; Alembekov, Ildar R; Gorbacheva, Maria A; Gashnikova, Natalya M; Tchurikov, Nickolai A

    2017-02-01

    Any method for silencing the activity of the HIV-1 retrovirus should tackle the extremely high variability of HIV-1 sequences and mutational escape. We studied sequence variability in the vicinity of selected RNA interference (RNAi) targets from isolates of HIV-1 subtype A in Russia, and we propose that using artificial RNAi is a potential alternative to traditional antiretroviral therapy. We prove that using multiple RNAi targets overcomes the variability in HIV-1 isolates. The optimal number of targets critically depends on the conservation of the target sequences. The total number of targets that are conserved with a probability of 0.7-0.8 should exceed at least 2. Combining deep sequencing and multitarget RNAi may provide an efficient approach to cure HIV/AIDS.

  4. Quantitative phenotyping via deep barcode sequencing.

    PubMed

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.
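    The counting step behind barcode sequencing can be illustrated with a short sketch. It is not the Bar-seq authors' analysis routine: the assumption that the barcode sits at the start of each read, the `barcode_len` parameter, and the pseudocount are all illustrative choices. Reads are matched to known barcodes, counted per strain, and compared between a control and a treatment pool as a log2 ratio of normalized counts.

```python
from collections import Counter
from math import log2


def count_barcodes(reads, barcode_to_strain, barcode_len=20):
    """Count reads per strain, assuming the barcode is the read's first bases."""
    counts = Counter()
    for read in reads:
        strain = barcode_to_strain.get(read[:barcode_len])
        if strain is not None:  # reads with unknown barcodes are skipped
            counts[strain] += 1
    return counts


def log2_ratios(control, treatment, pseudocount=1):
    """log2(control / treatment) of normalized counts per strain.

    Positive values mean the strain was depleted in the treatment pool.
    The pseudocount (an assumption) avoids division by zero.
    """
    strains = set(control) | set(treatment)
    c_total = sum(control.values()) + pseudocount * len(strains)
    t_total = sum(treatment.values()) + pseudocount * len(strains)
    return {
        s: log2(((control.get(s, 0) + pseudocount) / c_total)
                / ((treatment.get(s, 0) + pseudocount) / t_total))
        for s in strains
    }
```

    A strain whose barcode drops out of the treatment pool receives a high positive score, which is the kind of signal a chemogenomic assay uses to flag candidate drug targets.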

  5. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data.

    PubMed

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.
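    Agreement between callers on the same data can be quantified in several ways; the abstract does not say which measure the authors used. As a hedged illustration, the sketch below computes the pairwise Jaccard index between call sets, where each call is a hypothetical `(chrom, pos, ref, alt)` tuple and the caller names are placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(calls):
    """Map each unordered pair of caller names to the Jaccard index of their calls."""
    return {
        (x, y): jaccard(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }


if __name__ == "__main__":
    # Toy call sets; positions and alleles are invented for illustration.
    calls = {
        "caller_A": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
        "caller_B": {("chr1", 100, "A", "T")},
    }
    print(pairwise_agreement(calls))
```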

  6. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    PubMed Central

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637

  7. A deep learning framework for improving long-range residue-residue contact prediction using a hierarchical strategy.

    PubMed

    Xiong, Dapeng; Zeng, Jianyang; Gong, Haipeng

    2017-09-01

    Residue-residue contacts are of great value for protein structure prediction, since contact information, especially from long-range residue pairs, can significantly reduce the complexity of conformational sampling for protein structure prediction in practice. Despite progress in the past decade on protein targets with abundant homologous sequences, accurate contact prediction for proteins with limited sequence information is still far from satisfactory. Methodologies for these hard targets still need further improvement. We presented a computational program, DeepConPred, which includes a pipeline of two novel deep-learning-based methods (DeepCCon and DeepRCon) as well as a contact refinement step, to improve the prediction of long-range residue contacts from primary sequences. When compared with previous prediction approaches, our framework employed an effective scheme to identify optimal and important features for contact prediction, and was only trained with coevolutionary information derived from a limited number of homologous sequences to ensure robustness and usefulness for hard targets. Independent tests showed that 59.33%/49.97%, 64.39%/54.01% and 70.00%/59.81% of the top L/5, top L/10 and top 5 predictions were correct for CASP10/CASP11 proteins, respectively. In general, our algorithm ranked as one of the best methods for CASP targets. All source data and codes are available at http://166.111.152.91/Downloads.html. Contact: hgong@tsinghua.edu.cn or zengjy321@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
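    The "top L/5" and "top L/10" figures are precisions over the highest-scoring predicted contacts, where L is the sequence length. The sketch below shows how such a number can be computed. It is not the DeepConPred evaluation code, and the sequence-separation cutoff of 24 residues for "long-range" is a commonly used convention assumed here, since the abstract does not state one.

```python
def top_precision(predictions, true_contacts, seq_len, k, min_sep=24):
    """Precision of the top seq_len // k long-range predictions.

    predictions: iterable of (i, j, score) residue-pair predictions.
    true_contacts: set of (i, j) pairs with i < j.
    min_sep: minimum |i - j| for a pair to count as long-range (assumed value).
    """
    long_range = [(i, j, s) for i, j, s in predictions if abs(i - j) >= min_sep]
    long_range.sort(key=lambda t: t[2], reverse=True)
    n = max(1, seq_len // k)
    top = long_range[:n]
    if not top:
        return 0.0
    hits = sum((min(i, j), max(i, j)) in true_contacts for i, j, _ in top)
    return hits / len(top)
```

    If fewer than L/k long-range predictions are available, this sketch scores whatever is present; other evaluations may handle that case differently.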

  8. Quantitative phenotyping via deep barcode sequencing

    PubMed Central

    Smith, Andrew M.; Heisler, Lawrence E.; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J.; Chee, Mark; Roth, Frederick P.; Giaever, Guri; Nislow, Corey

    2009-01-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or “Bar-seq,” outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that ∼20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene–environment interactions on a genome-wide scale. PMID:19622793

  9. Development of a candidate reference material for adventitious virus detection in vaccine and biologicals manufacturing by deep sequencing

    PubMed Central

    Mee, Edward T.; Preston, Mark D.; Minor, Philip D.; Schepelmann, Silke; Huang, Xuening; Nguyen, Jenny; Wall, David; Hargrove, Stacey; Fu, Thomas; Xu, George; Li, Li; Cote, Colette; Delwart, Eric; Li, Linlin; Hewlett, Indira; Simonyan, Vahan; Ragupathy, Viswanath; Alin, Voskanian-Kordi; Mermod, Nicolas; Hill, Christiane; Ottenwälder, Birgit; Richter, Daniel C.; Tehrani, Arman; Jacqueline, Weber-Lehmann; Cassart, Jean-Pol; Letellier, Carine; Vandeputte, Olivier; Ruelle, Jean-Louis; Deyati, Avisek; La Neve, Fabio; Modena, Chiara; Mee, Edward; Schepelmann, Silke; Preston, Mark; Minor, Philip; Eloit, Marc; Muth, Erika; Lamamy, Arnaud; Jagorel, Florence; Cheval, Justine; Anscombe, Catherine; Misra, Raju; Wooldridge, David; Gharbia, Saheer; Rose, Graham; Ng, Siemon H.S.; Charlebois, Robert L.; Gisonni-Lex, Lucy; Mallet, Laurent; Dorange, Fabien; Chiu, Charles; Naccache, Samia; Kellam, Paul; van der Hoek, Lia; Cotten, Matt; Mitchell, Christine; Baier, Brian S.; Sun, Wenping; Malicki, Heather D.

    2016-01-01

    Background Unbiased deep sequencing offers the potential for improved adventitious virus screening in vaccines and biotherapeutics. Successful implementation of such assays will require appropriate control materials to confirm assay performance and sensitivity. Methods A common reference material containing 25 target viruses was produced and 16 laboratories were invited to process it using their preferred adventitious virus detection assay. Results Fifteen laboratories returned results, obtained using a wide range of wet-lab and informatics methods. Six of 25 target viruses were detected by all laboratories, with the remaining viruses detected by 4–14 laboratories. Six non-target viruses were detected by three or more laboratories. Conclusion The study demonstrated that a wide range of methods are currently used for adventitious virus detection screening in biological products by deep sequencing and that they can yield significantly different results. This underscores the need for common reference materials to ensure satisfactory assay performance and enable comparisons between laboratories. PMID:26709640

  10. A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

    USDA-ARS's Scientific Manuscript database

    Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...

  11. Rational Protein Engineering Guided by Deep Mutational Scanning

    PubMed Central

    Shin, HyeonSeok; Cho, Byung-Kwan

    2015-01-01

    Sequence–function relationship in a protein is commonly determined by the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes made by the substitution of each amino acid sequence that constitutes a protein. Such an informational resource provides the functional role of each amino acid sequence, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design. PMID:26404267

  12. Deep sequencing and in silico analysis of small RNA library reveals novel miRNA from leaf Persicaria minor transcriptome.

    PubMed

    Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan

    2018-03-01

    In the current era, the majority of microRNAs (miRNAs) are being discovered through computational approaches, which are largely confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on previously well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence which represented 11 unique putative miRNA sequences. The PsRobot target prediction tool was deployed to identify the target transcripts of putative novel miRNAs. Most of the predicted target transcripts (mRNAs) were known to be involved in plant development and stress responses. Gene ontology showed that the majority of the putative novel miRNA targets were involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Out of 11 unique putative miRNAs, 7 miRNAs were validated through semi-quantitative PCR. These novel miRNA discoveries in P. minor may develop and update the current public miRNA database.

  13. Deep sequencing methods for protein engineering and design.

    PubMed

    Wrenbeck, Emily E; Faber, Matthew S; Whitehead, Timothy A

    2017-08-01

    The advent of next-generation sequencing (NGS) has revolutionized protein science, and the development of complementary methods enabling NGS-driven protein engineering has followed. In general, these experiments address the functional consequences of thousands of protein variants in a massively parallel manner using genotype-phenotype linked high-throughput functional screens followed by DNA counting via deep sequencing. We highlight the use of information-rich datasets to engineer protein molecular recognition. Examples include the creation of multiple dual-affinity Fabs targeting structurally dissimilar epitopes and engineering of a broad germline-targeted anti-HIV-1 immunogen. Additionally, we highlight the generation of enzyme fitness landscapes for conducting fundamental studies of protein behavior and evolution. We conclude with discussion of technological advances. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Detection of Emerging Vaccine-Related Polioviruses by Deep Sequencing.

    PubMed

    Sahoo, Malaya K; Holubar, Marisa; Huang, ChunHong; Mohamed-Hadley, Alisha; Liu, Yuanyuan; Waggoner, Jesse J; Troy, Stephanie B; Garcia-Garcia, Lourdes; Ferreyra-Reyes, Leticia; Maldonado, Yvonne; Pinsky, Benjamin A

    2017-07-01

    Oral poliovirus vaccine can mutate to regain neurovirulence. To date, evaluation of these mutations has been performed primarily on culture-enriched isolates by using conventional Sanger sequencing. We therefore developed a culture-independent, deep-sequencing method targeting the 5' untranslated region (UTR) and P1 genomic region to characterize vaccine-related poliovirus variants. Error analysis of the deep-sequencing method demonstrated reliable detection of poliovirus mutations at levels of <1%, depending on read depth. Sequencing of viral nucleic acids from the stool of vaccinated, asymptomatic children and their close contacts collected during a prospective cohort study in Veracruz, Mexico, revealed no vaccine-derived polioviruses. This was expected given that the longest duration between sequenced sample collection and the end of the most recent national immunization week was 66 days. However, we identified many low-level variants (<5%) distributed across the 5' UTR and P1 genomic region in all three Sabin serotypes, as well as vaccine-related viruses with multiple canonical mutations associated with phenotypic reversion present at high levels (>90%). These results suggest that monitoring emerging vaccine-related poliovirus variants by deep sequencing may aid in the poliovirus endgame and efforts to ensure global polio eradication. Copyright © 2017 Sahoo et al.

  15. Targeted parallel sequencing of the Musa species: searching for an alternative model system for polyploidy studies

    USDA-ARS's Scientific Manuscript database

    Modern day genomics holds the promise of solving the complexities of basic plant sciences, and of catalyzing practical advances in plant breeding. While contiguous, "base perfect" deep sequencing is a key module of any genome project, recent advances in parallel next generation sequencing technologi...

  16. A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning.

    PubMed

    Li, Haiou; Lyu, Qiang; Cheng, Jianlin

    2016-12-01

    Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.
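    The GDT-TS score used as the benchmark here averages, over four distance cutoffs (1, 2, 4 and 8 Å), the fraction of residues whose predicted position lies within the cutoff of the reference position. The sketch below is a simplified illustration on a 0 to 1 scale, matching the abstract's "greater than 0.7". It assumes the two structures are already superimposed and of equal length, whereas the full GDT procedure searches for the best superposition at each cutoff.

```python
from math import dist


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS on pre-superimposed C-alpha coordinates (0 to 1 scale).

    model, reference: equal-length sequences of (x, y, z) tuples.
    Assumption: no superposition search is performed.
    """
    if not model or len(model) != len(reference):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    distances = [dist(m, r) for m, r in zip(model, reference)]
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return sum(fractions) / len(cutoffs)
```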

  17. Identification of microRNA-like RNAs from Curvularia lunata associated with maize leaf spot by bioinformation analysis and deep sequencing.

    PubMed

    Liu, Tong; Hu, John; Zuo, Yuhu; Jin, Yazhong; Hou, Jumei

    2016-04-01

    Deep sequencing of small RNAs is a useful tool to identify novel small RNAs that may be involved in fungal growth and pathogenesis. In this study, we used HiSeq deep sequencing to identify 747,487 unique small RNAs from Curvularia lunata. Among these small RNAs were 1012 microRNA-like RNAs (milRNAs), which are similar to other known microRNAs, and 48 potential novel milRNAs without homologs in other organisms were identified using the miRBase database. We used quantitative PCR to analyze the expression of four of these milRNAs from C. lunata at different developmental stages. The analysis revealed several changes associated with germinating conidia and mycelial growth, suggesting that these milRNAs may play a role in pathogen infection and mycelial growth. Computational analysis predicted a total of 8334 target mRNAs for the 1012 identified milRNAs and 256 target mRNAs for the 48 novel milRNAs. These target mRNAs were also subjected to gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis. To our knowledge, this study is the first report of C. lunata's milRNA profiles. This information will provide a better understanding of pathogen development and infection mechanisms.

  18. High-throughput sequencing of the entire genomic regions of CCM1/KRIT1, CCM2 and CCM3/PDCD10 to search for pathogenic deep-intronic splice mutations in cerebral cavernous malformations.

    PubMed

    Rath, Matthias; Jenssen, Sönke E; Schwefel, Konrad; Spiegler, Stefanie; Kleimeier, Dana; Sperling, Christian; Kaderali, Lars; Felbor, Ute

    2017-09-01

    Cerebral cavernous malformations (CCM) are vascular lesions of the central nervous system that can cause headaches, seizures and hemorrhagic stroke. Disease-associated mutations have been identified in three genes: CCM1/KRIT1, CCM2 and CCM3/PDCD10. The precise proportion of deep-intronic variants in these genes and their clinical relevance is yet unknown. Here, a long-range PCR (LR-PCR) approach for target enrichment of the entire genomic regions of the three genes was combined with next generation sequencing (NGS) to screen for coding and non-coding variants. NGS detected all six CCM1/KRIT1, two CCM2 and four CCM3/PDCD10 mutations that had previously been identified by Sanger sequencing. Two of the pathogenic variants presented here are novel. Additionally, 20 stringently selected CCM index cases that had remained mutation-negative after conventional sequencing and exclusion of copy number variations were screened for deep-intronic mutations. The combination of bioinformatics filtering and transcript analyses did not reveal any deep-intronic splice mutations in these cases. Our results demonstrate that target enrichment by LR-PCR combined with NGS can be used for a comprehensive analysis of the entire genomic regions of the CCM genes in a research context. However, its clinical utility is limited as deep-intronic splice mutations in CCM1/KRIT1, CCM2 and CCM3/PDCD10 seem to be rather rare. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  19. Clinical utility of circulating tumor DNA for molecular assessment in pancreatic cancer.

    PubMed

    Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Morizane, Chigusa; Nara, Satoshi; Hama, Natsuko; Suzuki, Masami; Furukawa, Eisaku; Kato, Mamoru; Hayashi, Hideyuki; Kohno, Takashi; Ueno, Hideki; Shimada, Kazuaki; Okusaka, Takuji; Nakagama, Hitoshi; Shibata, Tatsuhiro; Yachida, Shinichi

    2015-12-16

    Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect and monitor molecular characteristics of tumors. In the present study, we determined the mutational status of KRAS in plasma cfDNA using multiplex picoliter-droplet digital PCR in 259 patients with PDAC. We constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA and matched germline DNA samples in 48 patients who had ≥1% mutant allele frequencies of KRAS in plasma cfDNA. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2%) examined by targeted deep sequencing of cfDNA. We also analyzed somatic copy number alterations based on the targeted sequencing data using our in-house algorithm, and potentially targetable amplifications were detected. Assessment of mutations and copy number alterations in plasma cfDNA may provide a prognostic and diagnostic tool to assist decisions regarding optimal therapeutic strategies for PDAC patients.

  20. Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity.

    PubMed

    Kim, Hui Kwon; Min, Seonwoo; Song, Myungjae; Jung, Soobin; Choi, Jae Woo; Kim, Younggwang; Lee, Sangeun; Yoon, Sungroh; Kim, Hyongbum Henry

    2018-03-01

    We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.

  1. Geoseq: a tool for dissecting deep-sequencing datasets.

    PubMed

    Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

    2010-10-12

    Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
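    The tiling idea can be sketched in a few lines. This is an illustration only and not Geoseq's implementation: Geoseq uses suffix arrays over pre-computed libraries, whereas the sketch substitutes a plain substring search, which is far slower but shows the same computation. The `tile_len` and `step` parameters are hypothetical.

```python
def tile_coverage(reference, reads, tile_len=20, step=1):
    """Return (offset, count) pairs: how many reads contain each reference tile.

    The reference is cut into tiles of `tile_len` bases every `step` bases.
    A naive substring test stands in for Geoseq's suffix-array lookup.
    """
    coverage = []
    for start in range(0, len(reference) - tile_len + 1, step):
        tile = reference[start:start + tile_len]
        coverage.append((start, sum(tile in read for read in reads)))
    return coverage


if __name__ == "__main__":
    # Toy sequences, invented for illustration.
    print(tile_coverage("ACGTACGGTT", ["TTACGTAC", "ACGGTT"], tile_len=4, step=2))
```

    Plotting the counts against tile offset gives a coverage profile along the query sequence, which is the kind of output the abstract describes.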

  2. A comprehensive survey of 3' animal miRNA modification events and a possible role for 3' adenylation in modulating miRNA targeting effectiveness.

    PubMed

    Burroughs, A Maxwell; Ando, Yoshinari; de Hoon, Michiel J L; Tomaru, Yasuhiro; Nishibu, Takahiro; Ukekawa, Ryo; Funakoshi, Taku; Kurokawa, Tsutomu; Suzuki, Harukazu; Hayashizaki, Yoshihide; Daub, Carsten O

    2010-10-01

    Animal microRNA sequences are subject to 3' nucleotide addition. Through detailed analysis of deep-sequenced short RNA data sets, we show adenylation and uridylation of miRNA is globally present and conserved across Drosophila and vertebrates. To better understand 3' adenylation function, we deep-sequenced RNA after knockdown of nucleotidyltransferase enzymes. The PAPD4 nucleotidyltransferase adenylates a wide range of miRNA loci, but adenylation does not appear to affect miRNA stability on a genome-wide scale. Adenine addition appears to reduce effectiveness of miRNA targeting of mRNA transcripts while deep-sequencing of RNA bound to immunoprecipitated Argonaute (AGO) subfamily proteins EIF2C1-EIF2C3 revealed substantial reduction of adenine addition in miRNA associated with EIF2C2 and EIF2C3. Our findings show 3' addition events are widespread and conserved across animals, PAPD4 is a primary miRNA adenylating enzyme, and suggest a role for 3' adenine addition in modulating miRNA effectiveness, possibly through interfering with incorporation into the RNA-induced silencing complex (RISC), a regulatory role that would complement the role of miRNA uridylation in blocking DICER1 uptake.
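    Detecting 3' additions in sequenced reads amounts to comparing each read with the annotated mature miRNA and inspecting any extra bases at the 3' end. The sketch below is a deliberately simplified illustration, not the authors' procedure: it assumes the read begins exactly with the mature sequence and does not check whether the tail could be templated by the genomic precursor, which a real analysis would need to rule out.

```python
def classify_3p_addition(read, mature):
    """Classify extra 3' bases on a read relative to the mature miRNA sequence.

    Returns "adenylation" or "uridylation" for pure A or U/T tails, "other"
    for mixed tails, and None when the read has no 3' extension or does not
    start with the mature sequence. Inputs are assumed to share one alphabet
    (both DNA-style or both RNA-style) and one letter case.
    """
    if not read.startswith(mature) or len(read) == len(mature):
        return None
    tail = read[len(mature):].replace("U", "T")
    if set(tail) == {"A"}:
        return "adenylation"
    if set(tail) == {"T"}:
        return "uridylation"
    return "other"
```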

  3. Site-directed mutagenesis in Petunia × hybrida protoplast system using direct delivery of purified recombinant Cas9 ribonucleoproteins.

    PubMed

    Subburaj, Saminathan; Chung, Sung Jin; Lee, Choongil; Ryu, Seuk-Min; Kim, Duk Hyoung; Kim, Jin-Soo; Bae, Sangsu; Lee, Geung-Joo

    2016-07-01

    Site-directed mutagenesis of nitrate reductase genes using direct delivery of purified Cas9 protein preassembled with guide RNA produces mutations efficiently in Petunia × hybrida protoplast system. The clustered, regularly interspaced, short palindromic repeat (CRISPR)-CRISPR associated endonuclease 9 (CRISPR/Cas9) system has been recently announced as a powerful molecular breeding tool for site-directed mutagenesis in higher plants. Here, we report a site-directed mutagenesis method targeting Petunia nitrate reductase (NR) gene locus. This method could create mutations efficiently using direct delivery of purified Cas9 protein and single guide RNA (sgRNA) into protoplast cells. After transient introduction of RNA-guided endonuclease (RGEN) ribonucleoproteins (RNPs) with different sgRNAs targeting NR genes, mutagenesis at the targeted loci was detected by T7E1 assay and confirmed by targeted deep sequencing. T7E1 assay showed that RGEN RNPs induced site-specific mutations at frequencies ranging from 2.4 to 21 % at four different sites (NR1, 2, 4 and 6) in the PhNR gene locus with average mutation efficiency of 14.9 ± 2.2 %. Targeted deep DNA sequencing revealed mutation rates of 5.3-17.8 % with average mutation rate of 11.5 ± 2 % at the same NR gene target sites in DNA fragments of analyzed protoplast transfectants. Further analysis from targeted deep sequencing showed that the average ratio of deletion to insertion produced collectively by the four NR-RGEN target sites (NR1, 2, 4, and 6) was about 63:37. Our results demonstrated that direct delivery of RGEN RNPs into protoplast cells of Petunia can be exploited as an efficient tool for site-directed mutagenesis of genes or genome editing in plant systems.
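
    The mutation rate and the deletion-to-insertion ratio reported from targeted deep sequencing are simple proportions over amplicon reads. The sketch below shows the arithmetic under a strong simplifying assumption that is not taken from the paper: a read is called an insertion or a deletion purely from its length relative to the reference amplicon, without alignment. The function name is hypothetical.

```python
def indel_summary(reference, reads):
    """Summarize indel frequency among amplicon reads by length comparison."""
    insertions = deletions = 0
    for read in reads:
        delta = len(read) - len(reference)
        if delta > 0:
            insertions += 1      # longer than the reference amplicon
        elif delta < 0:
            deletions += 1       # shorter than the reference amplicon
    mutated = insertions + deletions
    total = len(reads)
    return {
        # Percentage of all reads carrying an indel.
        "mutation_rate_percent": 100.0 * mutated / total if total else 0.0,
        # Share of deletions and insertions among mutated reads.
        "deletion_share_percent": 100.0 * deletions / mutated if mutated else 0.0,
        "insertion_share_percent": 100.0 * insertions / mutated if mutated else 0.0,
    }
```

    With this bookkeeping, a deletion share of 63 % and an insertion share of 37 % would correspond to the roughly 63:37 ratio quoted in the abstract.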

  4. pseudoMap: an innovative and comprehensive resource for identification of siRNA-mediated mechanisms in human transcribed pseudogenes.

    PubMed

    Chan, Wen-Ling; Yang, Wen-Kuang; Huang, Hsien-Da; Chang, Jan-Gowth

    2013-01-01

    RNA interference (RNAi) is a gene silencing process within living cells, which is controlled by the RNA-induced silencing complex in a sequence-specific manner. In flies and mice, pseudogene transcripts can be processed into short interfering RNAs (siRNAs) that regulate protein-coding genes through the RNAi pathway. Following these findings, we constructed an innovative and comprehensive database to elucidate siRNA-mediated mechanisms in human transcribed pseudogenes (TPGs). To investigate TPGs producing siRNAs that regulate protein-coding genes, we mapped the TPGs to small RNAs (sRNAs) that were supported by publicly available deep sequencing data from various sRNA libraries and constructed the TPG-derived siRNA-target interactions. In addition, we showed that TPGs can act as targets for miRNAs that actually regulate the parental gene. To enable the systematic compilation and updating of these results and additional information, we have developed a database, pseudoMap, capturing various types of information, including sequence data, TPG and cognate annotation, deep sequencing data, RNA-folding structure, gene expression profiles, miRNA annotation and target prediction. To our knowledge, pseudoMap is the first database to demonstrate two mechanisms of human TPGs: encoding siRNAs and decoying miRNAs that target the parental gene. pseudoMap is freely accessible at http://pseudomap.mbc.nctu.edu.tw/.

  5. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. http://raptorx.uchicago.edu/ContactMap/.
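
    The "top L" and "top L/10" long-range accuracies quoted in this abstract can be computed as sketched below. The sketch makes assumptions not stated in the abstract: long-range is taken to mean a sequence separation of at least 24 residues (a common convention in contact-prediction benchmarks), the denominator is the number of predictions actually selected, and the function and parameter names are hypothetical.

```python
def top_k_long_range_accuracy(scores, true_contacts, seq_len, divisor=1, min_sep=24):
    """Fraction of the top L/divisor long-range predictions that are true contacts.

    scores:        dict mapping residue pairs (i, j) to predicted contact scores
    true_contacts: set of residue pairs (i, j) in contact in the native structure
    seq_len:       protein length L
    """
    # Keep only pairs far enough apart in sequence to count as long-range.
    candidates = [(score, i, j) for (i, j), score in scores.items()
                  if abs(i - j) >= min_sep]
    candidates.sort(reverse=True)            # highest score first
    k = max(1, seq_len // divisor)
    top = candidates[:k]
    if not top:
        return 0.0
    hits = sum((i, j) in true_contacts for _, i, j in top)
    return hits / len(top)
```

    Setting divisor=1 gives the top L accuracy and divisor=10 the top L/10 accuracy; averaging the per-protein values over a test set yields figures of the kind reported above.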

  6. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. Method This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then. Availability http://raptorx.uchicago.edu/ContactMap/ PMID:28056090

  7. Rapid Fine Conformational Epitope Mapping Using Comprehensive Mutagenesis and Deep Sequencing

    PubMed Central

    Kowalsky, Caitlin A.; Faber, Matthew S.; Nath, Aritro; Dann, Hailey E.; Kelly, Vince W.; Liu, Li; Shanker, Purva; Wagner, Ellen K.; Maynard, Jennifer A.; Chan, Christina; Whitehead, Timothy A.

    2015-01-01

    Knowledge of the fine location of neutralizing and non-neutralizing epitopes on human pathogens affords a better understanding of the structural basis of antibody efficacy, which will expedite rational design of vaccines, prophylactics, and therapeutics. However, full utilization of the wealth of information from single cell techniques and antibody repertoire sequencing awaits the development of a high throughput, inexpensive method to map the conformational epitopes for antibody-antigen interactions. Here we show such an approach that combines comprehensive mutagenesis, cell surface display, and DNA deep sequencing. We develop analytical equations to identify epitope positions and demonstrate the method's effectiveness by mapping the fine epitope for different antibodies targeting TNF, pertussis toxin, and the cancer target TROP2. In all three cases, the experimentally determined conformational epitope was consistent with previous experimental datasets, confirming the reliability of the experimental pipeline. Once the comprehensive library is generated, fine conformational epitope maps can be prepared at a rate of four per day. PMID:26296891

  8. Identification of novel microRNAs in Hevea brasiliensis and computational prediction of their targets

    PubMed Central

    2012-01-01

    Background Plants respond to external stimuli through fine regulation of gene expression partially ensured by small RNAs. Of these, microRNAs (miRNAs) play a crucial role. They negatively regulate gene expression by targeting the cleavage or translational inhibition of target messenger RNAs (mRNAs). In Hevea brasiliensis, environmental and harvesting stresses are known to affect natural rubber production. This study set out to identify abiotic stress-related miRNAs in Hevea using next-generation sequencing and bioinformatic analysis. Results Deep sequencing of small RNAs was carried out on plantlets subjected to severe abiotic stress using the Solexa technique. By combining the LeARN pipeline, data from the Plant microRNA database (PMRD) and Hevea EST sequences, we identified 48 conserved miRNA families already characterized in other plant species, and 10 putatively novel miRNA families. The results showed the most abundant size for miRNAs to be 24 nucleotides, except for seven families. Several MIR genes produced both 20-22 nucleotides and 23-27 nucleotides. The two miRNA class sizes were detected for both conserved and putative novel miRNA families, suggesting their functional duality. The EST databases were scanned with conserved and novel miRNA sequences. MiRNA targets were computationally predicted and analysed. The predicted targets involved in "responses to stimuli" and to "antioxidant" and "transcription activities" are presented. Conclusions Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs when the complete genome is not yet available. Our study provided additional information for evolutionary studies and revealed potentially specific regulation of the control of redox status in Hevea. PMID:22330773

  9. Identifying MicroRNAs and Transcript Targets in Jatropha Seeds

    PubMed Central

    Galli, Vanessa; Guzman, Frank; de Oliveira, Luiz F. V.; Loss-Morais, Guilherme; Körbes, Ana P.; Silva, Sérgio D. A.; Margis-Pinheiro, Márcia M. A. N.; Margis, Rogério

    2014-01-01

    MicroRNAs, or miRNAs, are endogenously encoded small RNAs that play a key role in diverse plant biological processes. Jatropha curcas L. has received significant attention as a potential oilseed crop for the production of renewable oil. Here, a sRNA library of mature seeds and three mRNA libraries from three different seed development stages were generated by deep sequencing to identify and characterize the miRNAs and pre-miRNAs of J. curcas. Computational analysis was used for the identification of 180 conserved miRNAs and 41 precursors (pre-miRNAs) as well as 16 novel pre-miRNAs. The predicted miRNA target genes are involved in a broad range of physiological functions, including cellular structure, nuclear function, translation, transport, hormone synthesis, defense, and lipid metabolism. Some pre-miRNA and miRNA targets vary in abundance between the three stages of seed development. A search for sequences that produce siRNA was performed, and the results indicated that J. curcas siRNAs play a role in nuclear functions, transport, catalytic processes and disease resistance. This study presents the first large scale identification of J. curcas miRNAs and their targets in mature seeds based on deep sequencing, and it contributes to a functional understanding of these miRNAs. PMID:24551031

  10. Clinical Utility of Circulating Tumor DNA for Molecular Assessment and Precision Medicine in Pancreatic Cancer.

    PubMed

    Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Kato, Mamoru; Shibata, Tatsuhiro; Yachida, Shinichi

    2016-01-01

    Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect molecular characteristics of tumors, supporting the concept of "liquid biopsy". We determined the mutational status of KRAS in plasma cfDNA using multiplex droplet digital PCR in 259 patients with PDAC, retrospectively. Furthermore, we constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA in 48 patients who had ≥1 % mutant allele frequencies of KRAS in plasma cfDNA. Droplet digital PCR detected KRAS mutations in plasma cfDNA in 63 of 107 (58.9 %) patients with inoperable tumors. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2 %) examined by cfDNA sequencing. Our two-step approach with plasma cfDNA, combining droplet digital PCR and targeted deep sequencing, is a feasible clinical approach. Assessment of mutations in plasma cfDNA may provide a new diagnostic tool, assisting decisions for optimal therapeutic strategies for PDAC patients.

  11. Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

    PubMed

    Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

    2017-01-03

    Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
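
    The reason molecular barcodes suppress amplification and sequencing errors can be shown with a small sketch. It is deliberately not smCounter's Bayesian model: it substitutes a plain majority vote within each barcode family, and the function names and the two thresholds are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=2, min_agreement=0.8):
    """Collapse reads sharing a molecular barcode into one consensus base.

    reads: iterable of (barcode, base) pairs observed at a single position.
    Families that are too small or too discordant are discarded.
    """
    families = defaultdict(list)
    for barcode, base in reads:
        families[barcode].append(base)

    consensus = {}
    for barcode, bases in families.items():
        if len(bases) < min_family_size:
            continue                        # a single read cannot be error-checked
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base       # one vote per original molecule
    return consensus


def allele_fraction(consensus, alt_base):
    """Fraction of barcode families whose consensus base is alt_base."""
    if not consensus:
        return 0.0
    return sum(base == alt_base for base in consensus.values()) / len(consensus)
```

    Because each original DNA molecule contributes a single consensus call, an error that appears in only a minority of reads within a family does not inflate the estimated allele fraction.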

  12. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    PubMed

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms yet their functional conservation as sRNA-binding proteins has not entirely been clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitidis, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  13. Deep learning methods for protein torsion angle prediction.

    PubMed

    Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin

    2017-09-18

    Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We designed four different deep learning architectures to predict protein torsion angles: a deep neural network (DNN), a deep restricted Boltzmann machine (DRBM), a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), chosen because protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of phi angle is comparable to the existing methods, but the MAE of psi angle is 29°, 2° lower than the existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
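
    Because torsion angles are periodic, an MAE such as the one reported above is usually computed on the shorter arc between predicted and observed angles. The abstract does not say how the authors handled periodicity, so the sketch below reflects a common convention rather than their exact evaluation; the function name is hypothetical.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two lists of angles given in degrees.

    Each error is measured along the shorter arc, so 179 and -179 degrees
    differ by 2 degrees rather than 358.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)
```

    The function would be applied separately to the phi and psi angles of all residues in the evaluation set.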

  14. The minimal amount of starting DNA for Agilent’s hybrid capture-based targeted massively parallel sequencing

    PubMed Central

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  15. Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionary conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionary conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. 
    The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr. PMID:23163954

  16. Uncovering Small RNA-Mediated Responses to Cold Stress in a Wheat Thermosensitive Genic Male-Sterile Line by Deep Sequencing

    PubMed Central

    Tang, Zhonghui; Zhang, Liping; Xu, Chenguang; Yuan, Shaohua; Zhang, Fengting; Zheng, Yonglian; Zhao, Changping

    2012-01-01

    The male sterility of thermosensitive genic male sterile (TGMS) lines of wheat (Triticum aestivum) is strictly controlled by temperature. The early phase of anther development is especially susceptible to cold stress. MicroRNAs (miRNAs) play an important role in plant development and in responses to environmental stress. In this study, deep sequencing of small RNA (smRNA) libraries obtained from spike tissues of the TGMS line under cold and control conditions identified a total of 78 unique miRNA sequences from 30 families and trans-acting small interfering RNAs (tasiRNAs) derived from two TAS3 genes. To identify smRNA targets in the wheat TGMS line, we applied the degradome sequencing method, which globally and directly identifies the remnants of smRNA-directed target cleavage. We identified 26 targets of 16 miRNA families and three targets of tasiRNAs. Comparing smRNA sequencing data sets and TaqMan quantitative polymerase chain reaction results, we identified six miRNAs and one tasiRNA (tasiRNA-ARF [for Auxin-Responsive Factor]) as cold stress-responsive smRNAs in spike tissues of the TGMS line. We also determined the expression profiles of target genes that encode transcription factors in response to cold stress. Interestingly, the expression of cold stress-responsive smRNAs integrated in the auxin-signaling pathway and their target genes was largely noncorrelated. We investigated the tissue-specific expression of smRNAs using a tissue microarray approach. Our data indicated that miR167 and tasiRNA-ARF play roles in regulating the auxin-signaling pathway and possibly in the developmental response to cold stress. These data provide evidence that smRNA regulatory pathways are linked with male sterility in the TGMS line during cold stress. PMID:22508932

  17. ampliMethProfiler: a pipeline for the analysis of CpG methylation profiles of targeted deep bisulfite sequenced amplicons.

    PubMed

    Scala, Giovanni; Affinito, Ornella; Palumbo, Domenico; Florio, Ermanno; Monticelli, Antonella; Miele, Gennaro; Chiariotti, Lorenzo; Cocozza, Sergio

    2016-11-25

    CpG sites in an individual molecule may exist in a binary state (methylated or unmethylated) and each individual DNA molecule, containing a certain number of CpGs, is a combination of these states defining an epihaplotype. Classic quantification based approaches to study DNA methylation are intrinsically unable to fully represent the complexity of the underlying methylation substrate. Epihaplotype based approaches, on the other hand, allow methylation profiles of cell populations to be studied at the single molecule level. For such investigations, next-generation sequencing techniques can be used, both for quantitative and for epihaplotype analysis. Currently available tools for methylation analysis lack output formats that explicitly report CpG methylation profiles at the single molecule level and that are supported by suitable statistical tools for their interpretation. Here we present ampliMethProfiler, a Python-based pipeline for the extraction and statistical epihaplotype analysis of amplicons from targeted deep bisulfite sequencing of multiple DNA regions. The ampliMethProfiler tool provides an easy and user-friendly way to extract and analyze the epihaplotype composition of reads from targeted bisulfite sequencing experiments. ampliMethProfiler is written in Python and requires a local installation of BLAST and (optionally) QIIME tools. It can be run on Linux and OS X platforms. The software is open source and freely available at http://amplimethprofiler.sourceforge.net.
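
    The notion of an epihaplotype can be made concrete with a short sketch. It does not reproduce ampliMethProfiler, which according to the abstract relies on BLAST; instead it assumes reads that are already ungapped, full-length matches to the top strand of the reference amplicon, and the function names are hypothetical. After bisulfite conversion an unmethylated C is read as T, while a methylated C is still read as C.

```python
from collections import Counter


def cpg_positions(reference):
    """Return the indices of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def epihaplotype(reference, read):
    """Encode one bisulfite read as a string of CpG states.

    '1' = methylated (C retained), '0' = unmethylated (C read as T),
    '?' = any other base at a CpG position.
    """
    if len(read) != len(reference):
        raise ValueError("sketch assumes ungapped, full-length reads")
    calls = {"C": "1", "T": "0"}
    return "".join(calls.get(read[i], "?") for i in cpg_positions(reference))


def epihaplotype_counts(reference, reads):
    """Count how many reads carry each epihaplotype."""
    return Counter(epihaplotype(reference, read) for read in reads)
```

    The resulting counts describe the composition of the cell population at the single-molecule level, which is the kind of information an average methylation percentage per CpG cannot convey.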

  18. Magnetic resonance imaging of the subthalamic nucleus for deep brain stimulation.

    PubMed

    Chandran, Arjun S; Bynevelt, Michael; Lind, Christopher R P

    2016-01-01

    The subthalamic nucleus (STN) is one of the most important stereotactic targets in neurosurgery, and its accurate imaging is crucial. With improving MRI sequences there is impetus for direct targeting of the STN. High-quality, distortion-free images are paramount. Image reconstruction techniques appear to show the greatest promise in balancing the issue of geometrical distortion and STN edge detection. Existing spin echo- and susceptibility-based MRI sequences are compared with new image reconstruction methods. Quantitative susceptibility mapping is the most promising technique for stereotactic imaging of the STN.

  19. Action-Driven Visual Object Tracking With Deep Reinforcement Learning.

    PubMed

    Yun, Sangdoo; Choi, Jongwon; Yoo, Youngjoon; Yun, Kimin; Choi, Jin Young

    2018-06-01

    In this paper, we propose an efficient visual tracker, which directly captures a bounding box containing the target object in a video by means of sequential actions learned using deep neural networks. The proposed deep neural network to control tracking actions is pretrained using various training video sequences and fine-tuned during actual tracking for online adaptation to a change of target and background. The pretraining is done by utilizing deep reinforcement learning (RL) as well as supervised learning. The use of RL enables even partially labeled data to be successfully utilized for semisupervised learning. Through the evaluation of the object tracking benchmark data set, the proposed tracker is validated to achieve a competitive performance at three times the speed of existing deep network-based trackers. The fast version of the proposed method, which operates in real time on graphics processing unit, outperforms the state-of-the-art real-time trackers with an accuracy improvement of more than 8%.

  20. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy

    PubMed Central

    Matkovich, Scot J.; Dorn, Gerald W.

    2018-01-01

    MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses. PMID:25836573

  1. Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy.

    PubMed

    Matkovich, Scot J; Dorn, Gerald W

    2015-01-01

    MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses.

  2. Deep sequencing identifies circulating mouse miRNAs that are functionally implicated in manifestations of aging and responsive to calorie restriction.

    PubMed

    Dhahbi, Joseph M; Spindler, Stephen R; Atamna, Hani; Yamakawa, Amy; Guerrero, Noel; Boffelli, Dario; Mote, Patricia; Martin, David I K

    2013-02-01

    MicroRNAs (miRNAs) function to modulate gene expression, and through this property they regulate a broad spectrum of cellular processes. They can circulate in blood and thereby mediate cell-to-cell communication. Aging involves changes in many cellular processes that are potentially regulated by miRNAs, and some evidence has implicated circulating miRNAs in the aging process. In order to initiate a comprehensive assessment of the role of circulating miRNAs in aging, we have used deep sequencing to characterize circulating miRNAs in the serum of young mice, old mice, and old mice maintained on calorie restriction (CR). Deep sequencing identifies a set of novel miRNAs, and also accurately measures all known miRNAs present in serum. This analysis demonstrates that the levels of many miRNAs circulating in the mouse are increased with age, and that the increases can be antagonized by CR. The genes targeted by this set of age-modulated miRNAs are predicted to regulate biological processes directly relevant to the manifestations of aging including metabolic changes, and the miRNAs themselves have been linked to diseases associated with old age. This finding implicates circulating miRNAs in the aging process, raising questions about their tissues of origin, their cellular targets, and their functional role in metabolic changes that occur with aging.

  3. RNAi-mediated endogene silencing in strawberry fruit: detection of primary and secondary siRNAs by deep sequencing.

    PubMed

    Härtl, Katja; Kalinowski, Gregor; Hoffmann, Thomas; Preuss, Anja; Schwab, Wilfried

    2017-05-01

    RNA interference (RNAi) has been exploited as a reverse genetic tool for functional genomics in the nonmodel species strawberry (Fragaria × ananassa) since 2006. Here, we analysed for the first time different but overlapping nucleotide sections (>200 nt) of two endogenous genes, FaCHS (chalcone synthase) and FaOMT (O-methyltransferase), as inducer sequences and a transitive vector system to compare their gene silencing efficiencies. In total, ten vectors were assembled each containing the nucleotide sequence of one fragment in sense and corresponding antisense orientation separated by an intron (inverted hairpin construct, ihp). All sequence fragments along the full lengths of both target genes resulted in a significant down-regulation of the respective gene expression and related metabolite levels. Quantitative PCR data and successful application of a transitive vector system coinciding with a phenotypic change suggested propagation of the silencing signal. The spreading of the signal in strawberry fruit in the 3' direction was shown for the first time by the detection of secondary small interfering RNAs (siRNAs) outside of the primary targets by deep sequencing. Down-regulation of endogenes by the transitive method was less effective than silencing by ihp constructs probably because the numbers of primary siRNAs exceeded the quantity of secondary siRNAs by three orders of magnitude. Besides, we observed consistent hotspots of primary and secondary siRNA formation along the target sequence which fall within a distance of less than 200 nt. Thus, ihp vectors seem to be superior over the transitive vector system for functional genomics in strawberry fruit. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  4. Exome and deep sequencing of clinically aggressive neuroblastoma reveal somatic mutations that affect key pathways involved in cancer progression

    PubMed Central

    Lasorsa, Vito Alessandro; Formicola, Daniela; Pignataro, Piero; Cimmino, Flora; Calabrese, Francesco Maria; Mora, Jaume; Esposito, Maria Rosaria; Pantile, Marcella; Zanon, Carlo; De Mariano, Marilena; Longo, Luca; Hogarty, Michael D.; de Torres, Carmen; Tonini, Gian Paolo; Iolascon, Achille; Capasso, Mario

    2016-01-01

    The spectrum of somatic mutation of the most aggressive forms of neuroblastoma is not completely determined. We sought to identify potential cancer drivers in clinically aggressive neuroblastoma. Whole exome sequencing was conducted on 17 germline and tumor DNA samples from high-risk patients with adverse events within 36 months from diagnosis (HR-Event3) to identify somatic mutations and deep targeted sequencing of 134 genes selected from the initial screening in additional 48 germline and tumor pairs (62.5% HR-Event3 and high-risk patients), 17 HR-Event3 tumors and 17 human-derived neuroblastoma cell lines. We revealed 22 significantly mutated genes, many of which implicated in cancer progression. Fifteen genes (68.2%) were highly expressed in neuroblastoma supporting their involvement in the disease. CHD9, a cancer driver gene, was the most significantly altered (4.0% of cases) after ALK. Other genes (PTK2, NAV3, NAV1, FZD1 and ATRX), expressed in neuroblastoma and involved in cell invasion and migration were mutated at frequency ranged from 4% to 2%. Focal adhesion and regulation of actin cytoskeleton pathways, were frequently disrupted (14.1% of cases) thus suggesting potential novel therapeutic strategies to prevent disease progression. Notably BARD1, CHEK2 and AXIN2 were enriched in rare, potentially pathogenic, germline variants. In summary, whole exome and deep targeted sequencing identified novel cancer genes of clinically aggressive neuroblastoma. Our analyses show pathway-level implications of infrequently mutated genes in leading neuroblastoma progression. PMID:27009842

  5. Exome and deep sequencing of clinically aggressive neuroblastoma reveal somatic mutations that affect key pathways involved in cancer progression.

    PubMed

    Lasorsa, Vito Alessandro; Formicola, Daniela; Pignataro, Piero; Cimmino, Flora; Calabrese, Francesco Maria; Mora, Jaume; Esposito, Maria Rosaria; Pantile, Marcella; Zanon, Carlo; De Mariano, Marilena; Longo, Luca; Hogarty, Michael D; de Torres, Carmen; Tonini, Gian Paolo; Iolascon, Achille; Capasso, Mario

    2016-04-19

    The spectrum of somatic mutation of the most aggressive forms of neuroblastoma is not completely determined. We sought to identify potential cancer drivers in clinically aggressive neuroblastoma. Whole exome sequencing was conducted on 17 germline and tumor DNA samples from high-risk patients with adverse events within 36 months from diagnosis (HR-Event3) to identify somatic mutations and deep targeted sequencing of 134 genes selected from the initial screening in additional 48 germline and tumor pairs (62.5% HR-Event3 and high-risk patients), 17 HR-Event3 tumors and 17 human-derived neuroblastoma cell lines. We revealed 22 significantly mutated genes, many of which implicated in cancer progression. Fifteen genes (68.2%) were highly expressed in neuroblastoma supporting their involvement in the disease. CHD9, a cancer driver gene, was the most significantly altered (4.0% of cases) after ALK. Other genes (PTK2, NAV3, NAV1, FZD1 and ATRX), expressed in neuroblastoma and involved in cell invasion and migration were mutated at frequency ranged from 4% to 2%. Focal adhesion and regulation of actin cytoskeleton pathways, were frequently disrupted (14.1% of cases) thus suggesting potential novel therapeutic strategies to prevent disease progression. Notably BARD1, CHEK2 and AXIN2 were enriched in rare, potentially pathogenic, germline variants. In summary, whole exome and deep targeted sequencing identified novel cancer genes of clinically aggressive neuroblastoma. Our analyses show pathway-level implications of infrequently mutated genes in leading neuroblastoma progression.

  6. Integrated design, execution, and analysis of arrayed and pooled CRISPR genome-editing experiments.

    PubMed

    Canver, Matthew C; Haeussler, Maximilian; Bauer, Daniel E; Orkin, Stuart H; Sanjana, Neville E; Shalem, Ophir; Yuan, Guo-Cheng; Zhang, Feng; Concordet, Jean-Paul; Pinello, Luca

    2018-05-01

    CRISPR (clustered regularly interspaced short palindromic repeats) genome-editing experiments offer enormous potential for the evaluation of genomic loci using arrayed single guide RNAs (sgRNAs) or pooled sgRNA libraries. Numerous computational tools are available to help design sgRNAs with optimal on-target efficiency and minimal off-target potential. In addition, computational tools have been developed to analyze deep-sequencing data resulting from genome-editing experiments. However, these tools are typically developed in isolation and oftentimes are not readily translatable into laboratory-based experiments. Here, we present a protocol that describes in detail both the computational and benchtop implementation of an arrayed and/or pooled CRISPR genome-editing experiment. This protocol provides instructions for sgRNA design with CRISPOR (computational tool for the design, evaluation, and cloning of sgRNA sequences), experimental implementation, and analysis of the resulting high-throughput sequencing data with CRISPResso (computational tool for analysis of genome-editing outcomes from deep-sequencing data). This protocol allows for design and execution of arrayed and pooled CRISPR experiments in 4-5 weeks by non-experts, as well as computational data analysis that can be performed in 1-2 d by both computational and noncomputational biologists alike using web-based and/or command-line versions.

  7. Fungal diversity in deep-sea sediments associated with asphalt seeps at the Sao Paulo Plateau

    NASA Astrophysics Data System (ADS)

    Nagano, Yuriko; Miura, Toshiko; Nishi, Shinro; Lima, Andre O.; Nakayama, Cristina; Pellizari, Vivian H.; Fujikura, Katsunori

    2017-12-01

    We investigated the fungal diversity in a total of 20 deep-sea sediment samples (of which 14 samples were associated with natural asphalt seeps and 6 samples were not associated) collected from two different sites at the Sao Paulo Plateau off Brazil by Ion Torrent PGM targeting ITS region of ribosomal RNA. Our results suggest that diverse fungi (113 operational taxonomic units (OTUs) based on clustering at 97% sequence similarity assigned into 9 classes and 31 genera) are present in deep-sea sediment samples collected at the Sao Paulo Plateau, dominated by Ascomycota (74.3%), followed by Basidiomycota (11.5%), unidentified fungi (7.1%), and sequences with no affiliation to any organisms in the public database (7.1%). However, it was revealed that only three species, namely Penicillium sp., Cadophora malorum and Rhodosporidium diobovatum, were dominant, with the majority of OTUs remaining a minor community. Unexpectedly, there was no significant difference in major fungal community structure between the asphalt seep and non-asphalt seep sites, despite the presence of mass hydrocarbon deposits and the high amount of macro organisms surrounding the asphalt seeps. However, there were some differences in the minor fungal communities, with possible asphalt degrading fungi present specifically in the asphalt seep sites. In contrast, some differences were found between the two different sampling sites. Classification of OTUs revealed that only 47 (41.6%) fungal OTUs exhibited >97% sequence similarity, in comparison with pre-existing ITS sequences in public databases, indicating that a majority of deep-sea inhabiting fungal taxa still remain undescribed. Although our knowledge on fungi and their role in deep-sea environments is still limited and scarce, this study increases our understanding of fungal diversity and community structure in deep-sea environments.
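
    The 47-of-113 (41.6%) figure in the record above is a simple threshold count over per-OTU best-hit similarities. The sketch below illustrates that arithmetic only; the function name and the placeholder similarity values are assumptions for illustration and are not data from the study.

```python
from typing import Iterable, Tuple


def share_above_threshold(similarities: Iterable[float],
                          threshold: float = 97.0) -> Tuple[int, float]:
    """Count OTUs whose best-hit similarity (%) is strictly above `threshold`.

    Returns (count, percentage of all OTUs). Hypothetical helper, not the
    authors' pipeline.
    """
    values = list(similarities)
    if not values:
        raise ValueError("need at least one OTU similarity value")
    hits = sum(1 for s in values if s > threshold)
    return hits, 100.0 * hits / len(values)


# Placeholder values chosen only to reproduce the counts quoted in the
# abstract: 47 OTUs above 97% similarity out of 113 OTUs in total.
placeholder = [98.0] * 47 + [90.0] * 66
count, pct = share_above_threshold(placeholder)
print(count, round(pct, 1))
```

    The comparison is strict (`>`), matching the ">97%" wording of the abstract; an OTU at exactly 97% would not be counted.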

  8. Comparative sequencing analysis reveals high genomic concordance between matched primary and metastatic colorectal cancer lesions.

    PubMed

    Brannon, A Rose; Vakiani, Efsevia; Sylvester, Brooke E; Scott, Sasinya N; McDermott, Gregory; Shah, Ronak H; Kania, Krishan; Viale, Agnes; Oschwald, Dayna M; Vacic, Vladimir; Emde, Anne-Katrin; Cercek, Andrea; Yaeger, Rona; Kemeny, Nancy E; Saltz, Leonard B; Shia, Jinru; D'Angelica, Michael I; Weiser, Martin R; Solit, David B; Berger, Michael F

    2014-08-28

    Colorectal cancer is the second leading cause of cancer death in the United States, with over 50,000 deaths estimated in 2014. Molecular profiling for somatic mutations that predict absence of response to anti-EGFR therapy has become standard practice in the treatment of metastatic colorectal cancer; however, the quantity and type of tissue available for testing is frequently limited. Further, the degree to which the primary tumor is a faithful representation of metastatic disease has been questioned. As next-generation sequencing technology becomes more widely available for clinical use and additional molecularly targeted agents are considered as treatment options in colorectal cancer, it is important to characterize the extent of tumor heterogeneity between primary and metastatic tumors. We performed deep coverage, targeted next-generation sequencing of 230 key cancer-associated genes for 69 matched primary and metastatic tumors and normal tissue. Mutation profiles were 100% concordant for KRAS, NRAS, and BRAF, and were highly concordant for recurrent alterations in colorectal cancer. Additionally, whole genome sequencing of four patient trios did not reveal any additional site-specific targetable alterations. Colorectal cancer primary tumors and metastases exhibit high genomic concordance. As current clinical practices in colorectal cancer revolve around KRAS, NRAS, and BRAF mutation status, diagnostic sequencing of either primary or metastatic tissue as available is acceptable for most patients. Additionally, consistency between targeted sequencing and whole genome sequencing results suggests that targeted sequencing may be a suitable strategy for clinical diagnostic applications.
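
    Per-gene concordance between matched primary and metastatic samples, as reported in the record above, can be sketched as the share of patient pairs whose mutation calls agree. This is a minimal illustration under stated assumptions: the data layout (one dict of gene -> mutated? per sample) and the toy calls are hypothetical, and it is not the authors' analysis code.

```python
from typing import Dict, List, Tuple

Calls = Dict[str, bool]  # gene -> True if a mutation was called (assumed layout)


def concordance_by_gene(pairs: List[Tuple[Calls, Calls]],
                        genes: List[str]) -> Dict[str, float]:
    """Percentage of (primary, metastasis) pairs with identical calls per gene."""
    if not pairs:
        raise ValueError("need at least one matched pair")
    result = {}
    for gene in genes:
        # A gene missing from a sample's calls is treated as not mutated.
        agree = sum(
            1 for primary, met in pairs
            if primary.get(gene, False) == met.get(gene, False)
        )
        result[gene] = 100.0 * agree / len(pairs)
    return result


# Toy calls for two hypothetical patients; not data from the study.
toy_pairs = [
    ({"KRAS": True, "NRAS": False}, {"KRAS": True, "NRAS": False}),
    ({"KRAS": False, "NRAS": False}, {"KRAS": False, "NRAS": True}),
]
print(concordance_by_gene(toy_pairs, ["KRAS", "NRAS"]))
```

    Treating a missing gene as "not mutated" is a design choice of this sketch; a real pipeline would more likely distinguish "wild type" from "not assessed".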

  9. Brain Tumor Segmentation Using Deep Belief Networks and Pathological Knowledge.

    PubMed

    Zhan, Tianming; Chen, Yi; Hong, Xunning; Lu, Zhenyu; Chen, Yunjie

    2017-01-01

    In this paper, we propose an automatic brain tumor segmentation method based on Deep Belief Networks (DBNs) and pathological knowledge. The proposed method is targeted against gliomas (both low and high grade) obtained in multi-sequence magnetic resonance images (MRIs). Firstly, a novel deep architecture is proposed to combine the multi-sequences intensities feature extraction with classification to get the classification probabilities of each voxel. Then, graph cut based optimization is executed on the classification probabilities to strengthen the spatial relationships of voxels. Finally, pathological knowledge of gliomas is applied to remove some false positives. Our method was validated in the Brain Tumor Segmentation Challenge 2012 and 2013 databases (BRATS 2012, 2013). The segmentation results show that our proposal provides a solution competitive with state-of-the-art methods. Copyright © Bentham Science Publishers.

  10. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is a key to understand many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach in large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  11. Chemokine Receptor Signatures in Allogeneic Stem Cell Transplantation

    DTIC Science & Technology

    2014-08-01

    versus-host disease (GHVD). We use T-cell receptor deep sequencing to characterize the repertoire of effector T-cells in allogeneic hematopoietic stem ... cell transplant (HSCT) recipients and identify the role of chemokine receptors in effector cell infiltration of target organs. In the recent funding

  12. Identification of MicroRNAs and transcript targets in Camelina sativa by deep sequencing and computational methods

    DOE PAGES

    Poudel, Saroj; Aryal, Niranjan; Lu, Chaofu; ...

    2015-03-31

    Camelina sativa is an annual oilseed crop that is under intensive development for renewable resources of biofuels and industrial oils. MicroRNAs, or miRNAs, are endogenously encoded small RNAs that play key roles in diverse plant biological processes. Here, we conducted deep sequencing on small RNA libraries prepared from camelina leaves, flower buds and two stages of developing seeds corresponding to initial and peak storage products accumulation. Computational analyses identified 207 known miRNAs belonging to 63 families, as well as 5 novel miRNAs. These miRNAs, especially members of the miRNA families, varied greatly in different tissues and developmental stages. The predicted miRNA target genes are involved in a broad range of physiological functions including lipid metabolism. This report is the first step toward elucidating roles of miRNAs in C. sativa and will provide additional tools to improve this oilseed crop for biofuels and biomaterials.

  13. Deep sequencing of the LRRK2 gene in 14,002 individuals reveals evidence of purifying selection and independent origin of the p.Arg1628Pro mutation in Europe

    PubMed Central

    Rubio, Justin P.; Topp, Simon; Warren, Liling; St Jean, Pamela L.; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R.; Ehm, Margaret G.; Chissoe, Stephanie L.; Whittaker, John C.; Nelson, Matthew R.; Mooser, Vincent E.

    2012-01-01

    Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single nucleotide variants (SNVs), 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Amongst Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. PMID:22415848

  14. Deep sequencing of the LRRK2 gene in 14,002 individuals reveals evidence of purifying selection and independent origin of the p.Arg1628Pro mutation in Europe.

    PubMed

    Rubio, Justin P; Topp, Simon; Warren, Liling; St Jean, Pamela L; Wegmann, Daniel; Kessner, Darren; Novembre, John; Shen, Judong; Fraser, Dana; Aponte, Jennifer; Nangle, Keith; Cardon, Lon R; Ehm, Margaret G; Chissoe, Stephanie L; Whittaker, John C; Nelson, Matthew R; Mooser, Vincent E

    2012-07-01

    Genetic variation in LRRK2 predisposes to Parkinson disease (PD), which underpins its development as a therapeutic target. Here, we aimed to identify novel genotype-phenotype associations that might support developing LRRK2 therapies for other conditions. We sequenced the 51 exons of LRRK2 in cases comprising 12 common diseases (n = 9,582), and in 4,420 population controls. We identified 739 single-nucleotide variants, 62% of which were observed in only one person, including 316 novel exonic variants. We found evidence of purifying selection for the LRRK2 gene and a trend suggesting that this is more pronounced in the central (ROC-COR-kinase) core protein domains of LRRK2 than the flanking domains. Population genetic analyses revealed that LRRK2 is not especially polymorphic or differentiated in comparison to 201 other drug target genes. Among Europeans, we identified 17 carriers (0.13%) of pathogenic LRRK2 mutations that were not significantly enriched within any disease or in those reporting a family history of PD. Analysis of pathogenic mutations within Europe reveals that the p.Arg1628Pro (c4883G>C) mutation arose independently in Europe and Asia. Taken together, these findings demonstrate how targeted deep sequencing can help to reveal fundamental characteristics of clinically important loci. © 2012 Wiley Periodicals, Inc.

  15. Grammatical markers switch roles and elicit different electrophysiological responses under shallow and deep semantic requirements.

    PubMed

    Soshi, Takahiro; Nakajima, Heizo; Hagiwara, Hiroko

    2016-10-01

    Static knowledge about the grammar of a natural language is represented in the cortico-subcortical system. However, the differences in dynamic verbal processing under different cognitive conditions are unclear. To clarify this, we conducted an electrophysiological experiment involving a semantic priming paradigm in which semantically congruent or incongruent word sequences (prime nouns-target verbs) were randomly presented. We examined the event-related brain potentials that occurred in response to congruent and incongruent target words that were preceded by primes with or without grammatical case markers. The two participant groups performed either the shallow (lexical judgment) or deep (direct semantic judgment) semantic tasks. We hypothesized that, irrespective of the case markers, the congruent targets would reduce centro-posterior N400 activities under the deep semantic condition, which induces selective attention to the semantic relatedness of content words. However, the same congruent targets with correct case markers would reduce lateralized negativity under the shallow semantic condition because grammatical case markers are related to automatic structural integration under semantically unattended conditions. We observed that congruent targets (e.g., 'open') that were preceded by primes with congruent case markers (e.g., 'shutter-object case') reduced lateralized negativity under the shallow semantic condition. In contrast, congruent targets, irrespective of case markers, consistently yielded N400 reductions under the deep semantic condition. To summarize, human neural verbal processing differed in response to the same grammatical markers in the same verbal expressions under semantically attended or unattended conditions.

  16. Identification of MicroRNAs in Helicoverpa armigera and Spodoptera litura Based on Deep Sequencing and Homology Analysis

    PubMed Central

    Ge, Xie; Zhang, Yong; Jiang, Jianhao; Zhong, Yi; Yang, Xiaonan; Li, Zhiqian; Huang, Yongping; Tan, Anjiang

    2013-01-01

    The current identification of microRNAs (miRNAs) in insects is largely dependent on genome sequences. However, the lack of available genome sequences inhibits the identification of miRNAs in various insect species. In this study, we used a miRNA database of the silkworm Bombyx mori as a reference to identify miRNAs in Helicoverpa armigera and Spodoptera litura using deep sequencing and homology analysis. Because all three species belong to the Lepidoptera, the experiment produced reliable results. Our study identified 97 and 91 conserved miRNAs in H. armigera and S. litura, respectively. Using the genome of B. mori and BAC sequences of H. armigera as references, 1 novel miRNA and 8 novel miRNA candidates were identified in H. armigera, and 4 novel miRNA candidates were identified in S. litura. An evolutionary analysis revealed that most of the identified miRNAs were insect-specific, and more than 20 miRNAs were Lepidoptera-specific. The investigation of the expression patterns of miR-2a, miR-34, miR-2796-3p and miR-11 revealed their potential roles in insect development. miRNA target prediction revealed that conserved miRNA target sites exist in various genes in the 3 species. Conserved miRNA target sites for the Hsp90 gene among the 3 species were validated in the mammalian 293T cell line using a dual-luciferase reporter assay. Our study provides a new approach with which to identify miRNAs in insects lacking genome information and contributes to the functional analysis of insect miRNAs. PMID:23289012

  17. Rapid molecular diagnostics of severe primary immunodeficiency determined by using targeted next-generation sequencing.

    PubMed

    Yu, Hui; Zhang, Victor Wei; Stray-Pedersen, Asbjørg; Hanson, Imelda Celine; Forbes, Lisa R; de la Morena, M Teresa; Chinn, Ivan K; Gorman, Elizabeth; Mendelsohn, Nancy J; Pozos, Tamara; Wiszniewski, Wojciech; Nicholas, Sarah K; Yates, Anne B; Moore, Lindsey E; Berge, Knut Erik; Sorte, Hanne; Bayer, Diana K; ALZahrani, Daifulah; Geha, Raif S; Feng, Yanming; Wang, Guoli; Orange, Jordan S; Lupski, James R; Wang, Jing; Wong, Lee-Jun

    2016-10-01

    Primary immunodeficiency diseases (PIDDs) are inherited disorders of the immune system. The most severe form, severe combined immunodeficiency (SCID), presents with profound deficiencies of T cells, B cells, or both at birth. If not treated promptly, affected patients usually do not live beyond infancy because of infections. Genetic heterogeneity of SCID frequently delays the diagnosis; a specific diagnosis is crucial for life-saving treatment and optimal management. We developed a next-generation sequencing (NGS)-based multigene-targeted panel for SCID and other severe PIDDs requiring rapid therapeutic actions in a clinical laboratory setting. The target gene capture/NGS assay provides an average read depth of approximately 1000×. The deep coverage facilitates simultaneous detection of single nucleotide variants and exonic copy number variants in one comprehensive assessment. Exons with insufficient coverage (<20× read depth) or high sequence homology (pseudogenes) are complemented by amplicon-based sequencing with specific primers to ensure 100% coverage of all targeted regions. Analysis of 20 patient samples with low T-cell receptor excision circle numbers on newborn screening or a positive family history or clinical suspicion of SCID or other severe PIDD identified deleterious mutations in 14 of them. Identified pathogenic variants included both single nucleotide variants and exonic copy number variants, such as hemizygous nonsense, frameshift, and missense changes in IL2RG; compound heterozygous changes in ATM, RAG1, and CIITA; homozygous changes in DCLRE1C and IL7R; and a heterozygous nonsense mutation in CHD7. High-throughput deep sequencing analysis with complete clinical validation greatly increases the diagnostic yield of severe primary immunodeficiency. Establishing a molecular diagnosis enables early immune reconstitution through prompt therapeutic intervention and guides management for improved long-term quality of life. 
Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  18. Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics.

    PubMed

    Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L

    2010-07-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces beta-caryophyllene and alpha-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells.

  19. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    PubMed Central

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  20. On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution

    NASA Astrophysics Data System (ADS)

    Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein

    2018-07-01

    One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.

  1. On Statistical Modeling of Sequencing Noise in High Depth Data to Assess Tumor Evolution

    NASA Astrophysics Data System (ADS)

    Rabadan, Raul; Bhanot, Gyan; Marsilio, Sonia; Chiorazzi, Nicholas; Pasqualucci, Laura; Khiabanian, Hossein

    2017-12-01

    One cause of cancer mortality is tumor evolution to therapy-resistant disease. First line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Such mutations may be identified using targeted sequencing assays by analysis of noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may have the potential to identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting to determine appropriate therapy and prepare for potential recurrence pretreatment.
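
    The idea of calling a mutation only when its read count rises above a negative binomial noise background can be illustrated with a small sketch. This is not the authors' model: the abstract describes aggregating several negative binomial distributions, whereas the code below assumes a single one whose parameters `r` and `p` are already known. The function names and the `alpha` threshold are hypothetical.

```python
import math

def nb_pmf(k, r, p):
    """P(X = k) for a negative binomial with size r > 0 and probability 0 < p < 1.

    X counts error reads; the mean is r * (1 - p) / p.
    """
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def nb_tail(k, r, p):
    """P(X >= k): chance that background noise alone yields k or more variant reads."""
    return max(0.0, 1.0 - sum(nb_pmf(i, r, p) for i in range(k)))

def above_noise(alt_reads, r, p, alpha=1e-3):
    """Assumed decision rule: flag a variant if the noise tail probability is below alpha."""
    return nb_tail(alt_reads, r, p) < alpha
```

    In practice `r` and `p` would have to be estimated from control data at each position, and `alpha` corrected for the number of sites tested.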

  2. HomozygosityMapper2012--bridging the gap between homozygosity mapping and deep sequencing.

    PubMed

    Seelow, Dominik; Schuelke, Markus

    2012-07-01

    Homozygosity mapping is a common method to map recessive traits in consanguineous families. To facilitate these analyses, we have developed HomozygosityMapper, a web-based approach to homozygosity mapping. HomozygosityMapper allows researchers to directly upload the genotype files produced by the major genotyping platforms as well as deep sequencing data. It detects stretches of homozygosity shared by the affected individuals and displays them graphically. Users can interactively inspect the underlying genotypes, manually refine these regions and eventually submit them to our candidate gene search engine GeneDistiller to identify the most promising candidate genes. Here, we present the new version of HomozygosityMapper. The most striking new feature is the support of Next Generation Sequencing *.vcf files as input. Upon users' requests, we have implemented the analysis of common experimental rodents as well as of important farm animals. Furthermore, we have extended the options for single families and loss of heterozygosity studies. Another new feature is the export of *.bed files for targeted enrichment of the potential disease regions for deep sequencing strategies. HomozygosityMapper also generates files for conventional linkage analyses which are already restricted to the possible disease regions, hence superseding CPU-intensive genome-wide analyses. HomozygosityMapper is freely available at http://www.homozygositymapper.org/.

  3. Monitoring therapy responses at the leukemic subclone level by ultra-deep amplicon resequencing in acute myeloid leukemia.

    PubMed

    Ojamies, P N; Kontro, M; Edgren, H; Ellonen, P; Lagström, S; Almusa, H; Miettinen, T; Eldfors, S; Tamborero, D; Wennerberg, K; Heckman, C; Porkka, K; Wolf, M; Kallioniemi, O

    2017-05-01

    In our individualized systems medicine program, personalized treatment options are identified and administered to chemorefractory acute myeloid leukemia (AML) patients based on exome sequencing and ex vivo drug sensitivity and resistance testing data. Here, we analyzed how clonal heterogeneity affects the responses of 13 AML patients to chemotherapy or targeted treatments using ultra-deep (average 68 000 × coverage) amplicon resequencing. Using amplicon resequencing, we identified 16 variants from 4 patients (frequency 0.54-2%) that were not detected previously by exome sequencing. A correlation-based method was developed to detect mutation-specific responses in serial samples across multiple time points. Significant subclone-specific responses were observed for both chemotherapy and targeted therapy. We detected subclonal responses in patients where clinical European LeukemiaNet (ELN) criteria showed no response. Subclonal responses also helped to identify putative mechanisms underlying drug sensitivities, such as sensitivity to azacitidine in DNMT3A mutated cell clones and resistance to cytarabine in a subclone with loss of NF1 gene. In summary, ultra-deep amplicon resequencing method enables sensitive quantification of subclonal variants and their responses to therapies. This approach provides new opportunities for designing combinatorial therapies blocking multiple subclones as well as for real-time assessment of such treatments.

  4. Sensitive Deep-Sequencing-Based HIV-1 Genotyping Assay To Simultaneously Determine Susceptibility to Protease, Reverse Transcriptase, Integrase, and Maturation Inhibitors, as Well as HIV-1 Coreceptor Tropism

    PubMed Central

    Gibson, Richard M.; Meyer, Ashley M.; Winner, Dane; Archer, John; Feyertag, Felix; Ruiz-Mateos, Ezequiel; Leal, Manuel; Robertson, David L.; Schmotzer, Christine L.

    2014-01-01

    With 29 individual antiretroviral drugs available from six classes that are approved for the treatment of HIV-1 infection, a combination of different phenotypic and genotypic tests is currently needed to monitor HIV-infected individuals. In this study, we developed a novel HIV-1 genotypic assay based on deep sequencing (DeepGen HIV) to simultaneously assess HIV-1 susceptibilities to all drugs targeting the three viral enzymes and to predict HIV-1 coreceptor tropism. Patient-derived gag-p2/NCp7/p1/p6/pol-PR/RT/IN- and env-C2V3 PCR products were sequenced using the Ion Torrent Personal Genome Machine. Reads spanning the 3′ end of the Gag, protease (PR), reverse transcriptase (RT), integrase (IN), and V3 regions were extracted, truncated, translated, and assembled for genotype and HIV-1 coreceptor tropism determination. DeepGen HIV consistently detected both minority drug-resistant viruses and non-R5 HIV-1 variants from clinical specimens with viral loads of ≥1,000 copies/ml and from B and non-B subtypes. Additional mutations associated with resistance to PR, RT, and IN inhibitors, previously undetected by standard (Sanger) population sequencing, were reliably identified at frequencies as low as 1%. DeepGen HIV results correlated with phenotypic (original Trofile, 92%; enhanced-sensitivity Trofile assay [ESTA], 80%; TROCAI, 81%; and VeriTrop, 80%) and genotypic (population sequencing/Geno2Pheno with a 10% false-positive rate [FPR], 84%) HIV-1 tropism test results. DeepGen HIV (83%) and Trofile (85%) showed similar concordances with the clinical response following an 8-day course of maraviroc monotherapy (MCT). In summary, this novel all-inclusive HIV-1 genotypic and coreceptor tropism assay, based on deep sequencing of the PR, RT, IN, and V3 regions, permits simultaneous multiplex detection of low-level drug-resistant and/or non-R5 viruses in up to 96 clinical samples. 
This comprehensive test, the first of its class, will be instrumental in the development of new antiretroviral drugs and, more importantly, will aid in the treatment and management of HIV-infected individuals. PMID:24468782

  5. Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

    PubMed Central

    2014-01-01

    Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920

  6. Prediction of Bispectral Index during Target-controlled Infusion of Propofol and Remifentanil: A Deep Learning Approach.

    PubMed

    Lee, Hyung-Chul; Ryu, Ho-Geol; Chung, Eun-Jin; Jung, Chul-Woo

    2018-03-01

    The discrepancy between predicted effect-site concentration and measured bispectral index is problematic during intravenous anesthesia with target-controlled infusion of propofol and remifentanil. We hypothesized that bispectral index during total intravenous anesthesia would be more accurately predicted by a deep learning approach. Long short-term memory and the feed-forward neural network were sequenced to simulate the pharmacokinetic and pharmacodynamic parts of an empirical model, respectively, to predict intraoperative bispectral index during combined use of propofol and remifentanil. Inputs of long short-term memory were infusion histories of propofol and remifentanil, which were retrieved from target-controlled infusion pumps for 1,800 s at 10-s intervals. Inputs of the feed-forward network were the outputs of long short-term memory and demographic data such as age, sex, weight, and height. The final output of the feed-forward network was the bispectral index. The performance of bispectral index prediction was compared between the deep learning model and a previously reported response surface model. The model hyperparameters comprised 8 memory cells in the long short-term memory layer and 16 nodes in the hidden layer of the feed-forward network. The model training and testing were performed with separate data sets of 131 and 100 cases. The concordance correlation coefficient (95% CI) was 0.561 (0.560 to 0.562) in the deep learning model, which was significantly larger than that in the response surface model (0.265 [0.263 to 0.266], P < 0.001). The deep learning model predicted bispectral index during target-controlled infusion of propofol and remifentanil more accurately than the traditional model. The deep learning approach in anesthetic pharmacology seems promising because of its excellent performance and extensibility.

  7. Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

    PubMed

    Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

    2016-04-01

    Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  8. Genomic analysis suggests that mRNA destabilization by the microprocessor is specialized for the auto-regulation of Dgcr8.

    PubMed

    Shenoy, Archana; Blelloch, Robert

    2009-09-11

    The Microprocessor, containing the RNA binding protein Dgcr8 and RNase III enzyme Drosha, is responsible for processing primary microRNAs to precursor microRNAs. The Microprocessor regulates its own levels by cleaving hairpins in the 5'UTR and coding region of the Dgcr8 mRNA, thereby destabilizing the mature transcript. To determine whether the Microprocessor has a broader role in directly regulating other coding mRNA levels, we integrated results from expression profiling and ultra high-throughput deep sequencing of small RNAs. Expression analysis of mRNAs in wild-type, Dgcr8 knockout, and Dicer knockout mouse embryonic stem (ES) cells uncovered mRNAs that were specifically upregulated in the Dgcr8 null background. A number of these transcripts had evolutionarily conserved predicted hairpin targets for the Microprocessor. However, analysis of deep sequencing data of 18 to 200nt small RNAs in mouse ES, HeLa, and HepG2 indicates that exonic sequence reads that map in a pattern consistent with Microprocessor activity are unique to Dgcr8. We conclude that the Microprocessor's role in directly destabilizing coding mRNAs is likely specifically targeted to Dgcr8 itself, suggesting a specialized cellular mechanism for gene auto-regulation.

  9. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    PubMed Central

    2011-01-01

    Background: Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods: We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results: Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal samples and 9 lung cancer cell lines. Conclusions: Deep transcriptional sequencing and analysis with targeted and spliced alignment methods can effectively identify TIC events across the genome in individual tissues. Prostate and reference samples exhibit a wide range of TIC events, involving more genes than estimated previously using ESTs. Tissue specificity of TIC events is correlated with expression patterns of the upstream gene. Some TIC events, such as MSMB-NCOA4, may play functional roles in cancer. PMID:21261984

  10. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers.

    PubMed

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  11. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    PubMed Central

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R.; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  12. Characterization of skin ulceration syndrome associated microRNAs in sea cucumber Apostichopus japonicus by deep sequencing.

    PubMed

    Li, Chenghua; Feng, Weida; Qiu, Lihua; Xia, Changge; Su, Xiurong; Jin, Chunhua; Zhou, Tingting; Zeng, Yuan; Li, Taiwu

    2012-08-01

    MicroRNAs (miRNAs) constitute a family of small RNA species that have been demonstrated to be key effectors in mediating host-pathogen interactions. In this study, two haemocyte miRNA libraries, from healthy (L1) and skin ulceration syndrome-affected (L2) Apostichopus japonicus, were constructed and deep sequenced on the Illumina HiSeq 2000. The high-throughput Solexa sequencing yielded 9,579,038 and 7,742,558 clean reads from L1 and L2, respectively. Sequence analysis revealed 40 conserved miRNAs in both libraries, of which let-7 and mir-125 were speculated to be clustered together and expressed accordingly. Eighty-six miRNA candidates were also identified by reference genome search and stem-loop structure prediction. Importantly, mir-31 and mir-2008 displayed significant differential expression between the two libraries according to the FPKM model, and might be considered promising targets for elucidating the intrinsic mechanism of skin ulceration syndrome outbreaks in the species. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

    PubMed

    Wong, Lai-Ping; Lai, Jason Kuan-Han; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Cheng, Anthony Youzhi; Pillai, Nisha Esakimuthu; Liu, Xuanyao; Xu, Wenting; Chen, Peng; Foo, Jia-Nee; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Soong, Richie; Wenk, Markus Rene; Lim, Wei-Yen; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2014-05-01

    South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.

  14. Poly(A)-tag deep sequencing data processing to extract poly(A) sites.

    PubMed

    Wu, Xiaohui; Ji, Guoli; Li, Qingshun Quinn

    2015-01-01

    Polyadenylation [poly(A)] is an essential posttranscriptional processing step in the maturation of eukaryotic mRNA. The advent of next-generation sequencing (NGS) technology has offered feasible means to generate large-scale data and new opportunities for intensive study of polyadenylation, particularly deep sequencing of the transcriptome targeting the junction of 3'-UTR and the poly(A) tail of the transcript. To take advantage of this unprecedented amount of data, we present an automated workflow to identify polyadenylation sites by integrating NGS data cleaning, processing, mapping, normalizing, and clustering. In this pipeline, a series of Perl scripts are seamlessly integrated to iteratively map the single- or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same genome coordinate are grouped into one cleavage site, and the internal priming artifacts removed. Then the ambiguous region is introduced to parse the genome annotation for cleavage site clustering. Finally, cleavage sites within a close range of 24 nucleotides and from different samples can be clustered into poly(A) clusters. This procedure could be used to identify thousands of reliable poly(A) clusters from millions of NGS sequences in different tissues or treatments.
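
    The last two steps described above (grouping poly(A) tags at the same coordinate into cleavage sites, then merging nearby sites into poly(A) clusters) can be sketched in a few lines. The published workflow is a set of Perl scripts; the Python below is only an illustration, and it assumes that "within a close range of 24 nucleotides" means the gap to the previous site on the same chromosome and strand. The function names and the dictionary layout are hypothetical.

```python
from collections import Counter

def group_tags(tag_positions):
    """Collapse PATs sharing (chrom, strand, pos) into cleavage sites with tag counts."""
    return Counter(tag_positions)

def cluster_sites(sites, max_gap=24):
    """Merge cleavage sites into clusters along each chromosome and strand.

    A site joins the current cluster when it lies on the same chromosome and
    strand and is at most max_gap nucleotides past the cluster's last site.
    """
    clusters = []
    for chrom, strand, pos in sorted(sites):
        last = clusters[-1] if clusters else None
        if (last is not None and last["chrom"] == chrom
                and last["strand"] == strand and pos - last["end"] <= max_gap):
            last["end"] = pos
            last["tags"] += sites[(chrom, strand, pos)]
        else:
            clusters.append({"chrom": chrom, "strand": strand,
                             "start": pos, "end": pos,
                             "tags": sites[(chrom, strand, pos)]})
    return clusters
```

    Pooling tags from several samples before calling `group_tags` would give clusters shared across samples; removal of internal priming artifacts is not modeled here.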

  15. Acute multi-sgRNA knockdown of KEOPS complex genes reproduces the microcephaly phenotype of the stable knockout zebrafish model.

    PubMed

    Jobst-Schwan, Tilman; Schmidt, Johanna Magdalena; Schneider, Ronen; Hoogstraten, Charlotte A; Ullmann, Jeremy F P; Schapiro, David; Majmundar, Amar J; Kolb, Amy; Eddy, Kaitlyn; Shril, Shirlee; Braun, Daniela A; Poduri, Annapurna; Hildebrandt, Friedhelm

    2018-01-01

    Until recently, morpholino oligonucleotides have been widely employed in zebrafish as an acute and efficient loss-of-function assay. However, off-target effects and reproducibility issues when compared to stable knockout lines have compromised their further use. Here we employed an acute CRISPR/Cas approach using multiple single guide RNAs targeting simultaneously different positions in two exemplar genes (osgep or tprkb) to increase the likelihood of generating mutations on both alleles in the injected F0 generation and to achieve a similar effect as morpholinos but with the reproducibility of stable lines. This multi single guide RNA approach resulted in median likelihoods for at least one mutation on each allele of >99% and sgRNA specific insertion/deletion profiles as revealed by deep-sequencing. Immunoblot showed a significant reduction for Osgep and Tprkb proteins. For both genes, the acute multi-sgRNA knockout recapitulated the microcephaly phenotype and reduction in survival that we observed previously in stable knockout lines, though milder in the acute multi-sgRNA knockout. Finally, we quantify the degree of mutagenesis by deep sequencing, and provide a mathematical model to quantitate the chance for a biallelic loss-of-function mutation. Our findings can be generalized to acute and stable CRISPR/Cas targeting for any zebrafish gene of interest.
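
    The abstract does not give the authors' mathematical model, but the intuition behind using several guides at once can be shown with a simple calculation. The sketch below assumes that each sgRNA mutates a given allele independently with a known efficiency and that the two alleles are hit independently; both are simplifying assumptions, and the function names are hypothetical.

```python
import math

def p_allele_mutated(efficiencies):
    """Probability that at least one of several sgRNAs mutates a single allele."""
    return 1.0 - math.prod(1.0 - e for e in efficiencies)

def p_biallelic(efficiencies):
    """Probability that both alleles carry at least one mutation (independent alleles)."""
    return p_allele_mutated(efficiencies) ** 2
```

    For example, two guides at 50% efficiency each give 0.75 per allele and about 0.56 for both alleles, while four guides at 90% exceed 0.99. Not every mutation causes loss of function (in-frame indels may not), so this is an upper bound on the biallelic knockout rate under these assumptions.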

  16. Genetic epidemiology of pharmacogenetic variants in South East Asian Malays using whole-genome sequences.

    PubMed

    Sivadas, A; Salleh, M Z; Teh, L K; Scaria, V

    2017-10-01

    Expanding the scope of pharmacogenomic research by including multiple global populations is integral to building robust evidence for its clinical translation. Deep whole-genome sequencing of diverse ethnic populations provides a unique opportunity to study rare and common pharmacogenomic markers that often vary in frequency across populations. In this study, we aim to build a diverse map of pharmacogenetic variants in the South East Asian (SEA) Malay population using deep whole-genome sequences of 100 healthy SEA Malay individuals. We investigated the allelic diversity of potentially deleterious pharmacogenomic variants in the SEA Malay population. Our analysis revealed 227 common and 466 rare potentially functional single nucleotide variants (SNVs) in 437 pharmacogenomic genes involved in drug metabolism, transport and target genes, including 74 novel variants. This study has created one of the most comprehensive maps of pharmacogenetic markers in any population from whole genomes and will hugely benefit pharmacogenomic investigations and drug dosage recommendations in SEA Malays.

  17. Tracking the origins and drivers of subclonal metastatic expansion in prostate cancer

    DOE PAGES

    Hong, Matthew K. H.; Macintyre, Geoff; Wedge, David C.; ...

    2015-04-01

    Tumour heterogeneity in primary prostate cancer is a well-established phenomenon. However, how the subclonal diversity of tumours changes during metastasis and progression to lethality is poorly understood. Here we reveal the precise direction of metastatic spread across four lethal prostate cancer patients using whole-genome and ultra-deep targeted sequencing of longitudinally collected primary and metastatic tumours. We find one case of metastatic spread to the surgical bed causing local recurrence, and another case of cross-metastatic site seeding combining with dynamic remoulding of subclonal mixtures in response to therapy. By ultra-deep sequencing end-stage blood, we detect both metastatic and primary tumour clones, even years after removal of the prostate. As a result, analysis of mutations associated with metastasis reveals an enrichment of TP53 mutations, and additional sequencing of metastases from 19 patients demonstrates that acquisition of TP53 mutations is linked with the expansion of subclones with metastatic potential which we can detect in the blood.

  18. Tracking the origins and drivers of subclonal metastatic expansion in prostate cancer.

    PubMed

    Hong, Matthew K H; Macintyre, Geoff; Wedge, David C; Van Loo, Peter; Patel, Keval; Lunke, Sebastian; Alexandrov, Ludmil B; Sloggett, Clare; Cmero, Marek; Marass, Francesco; Tsui, Dana; Mangiola, Stefano; Lonie, Andrew; Naeem, Haroon; Sapre, Nikhil; Phal, Pramit M; Kurganovs, Natalie; Chin, Xiaowen; Kerger, Michael; Warren, Anne Y; Neal, David; Gnanapragasam, Vincent; Rosenfeld, Nitzan; Pedersen, John S; Ryan, Andrew; Haviv, Izhak; Costello, Anthony J; Corcoran, Niall M; Hovens, Christopher M

    2015-04-01

    Tumour heterogeneity in primary prostate cancer is a well-established phenomenon. However, how the subclonal diversity of tumours changes during metastasis and progression to lethality is poorly understood. Here we reveal the precise direction of metastatic spread across four lethal prostate cancer patients using whole-genome and ultra-deep targeted sequencing of longitudinally collected primary and metastatic tumours. We find one case of metastatic spread to the surgical bed causing local recurrence, and another case of cross-metastatic site seeding combining with dynamic remoulding of subclonal mixtures in response to therapy. By ultra-deep sequencing end-stage blood, we detect both metastatic and primary tumour clones, even years after removal of the prostate. Analysis of mutations associated with metastasis reveals an enrichment of TP53 mutations, and additional sequencing of metastases from 19 patients demonstrates that acquisition of TP53 mutations is linked with the expansion of subclones with metastatic potential which we can detect in the blood.

  19. High-speed railway real-time localization auxiliary method based on deep neural network

    NASA Astrophysics Data System (ADS)

    Chen, Dongjie; Zhang, Wensheng; Yang, Yang

    2017-11-01

    The intelligent monitoring and management system for high-speed railways comprises schedule integration, geographic information, location services, and data mining technology for integrating temporal and spatial data. Auxiliary localization is a significant submodule of this system. In practice, the usual approach is to capture image sequences of the components with a high-definition camera and then apply digital image processing, target detection, tracking, and even behavior analysis. In this paper, we present an end-to-end character recognition method, based on a deep CNN called YOLO-toc, for reading pillar plate numbers on high-speed railways. Unlike other deep CNNs, YOLO-toc is an end-to-end multi-target detection framework; furthermore, it exhibits state-of-the-art real-time detection performance, achieving nearly 50 fps on a GPU (GTX960). Finally, we realize a real-time, high-accuracy pillar plate number recognition system and integrate natural scene OCR into a dedicated YOLO-toc classification model.

  20. Rapid Creation and Quantitative Monitoring of High Coverage shRNA Libraries

    PubMed Central

    Bassik, Michael C.; Lebbink, Robert Jan; Churchman, L. Stirling; Ingolia, Nicholas T.; Patena, Weronika; LeProust, Emily M.; Schuldiner, Maya; Weissman, Jonathan S.; McManus, Michael T.

    2009-01-01

    Short hairpin RNA (shRNA) libraries are limited by the low efficacy of many shRNAs, giving false negatives, and off-target effects, giving false positives. Here we present a strategy for rapidly creating expanded shRNA pools (∼30 shRNAs/gene) that are analyzed by deep-sequencing (EXPAND). This approach enables identification of multiple effective target-specific shRNAs from a complex pool, allowing a rigorous statistical evaluation of whether a gene is a true hit. PMID:19448642

  1. DeepBase: annotation and discovery of microRNAs and other noncoding RNAs from deep-sequencing data.

    PubMed

    Yang, Jian-Hua; Qu, Liang-Hu

    2012-01-01

    Recent advances in high-throughput deep-sequencing technology have produced large numbers of short and long RNA sequences and enabled the detection and profiling of known and novel microRNAs (miRNAs) and other noncoding RNAs (ncRNAs) at unprecedented sensitivity and depth. In this chapter, we describe the use of deepBase, a database that we have developed to integrate all public deep-sequencing data and to facilitate the comprehensive annotation and discovery of miRNAs and other ncRNAs from these data. deepBase provides an integrative, interactive, and versatile web graphical interface to evaluate miRBase-annotated miRNA genes and other known ncRNAs, explores the expression patterns of miRNAs and other ncRNAs, and discovers novel miRNAs and other ncRNAs from deep-sequencing data. deepBase also provides a deepView genome browser to comparatively analyze these data at multiple levels. deepBase is available at http://deepbase.sysu.edu.cn/.

  2. Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.

    PubMed

    Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B

    2017-08-01

    To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.

  3. Making sense of deep sequencing

    PubMed Central

    Goldman, D.; Domschke, K.

    2016-01-01

    This review, the first of an occasional series, tries to make sense of the concepts and uses of deep sequencing of polynucleic acids (DNA and RNA). Deep sequencing, synonymous with next-generation sequencing, high-throughput sequencing and massively parallel sequencing, includes whole genome sequencing but is more often and diversely applied to specific parts of the genome captured in different ways, for example the highly expressed portion of the genome known as the exome and portions of the genome that are epigenetically marked either by DNA methylation, the binding of proteins including histones, or that are in different configurations and thus more or less accessible to enzymes that cleave DNA. Deep sequencing of RNA (RNASeq) reverse-transcribed to complementary DNA is invaluable for measuring RNA expression and detecting changes in RNA structure. Important concepts in deep sequencing include the length and depth of sequence reads, mapping and assembly of reads, sequencing error, haplotypes, and the propensity of deep sequencing, as with other types of ‘big data’, to generate large numbers of errors, requiring monitoring for methodologic biases and strategies for replication and validation. Deep sequencing yields a unique genetic fingerprint that can be used to identify a person, and a trove of predictors of genetic medical diseases. Deep sequencing to identify epigenetic events including changes in DNA methylation and RNA expression can reveal the history and impact of environmental exposures. Because of the power of sequencing to identify and deliver biomedically significant information about a person and their blood relatives, it creates ethical dilemmas and practical challenges in research and clinical care, for example the decision and procedures to report incidental findings that will increasingly and frequently be discovered. PMID:24925306

  4. A multiple-alignment based primer design algorithm for genetically highly variable DNA targets

    PubMed Central

    2013-01-01

    Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
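
Two of the constraints listed in this abstract, degenerate sites and melting-temperature matching, can be illustrated with a short sketch. This is not the PrimerDesign program. The helper names are hypothetical, and the Wallace rule used for Tm (2 °C per A/T, 4 °C per G/C) is a rough approximation for short oligonucleotides chosen here only for simplicity; the abstract does not say how the tool estimates Tm.

```python
from itertools import product
from typing import Dict, List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC: Dict[str, str] = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> List[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_spread(primer: str) -> int:
    """Difference between highest and lowest Tm over all expansions."""
    tms = [wallace_tm(s) for s in expand_degenerate(primer)]
    return max(tms) - min(tms)


if __name__ == "__main__":
    print(expand_degenerate("ARY"))
    print(tm_spread("ACGTRACGTY"))
```

The Tm spread across expansions is one simple way to see the trade-off: each added degenerate site widens population coverage but can also widen the melting-temperature range of the primer pool.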

  5. Oasis 2: improved online analysis of small RNA-seq data.

    PubMed

    Rahman, Raza-Ur; Gautam, Abhivyakti; Bethune, Jörn; Sattar, Abdul; Fiosins, Maksims; Magruder, Daniel Sumner; Capece, Vincenzo; Shomroni, Orr; Bonn, Stefan

    2018-02-14

    Small RNA molecules play important roles in many biological processes and their dysregulation or dysfunction can cause disease. The current method of choice for genome-wide sRNA expression profiling is deep sequencing. Here we present Oasis 2, which is a new main release of the Oasis web application for the detection, differential expression, and classification of small RNAs in deep sequencing data. Compared to its predecessor Oasis, Oasis 2 features a novel and speed-optimized sRNA detection module that supports the identification of small RNAs in any organism with higher accuracy. Next to the improved detection of small RNAs in a target organism, the software now also recognizes potential cross-species miRNAs and viral and bacterial sRNAs in infected samples. In addition, novel miRNAs can now be queried and visualized interactively, providing essential information for over 700 high-quality miRNA predictions across 14 organisms. Robust biomarker signatures can now be obtained using the novel enhanced classification module. Oasis 2 enables biologists and medical researchers to rapidly analyze and query small RNA deep sequencing data with improved precision, recall, and speed, in an interactive and user-friendly environment. Oasis 2 is implemented in Java, J2EE, mysql, Python, R, PHP and JavaScript. It is freely available at https://oasis.dzne.de.

  6. A deep learning framework for causal shape transformation.

    PubMed

    Lore, Kin Gwn; Stoecklein, Daniel; Davies, Michael; Ganapathysubramanian, Baskar; Sarkar, Soumik

    2018-02-01

    Recurrent neural networks (RNNs) and Long Short-Term Memory (LSTM) networks are the common go-to architectures for exploiting sequential information where the output depends on a sequence of inputs. However, in most problems considered, the dependencies typically lie in the latent domain, which may not be suitable for applications involving the prediction of a step-wise transformation sequence that depends on the previous states only in the visible domain with a known terminal state. We propose a hybrid architecture of convolutional neural networks (CNNs) and stacked autoencoders (SAEs) to learn a sequence of causal actions that nonlinearly transform an input visual pattern or distribution into a target visual pattern or distribution with the same support, and demonstrate its practicality in a real-world engineering problem involving the physics of fluids. We solved a high-dimensional one-to-many inverse mapping problem concerning microfluidic flow sculpting, where the use of deep learning methods as an inverse map is very seldom explored. This work serves as a fruitful use case showing applied scientists and engineers how deep learning can be beneficial as a solution for high-dimensional physical problems, potentially opening doors to impactful advances in fields such as materials science and medical biology, where multistep topological transformations are a key element. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Selective ribosome profiling as a tool to study the interaction of chaperones and targeting factors with nascent polypeptide chains and ribosomes

    PubMed Central

    Becker, Annemarie H.; Oh, Eugene; Weissman, Jonathan S.; Kramer, Günter; Bukau, Bernd

    2014-01-01

    A plethora of factors is involved in the maturation of newly synthesized proteins, including chaperones, membrane targeting factors, and enzymes. Many factors act cotranslationally through association with ribosome-nascent chain complexes (RNCs), but their target specificities and modes of action remain poorly understood. We developed selective ribosome profiling (SeRP) to identify substrate pools and points of RNC engagement of these factors. SeRP is based on sequencing mRNA fragments covered by translating ribosomes (general ribosome profiling, RP), combined with a procedure to selectively isolate RNCs whose nascent polypeptides are associated with the factor of interest. Factor–RNC interactions are stabilized by crosslinking, the resulting factor–RNC adducts are then nuclease-treated to generate monosomes, and affinity-purified. The ribosome-extracted mRNA footprints are converted to DNA libraries for deep sequencing. The protocol is specified for general RP and SeRP in bacteria. It was first applied to the chaperone trigger factor and is readily adaptable to other cotranslationally acting factors, including eukaryotic factors. Factor–RNC purification and sequencing library preparation takes 7–8 days, sequencing and data analysis can be completed in 5–6 days. PMID:24136347

  8. De novo peptide sequencing by deep learning

    PubMed Central

    Tran, Ngoc Hieu; Zhang, Xianglilan; Xin, Lei; Shan, Baozhen; Li, Ming

    2017-01-01

    De novo peptide sequencing from tandem MS data is the key technology in proteomics for the characterization of proteins, especially for new sequences, such as mAbs. In this study, we propose a deep neural network model, DeepNovo, for de novo peptide sequencing. DeepNovo architecture combines recent advances in convolutional neural networks and recurrent neural networks to learn features of tandem mass spectra, fragment ions, and sequence patterns of peptides. The networks are further integrated with local dynamic programming to solve the complex optimization task of de novo sequencing. We evaluated the method on a wide variety of species and found that DeepNovo considerably outperformed state of the art methods, achieving 7.7–22.9% higher accuracy at the amino acid level and 38.1–64.0% higher accuracy at the peptide level. We further used DeepNovo to automatically reconstruct the complete sequences of antibody light and heavy chains of mouse, achieving 97.5–100% coverage and 97.2–99.5% accuracy, without assisting databases. Moreover, DeepNovo is retrainable to adapt to any sources of data and provides a complete end-to-end training and prediction solution to the de novo sequencing problem. Not only does our study extend the deep learning revolution to a new field, but it also shows an innovative approach in solving optimization problems by using deep learning and dynamic programming. PMID:28720701

  9. Deep sequencing of hepatitis C virus hypervariable region 1 reveals no correlation between genetic heterogeneity and antiviral treatment outcome

    PubMed Central

    2014-01-01

    Background Hypervariable region 1 (HVR1) contained within envelope protein 2 (E2) gene is the most variable part of HCV genome and its translation product is a major target for the host immune response. Variability within HVR1 may facilitate evasion of the immune response and could affect treatment outcome. The aim of the study was to analyze the impact of HVR1 heterogeneity employing sensitive ultra-deep sequencing, on the outcome of PEG-IFN-α (pegylated interferon α) and ribavirin treatment. Methods HVR1 sequences were amplified from pretreatment serum samples of 25 patients infected with genotype 1b HCV (12 responders and 13 non-responders) and were subjected to pyrosequencing (GS Junior, 454/Roche). Reads were corrected for sequencing error using ShoRAH software, while population reconstruction was done using three different minimal variant frequency cut-offs of 1%, 2% and 5%. Statistical analysis was done using Mann–Whitney and Fisher’s exact tests. Results Complexity, Shannon entropy, nucleotide diversity per site, genetic distance and the number of genetic substitutions were not significantly different between responders and non-responders, when analyzing viral populations at any of the three frequencies (≥1%, ≥2% and ≥5%). When clonal sample was used to determine pyrosequencing error, 4% of reads were found to be incorrect and the most abundant variant was present at a frequency of 1.48%. Use of ShoRAH reduced the sequencing error to 1%, with the most abundant erroneous variant present at frequency of 0.5%. Conclusions While deep sequencing revealed complex genetic heterogeneity of HVR1 in chronic hepatitis C patients, there was no correlation between treatment outcome and any of the analyzed quasispecies parameters. PMID:25016390
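
One of the quasispecies parameters named in this abstract, Shannon entropy at a minimum variant frequency cut-off, can be sketched as below. This is an illustration only. The function name is hypothetical, and renormalizing the frequencies after discarding variants below the cut-off is an assumption; the abstract does not state how the study handled this.

```python
from math import log
from typing import Mapping


def shannon_entropy(variant_counts: Mapping[str, int], min_freq: float = 0.01) -> float:
    """Shannon entropy (natural log) of a viral variant population.

    Variants whose frequency falls below `min_freq` are discarded and the
    remaining frequencies are renormalized to sum to 1 (an assumption).
    """
    total = sum(variant_counts.values())
    if total == 0:
        return 0.0
    kept = [c / total for c in variant_counts.values() if c / total >= min_freq]
    if not kept:
        return 0.0
    kept_sum = sum(kept)
    return -sum((f / kept_sum) * log(f / kept_sum) for f in kept)


if __name__ == "__main__":
    # Hypothetical read counts per reconstructed HVR1 variant.
    counts = {"v1": 900, "v2": 60, "v3": 25, "v4": 15}
    for cutoff in (0.01, 0.02, 0.05):
        print(cutoff, round(shannon_entropy(counts, cutoff), 4))
```

Running the same counts at 1%, 2% and 5% shows how a stricter cut-off removes low-frequency variants and lowers the entropy estimate. That sensitivity is a likely reason for reporting all three thresholds.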

  10. Cultivation-dependent and cultivation-independent characterization of hydrocarbon-degrading bacteria in Guaymas Basin sediments.

    PubMed

    Gutierrez, Tony; Biddle, Jennifer F; Teske, Andreas; Aitken, Michael D

    2015-01-01

    Marine hydrocarbon-degrading bacteria perform a fundamental role in the biodegradation of crude oil and its petrochemical derivatives in coastal and open ocean environments. However, there is a paucity of knowledge on the diversity and function of these organisms in deep-sea sediment. Here we used stable-isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate polycyclic aromatic hydrocarbon (PAH)-degrading bacteria under aerobic conditions in sediments from Guaymas Basin with uniformly labeled [(13)C]-phenanthrene (PHE). The dominant sequences in clone libraries constructed from (13)C-enriched bacterial DNA (from PHE enrichments) were identified to belong to the genus Cycloclasticus. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Cycloclasticus to determine their abundance in sediment incubations amended with unlabeled PHE and showed substantial increases in gene abundance during the experiments. We also isolated a strain, BG-2, representing the SIP-identified Cycloclasticus sequence (99.9% 16S rRNA gene sequence identity), and used this strain to provide direct evidence of PHE degradation and mineralization. In addition, we isolated Halomonas, Thalassospira, and Lutibacterium sp. with demonstrable PHE-degrading capacity from Guaymas Basin sediment. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of PAH-degrading bacteria in the deep-sea.

  11. Cultivation-dependent and cultivation-independent characterization of hydrocarbon-degrading bacteria in Guaymas Basin sediments

    PubMed Central

    Gutierrez, Tony; Biddle, Jennifer F.; Teske, Andreas; Aitken, Michael D.

    2015-01-01

    Marine hydrocarbon-degrading bacteria perform a fundamental role in the biodegradation of crude oil and its petrochemical derivatives in coastal and open ocean environments. However, there is a paucity of knowledge on the diversity and function of these organisms in deep-sea sediment. Here we used stable-isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate polycyclic aromatic hydrocarbon (PAH)-degrading bacteria under aerobic conditions in sediments from Guaymas Basin with uniformly labeled [13C]-phenanthrene (PHE). The dominant sequences in clone libraries constructed from 13C-enriched bacterial DNA (from PHE enrichments) were identified to belong to the genus Cycloclasticus. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Cycloclasticus to determine their abundance in sediment incubations amended with unlabeled PHE and showed substantial increases in gene abundance during the experiments. We also isolated a strain, BG-2, representing the SIP-identified Cycloclasticus sequence (99.9% 16S rRNA gene sequence identity), and used this strain to provide direct evidence of PHE degradation and mineralization. In addition, we isolated Halomonas, Thalassospira, and Lutibacterium sp. with demonstrable PHE-degrading capacity from Guaymas Basin sediment. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of PAH-degrading bacteria in the deep-sea. PMID:26217326

  12. Uncovering microRNA-mediated response to SO2 stress in Arabidopsis thaliana by deep sequencing.

    PubMed

    Li, Lihong; Xue, Meizhao; Yi, Huilan

    2016-10-05

    Sulfur dioxide (SO2) is a major air pollutant and has significant impacts on plants. MicroRNAs (miRNAs) are a class of gene expression regulators that play important roles in response to environmental stresses. In this study, deep sequencing was used for genome-wide identification of miRNAs and their expression profiles in response to SO2 stress in Arabidopsis thaliana shoots. A total of 27 conserved miRNAs and 5 novel miRNAs were found to be differentially expressed under SO2 stress. qRT-PCR analysis showed mostly negative correlation between miRNA accumulation and target gene mRNA abundance, suggesting regulatory roles of these miRNAs during SO2 exposure. The target genes of SO2-responsive miRNAs encode transcription factors and proteins that regulate auxin signaling and stress response, and the miRNA-mediated suppression of these genes could improve plant resistance to SO2 stress. Promoter sequence analysis of genes encoding SO2-responsive miRNAs showed that stress-responsive and phytohormone-related cis-regulatory elements occurred frequently, providing additional evidence of the involvement of miRNAs in adaptation to SO2 stress. This study represents a comprehensive expression profiling of SO2-responsive miRNAs in Arabidopsis and broadens our perspective on the ubiquitous regulatory roles of miRNAs under stress conditions. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Cryopyrin-associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation

    PubMed Central

    Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M.; Walts, Avram D.; Hoffmann, Patrycja; Remmers, Elaine F.; Kastner, Daniel L.; Ombrello, Amanda K.

    2015-01-01

    Objective To identify the cause of disease in an adult patient presenting with recent-onset fevers, chills, urticaria, fatigue, and profound myalgia, who was negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. Methods We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient’s whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively-selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. Results We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3%–16.8% in monocytes and 15.2%–18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, buccal cells, and in the patient’s cultured fibroblasts. Conclusion These data document the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively-parallel sequencing in clinical diagnosis. PMID:25988971

  14. RISC RNA sequencing for context-specific identification of in vivo miR targets

    PubMed Central

    Matkovich, Scot J; Van Booven, Derek J; Eschenbacher, William H; Dorn, Gerald W

    2010-01-01

    Rationale MicroRNAs (miRs) are expanding our understanding of cardiac disease and have the potential to transform cardiovascular therapeutics. One miR can target hundreds of individual mRNAs, but existing methodologies are not sufficient to accurately and comprehensively identify these mRNA targets in vivo. Objective To develop methods permitting identification of in vivo miR targets in an unbiased manner, using massively parallel sequencing of mouse cardiac transcriptomes in combination with sequencing of mRNA associated with mouse cardiac RNA-induced silencing complexes (RISCs). Methods and Results We optimized techniques for expression profiling small amounts of RNA without introducing amplification bias, and applied this to anti-Argonaute 2 immunoprecipitated RISCs (RISC-Seq) from mouse hearts. By comparing RNA-sequencing results of cardiac RISC and transcriptome from the same individual hearts, we defined 1,645 mRNAs consistently targeted to mouse cardiac RISCs. We employed this approach in hearts overexpressing miRs from Myh6 promoter-driven precursors (programmed RISC-Seq) to identify 209 in vivo targets of miR-133a and 81 in vivo targets of miR-499. Consistent with the fact that miR-133a and miR-499 have widely differing ‘seed’ sequences and belong to different miR families, only 6 targets were common to miR-133a- and miR-499-programmed hearts. Conclusions RISC-sequencing is a highly sensitive method for general RISC profiling and individual miR target identification in biological context, and is applicable to any tissue and any disease state. Summary MicroRNAs (miRs) are key regulators of mRNA translation in health and disease. While bioinformatic predictions suggest that a single miR may target hundreds of mRNAs, the number of experimentally verified targets of miRs is low. To enable comprehensive, unbiased examination of miR targets, we have performed deep RNA sequencing of cardiac transcriptomes in parallel with cardiac RNA-induced silencing complex (RISC)-associated RNAs (the RISCome), called RISC sequencing. We developed methods that did not require cross-linking of RNAs to RISCs or amplification of mRNA prior to sequencing, making it possible to rapidly perform RISC sequencing from intact tissue while avoiding amplification bias. Comparison of RISCome with transcriptome expression defined the degree of RISC enrichment for each mRNA. The majority of the mRNAs enriched in wild-type cardiac RISComes compared to transcriptomes were bioinformatically predicted to be targets of at least 1 of 139 cardiac-expressed miRs. Programming cardiomyocyte RISCs via transgenic overexpression in adult hearts of miR-133a or miR-499, two miRs that contain entirely different ‘seed’ sequences, elicited differing profiles of RISC-targeted mRNAs. Thus, RISC sequencing represents a highly sensitive method for general RISC profiling and individual miR target identification in biological context. PMID:21030712
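
The comparison of RISCome with transcriptome expression described above can be sketched as a per-mRNA enrichment score. The abstract does not give the formula, so everything here is an assumption for illustration: the function name, the normalization of each library to its total read count, the pseudocount, and the use of a log2 ratio.

```python
from math import log2
from typing import Dict, Mapping


def risc_enrichment(
    risc_counts: Mapping[str, float],
    transcriptome_counts: Mapping[str, float],
    pseudocount: float = 1.0,
) -> Dict[str, float]:
    """Per-mRNA log2 ratio of RISC-associated to total-transcriptome abundance.

    Each library is scaled by its total read count; the pseudocount avoids
    division by zero for mRNAs missing from one library. Both choices are
    illustrative assumptions, not the published method.
    """
    risc_total = sum(risc_counts.values())
    tx_total = sum(transcriptome_counts.values())
    if risc_total <= 0 or tx_total <= 0:
        raise ValueError("both libraries must contain reads")
    scores: Dict[str, float] = {}
    for gene in set(risc_counts) | set(transcriptome_counts):
        risc_frac = (risc_counts.get(gene, 0.0) + pseudocount) / risc_total
        tx_frac = (transcriptome_counts.get(gene, 0.0) + pseudocount) / tx_total
        scores[gene] = log2(risc_frac / tx_frac)
    return scores


if __name__ == "__main__":
    # Hypothetical read counts from one heart.
    risc = {"mRNA_a": 300, "mRNA_b": 100}
    transcriptome = {"mRNA_a": 100, "mRNA_b": 300}
    print(risc_enrichment(risc, transcriptome))
```

A positive score marks an mRNA that is over-represented in the RISC fraction relative to its overall expression, which is the sense in which the abstract speaks of mRNAs being enriched in RISComes.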

  15. Exome Sequencing and the Management of Neurometabolic Disorders.

    PubMed

    Tarailo-Graovac, Maja; Shyr, Casper; Ross, Colin J; Horvath, Gabriella A; Salvarinova, Ramona; Ye, Xin C; Zhang, Lin-Hua; Bhavsar, Amit P; Lee, Jessica J Y; Drögemöller, Britt I; Abdelsayed, Mena; Alfadhel, Majid; Armstrong, Linlea; Baumgartner, Matthias R; Burda, Patricie; Connolly, Mary B; Cameron, Jessie; Demos, Michelle; Dewan, Tammie; Dionne, Janis; Evans, A Mark; Friedman, Jan M; Garber, Ian; Lewis, Suzanne; Ling, Jiqiang; Mandal, Rupasri; Mattman, Andre; McKinnon, Margaret; Michoulas, Aspasia; Metzger, Daniel; Ogunbayo, Oluseye A; Rakic, Bojana; Rozmus, Jacob; Ruben, Peter; Sayson, Bryan; Santra, Saikat; Schultz, Kirk R; Selby, Kathryn; Shekel, Paul; Sirrs, Sandra; Skrypnyk, Cristina; Superti-Furga, Andrea; Turvey, Stuart E; Van Allen, Margot I; Wishart, David; Wu, Jiang; Wu, John; Zafeiriou, Dimitrios; Kluijtmans, Leo; Wevers, Ron A; Eydoux, Patrice; Lehman, Anna M; Vallance, Hilary; Stockler-Ipsiroglu, Sylvia; Sinclair, Graham; Wasserman, Wyeth W; van Karnebeek, Clara D

    2016-06-09

    Whole-exome sequencing has transformed gene discovery and diagnosis in rare diseases. Translation into disease-modifying treatments is challenging, particularly for intellectual developmental disorder. However, the exception is inborn errors of metabolism, since many of these disorders are responsive to therapy that targets pathophysiological features at the molecular or cellular level. To uncover the genetic basis of potentially treatable inborn errors of metabolism, we combined deep clinical phenotyping (the comprehensive characterization of the discrete components of a patient's clinical and biochemical phenotype) with whole-exome sequencing analysis through a semiautomated bioinformatics pipeline in consecutively enrolled patients with intellectual developmental disorder and unexplained metabolic phenotypes. We performed whole-exome sequencing on samples obtained from 47 probands. Of these patients, 6 were excluded, including 1 who withdrew from the study. The remaining 41 probands had been born to predominantly nonconsanguineous parents of European descent. In 37 probands, we identified variants in 2 genes newly implicated in disease, 9 candidate genes, 22 known genes with newly identified phenotypes, and 9 genes with expected phenotypes; in most of the genes, the variants were classified as either pathogenic or probably pathogenic. Complex phenotypes of patients in five families were explained by coexisting monogenic conditions. We obtained a diagnosis in 28 of 41 probands (68%) who were evaluated. A test of a targeted intervention was performed in 18 patients (44%). Deep phenotyping and whole-exome sequencing in 41 probands with intellectual developmental disorder and unexplained metabolic abnormalities led to a diagnosis in 68%, the identification of 11 candidate genes newly implicated in neurometabolic disease, and a change in treatment beyond genetic counseling in 44%. (Funded by BC Children's Hospital Foundation and others.).

  16. Exome Sequencing and the Management of Neurometabolic Disorders

    PubMed Central

    Tarailo-Graovac, M.; Shyr, C.; Ross, C.J.; Horvath, G.A.; Salvarinova, R.; Ye, X.C.; Zhang, L.-H.; Bhavsar, A.P.; Lee, J.J.Y.; Drögemöller, B.I.; Abdelsayed, M.; Alfadhel, M.; Armstrong, L.; Baumgartner, M.R.; Burda, P.; Connolly, M.B.; Cameron, J.; Demos, M.; Dewan, T.; Dionne, J.; Evans, A.M.; Friedman, J.M.; Garber, I.; Lewis, S.; Ling, J.; Mandal, R.; Mattman, A.; McKinnon, M.; Michoulas, A.; Metzger, D.; Ogunbayo, O.A.; Rakic, B.; Rozmus, J.; Ruben, P.; Sayson, B.; Santra, S.; Schultz, K.R.; Selby, K.; Shekel, P.; Sirrs, S.; Skrypnyk, C.; Superti-Furga, A.; Turvey, S.E.; Van Allen, M.I.; Wishart, D.; Wu, J.; Wu, J.; Zafeiriou, D.; Kluijtmans, L.; Wevers, R.A.; Eydoux, P.; Lehman, A.M.; Vallance, H.; Stockler-Ipsiroglu, S.; Sinclair, G.; Wasserman, W.W.; van Karnebeek, C.D.

    2016-01-01

    BACKGROUND Whole-exome sequencing has transformed gene discovery and diagnosis in rare diseases. Translation into disease-modifying treatments is challenging, particularly for intellectual developmental disorder. However, the exception is inborn errors of metabolism, since many of these disorders are responsive to therapy that targets pathophysiological features at the molecular or cellular level. METHODS To uncover the genetic basis of potentially treatable inborn errors of metabolism, we combined deep clinical phenotyping (the comprehensive characterization of the discrete components of a patient’s clinical and biochemical phenotype) with whole-exome sequencing analysis through a semiautomated bioinformatics pipeline in consecutively enrolled patients with intellectual developmental disorder and unexplained metabolic phenotypes. RESULTS We performed whole-exome sequencing on samples obtained from 47 probands. Of these patients, 6 were excluded, including 1 who withdrew from the study. The remaining 41 probands had been born to predominantly nonconsanguineous parents of European descent. In 37 probands, we identified variants in 2 genes newly implicated in disease, 9 candidate genes, 22 known genes with newly identified phenotypes, and 9 genes with expected phenotypes; in most of the genes, the variants were classified as either pathogenic or probably pathogenic. Complex phenotypes of patients in five families were explained by coexisting monogenic conditions. We obtained a diagnosis in 28 of 41 probands (68%) who were evaluated. A test of a targeted intervention was performed in 18 patients (44%). CONCLUSIONS Deep phenotyping and whole-exome sequencing in 41 probands with intellectual developmental disorder and unexplained metabolic abnormalities led to a diagnosis in 68%, the identification of 11 candidate genes newly implicated in neurometabolic disease, and a change in treatment beyond genetic counseling in 44%. (Funded by BC Children’s Hospital Foundation and others.) PMID:27276562

  17. Cancer-Associated Mutations in Endometriosis without Cancer

    PubMed Central

    Anglesio, M.S.; Papadopoulos, N.; Ayhan, A.; Nazeran, T.M.; Noë, M.; Horlings, H.M.; Lum, A.; Jones, S.; Senz, J.; Seckin, T.; Ho, J.; Wu, R.-C.; Lac, V.; Ogawa, H.; Tessier-Cloutier, B.; Alhassan, R.; Wang, A.; Wang, Y.; Cohen, J.D.; Wong, F.; Hasanovic, A.; Orr, N.; Zhang, M.; Popoli, M.; McMahon, W.; Wood, L.D.; Mattox, A.; Allaire, C.; Segars, J.; Williams, C.; Tomasetti, C.; Boyd, N.; Kinzler, K.W.; Gilks, C.B.; Diaz, L.; Wang, T.-L.; Vogelstein, B.; Yong, P.J.; Huntsman, D.G.; Shih, I.-M.

    2017-01-01

    BACKGROUND Endometriosis, defined as the presence of ectopic endometrial stroma and epithelium, affects approximately 10% of reproductive-age women and can cause pelvic pain and infertility. Endometriotic lesions are considered to be benign inflammatory lesions but have cancerlike features such as local invasion and resistance to apoptosis. METHODS We analyzed deeply infiltrating endometriotic lesions from 27 patients by means of exomewide sequencing (24 patients) or cancer-driver targeted sequencing (3 patients). Mutations were validated with the use of digital genomic methods in micro-dissected epithelium and stroma. Epithelial and stromal components of lesions from an additional 12 patients were analyzed by means of a droplet digital polymerase-chain-reaction (PCR) assay for recurrent activating KRAS mutations. RESULTS Exome sequencing revealed somatic mutations in 19 of 24 patients (79%). Five patients harbored known cancer driver mutations in ARID1A, PIK3CA, KRAS, or PPP2R1A, which were validated by Safe-Sequencing System or immunohistochemical analysis. The likelihood of driver genes being affected at this rate in the absence of selection was estimated at P = 0.001 (binomial test). Targeted sequencing and a droplet digital PCR assay identified KRAS mutations in 2 of 3 patients and 3 of 12 patients, respectively, with mutations in the epithelium but not the stroma. One patient harbored two different KRAS mutations, c.35G→T and c.35G→C, and another carried identical KRAS c.35G→A mutations in three distinct lesions. CONCLUSIONS We found that lesions in deep infiltrating endometriosis, which are associated with virtually no risk of malignant transformation, harbor somatic cancer driver mutations. Ten of 39 deep infiltrating lesions (26%) carried driver mutations; all the tested somatic mutations appeared to be confined to the epithelial compartment of endometriotic lesions. PMID:28489996

  18. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationship including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strength and weakness of our method. © 2017 Wiley Periodicals, Inc.
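
The "top L/5" accuracy quoted in this abstract can be illustrated with a small sketch. It assumes the usual contact-prediction convention, which the abstract itself does not spell out: medium-range pairs have a sequence separation of 12 to 23 residues and long-range pairs a separation of 24 or more. The function name, the input layout (a dict of pair probabilities), and the choice to divide by the number of pairs actually selected are illustrative assumptions, not the RaptorX-Contact implementation.

```python
def top_l_over_k_accuracy(pred, true_contacts, length, k=5,
                          min_sep=24, max_sep=None):
    """Fraction of the top L/k predicted pairs that are true contacts.

    pred          -- dict mapping (i, j) residue pairs (i < j) to a probability
    true_contacts -- set of (i, j) pairs that are contacts in the native structure
    length        -- protein length L
    min_sep/max_sep -- sequence-separation window (assumed convention:
                       12..23 medium-range, >= 24 long-range)
    """
    # Keep only pairs whose sequence separation falls inside the window.
    in_range = [
        (prob, pair) for pair, prob in pred.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    # Rank by predicted probability, highest first.
    in_range.sort(key=lambda item: item[0], reverse=True)
    top = in_range[:max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)


# Toy example with made-up numbers (L = 10, so the top 2 pairs are scored).
toy_pred = {(0, 30): 0.9, (1, 40): 0.8, (2, 50): 0.1, (3, 10): 0.95, (5, 20): 0.7}
toy_true = {(0, 30), (2, 50), (5, 20)}
long_range = top_l_over_k_accuracy(toy_pred, toy_true, length=10)
medium_range = top_l_over_k_accuracy(toy_pred, toy_true, length=10,
                                     min_sep=12, max_sep=23)
```

Some evaluations divide by L/k even when fewer pairs are available; the sketch divides by the number of pairs selected, which gives the same value whenever at least L/k pairs fall in the window.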

  19. A first insight into the occurrence and expression of functional amoA and accA genes of autotrophic and ammonia-oxidizing bathypelagic Crenarchaeota of Tyrrhenian Sea

    NASA Astrophysics Data System (ADS)

    Yakimov, Michail M.; Cono, Violetta La; Denaro, Renata

    2009-05-01

    The autotrophic and ammonia-oxidizing crenarchaeal assemblage at an offshore site located in the deep Mediterranean (Tyrrhenian Sea, depth 3000 m) water was studied by PCR amplification of the key functional genes involved in energy (ammonia mono-oxygenase alpha subunit, amoA) and central metabolism (acetyl-CoA carboxylase alpha subunit, accA). Using two recently annotated genomes of marine crenarchaeons, an initial set of primers targeting archaeal accA-like genes was designed. Approximately 300 clones were analyzed, of which 100% of the amoA library and almost 70% of the accA library were unambiguously related to the corresponding genes from marine Crenarchaeota. Even though the acetyl-CoA carboxylase is phylogenetically not well conserved and the remaining clones were affiliated to various bacterial acetyl-CoA/propionyl-CoA carboxylase genes, the pool of archaeal sequences was applied for development of quantitative PCR analysis of accA-like distribution using TaqMan® methodology. The archaeal accA gene fragments, together with alignable gene fragments from the Sargasso Sea and North Pacific Subtropical Gyre (ALOHA Station) metagenome databases, were analyzed by multiple sequence alignment. Two accA-like sequences, found in ALOHA Station at the depth of 4000 m, formed a deeply branched clade with 64% of all archaeal Tyrrhenian clones. No close relatives for the residual 36% of clones, except for those recovered from the Eastern Mediterranean, were found, suggesting the existence of a specific lineage of the crenarchaeal accA genes in deep Mediterranean water. Alignment of Mediterranean amoA sequences defined four cosmopolitan phylotypes of the Crenarchaeota putative ammonia mono-oxygenase subunit A gene occurring in the water sample from the 3000 m depth. Without exception, all phylotypes fell into the Deep Marine Group I cluster, which contains the vast majority of known sequences recovered from the global deep-sea environment. Remarkably, three phylotypes accounted for 91% of all Mediterranean amoA clones and corresponded to the sequences retrieved from the less deep compartments of the world's ocean, most likely reflecting the higher temperature at the depth of the Mediterranean Sea. In order to verify whether these phylotypes might represent important Crenarchaeota in the functioning of the Mediterranean bathypelagic ecosystem, expression of the crenarchaeal amoA gene was monitored by direct RNA retrieval and subsequent analysis of amoA-related mRNA transcripts. Surprisingly, all mRNA-derived sequences formed a tight monophyletic group, which fell into the large Shallow Marine Group I cluster with sequences retrieved from shallow (up to 200 m) waters, sediments and corals. This group was not detected in the DNA-based clone library, evidently due to an overwhelming dominance of the Deep Marine Group I. The failure to recover the amoA transcripts related to Deep Marine Group I of Crenarchaeota was unanticipated and likely resulted from the physiology of these strongly adapted deep-sea organisms. Since all seawater samples were treated on board under atmospheric pressure conditions and sunlight, the decompression and/or photoinhibition likely affected their metabolic activity, followed by the strong decay of gene expression.

  20. Enhancing Hi-C data resolution with deep convolutional neural network HiCPlus.

    PubMed

    Zhang, Yan; An, Lin; Xu, Jie; Zhang, Bo; Zheng, W Jim; Hu, Ming; Tang, Jijun; Yue, Feng

    2018-02-21

    Although Hi-C technology is one of the most popular tools for studying 3D genome organization, due to sequencing cost, the resolution of most Hi-C datasets is coarse and cannot be used to link distal regulatory elements to their target genes. Here we develop HiCPlus, a computational approach based on a deep convolutional neural network, to infer high-resolution Hi-C interaction matrices from low-resolution Hi-C data. We demonstrate that HiCPlus can impute interaction matrices highly similar to the original ones, while only using 1/16 of the original sequencing reads. We show that the models learned from one cell type can be applied to make predictions in other cell or tissue types. Our work not only provides a computational framework to enhance Hi-C data resolution but also reveals features underlying the formation of 3D chromatin interactions.

  1. HIV-1 RNAs are Not Part of the Argonaute 2 Associated RNA Interference Pathway in Macrophages.

    PubMed

    Vongrad, Valentina; Imig, Jochen; Mohammadi, Pejman; Kishore, Shivendra; Jaskiewicz, Lukasz; Hall, Jonathan; Günthard, Huldrych F; Beerenwinkel, Niko; Metzner, Karin J

    2015-01-01

    MiRNAs and other small noncoding RNAs (sncRNAs) are key players in post-transcriptional gene regulation. HIV-1 derived small noncoding RNAs (sncRNAs) have been described in HIV-1 infected cells, but their biological functions still remain to be elucidated. Here, we approached the question whether viral sncRNAs may play a role in the RNA interference (RNAi) pathway or whether viral mRNAs are targeted by cellular miRNAs in human monocyte derived macrophages (MDM). The incorporation of viral sncRNAs and/or their target RNAs into RNA-induced silencing complex was investigated using photoactivatable ribonucleoside-induced cross-linking and immunoprecipitation (PAR-CLIP) as well as high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation (HITS-CLIP), which capture Argonaute2-bound miRNAs and their target RNAs. HIV-1 infected monocyte-derived macrophages (MDM) were chosen as target cells, as they have previously been shown to express HIV-1 sncRNAs. In addition, we applied small RNA deep sequencing to study differential cellular miRNA expression in HIV-1 infected versus non-infected MDMs. PAR-CLIP and HITS-CLIP data demonstrated the absence of HIV-1 RNAs in Ago2-RISC, although the presence of a multitude of HIV-1 sncRNAs in HIV-1 infected MDMs was confirmed by small RNA sequencing. Small RNA sequencing revealed that 1.4% of all sncRNAs were of HIV-1 origin. However, neither HIV-1 derived sncRNAs nor putative HIV-1 target sequences incorporated into Ago2-RISC were identified suggesting that HIV-1 sncRNAs are not involved in the canonical RNAi pathway nor is HIV-1 targeted by this pathway in HIV-1 infected macrophages.

  2. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from the Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) with 97% sequence similarity, and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies on fungal diversity of sediments from deep-sea environments by culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing has dramatically accelerated the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, were found to be rarely reported as deep-sea fungi, suggesting that the deep-sea sediments from the Okinawa Trough harbor a plethora of different fungal communities compared with other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from the Okinawa Trough using high-throughput Illumina sequencing.

  3. Acute multi-sgRNA knockdown of KEOPS complex genes reproduces the microcephaly phenotype of the stable knockout zebrafish model

    PubMed Central

    Schneider, Ronen; Hoogstraten, Charlotte A.; Schapiro, David; Majmundar, Amar J.; Kolb, Amy; Eddy, Kaitlyn; Shril, Shirlee; Braun, Daniela A.; Poduri, Annapurna

    2018-01-01

    Until recently, morpholino oligonucleotides have been widely employed in zebrafish as an acute and efficient loss-of-function assay. However, off-target effects and reproducibility issues when compared to stable knockout lines have compromised their further use. Here we employed an acute CRISPR/Cas approach using multiple single guide RNAs targeting simultaneously different positions in two exemplar genes (osgep or tprkb) to increase the likelihood of generating mutations on both alleles in the injected F0 generation and to achieve a similar effect as morpholinos but with the reproducibility of stable lines. This multi single guide RNA approach resulted in median likelihoods for at least one mutation on each allele of >99% and sgRNA specific insertion/deletion profiles as revealed by deep-sequencing. Immunoblot showed a significant reduction for Osgep and Tprkb proteins. For both genes, the acute multi-sgRNA knockout recapitulated the microcephaly phenotype and reduction in survival that we observed previously in stable knockout lines, though milder in the acute multi-sgRNA knockout. Finally, we quantify the degree of mutagenesis by deep sequencing, and provide a mathematical model to quantitate the chance for a biallelic loss-of-function mutation. Our findings can be generalized to acute and stable CRISPR/Cas targeting for any zebrafish gene of interest. PMID:29346415
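
The abstract mentions a mathematical model for the chance of a biallelic loss-of-function mutation but does not give it. The following is only a minimal sketch of one way such a chance could be estimated, under loudly simplifying assumptions that are not taken from the paper: each sgRNA site is mutated independently with a known per-allele efficiency, the two alleles are independent, and any mutation counts as loss of function. The function names and the example efficiencies are hypothetical.

```python
def allele_mutation_probability(site_efficiencies):
    """Probability that one allele carries at least one mutation.

    site_efficiencies -- per-allele mutation probabilities, one per sgRNA
                         target site (assumed independent; illustrative only).
    """
    unmutated = 1.0
    for p in site_efficiencies:
        unmutated *= (1.0 - p)  # allele escapes mutation at this site
    return 1.0 - unmutated


def biallelic_probability(site_efficiencies):
    """Probability that both alleles carry at least one mutation,
    assuming the two alleles are hit independently."""
    return allele_mutation_probability(site_efficiencies) ** 2


# Hypothetical efficiencies for two sgRNAs targeting the same gene.
single_allele = allele_mutation_probability([0.5, 0.5])
both_alleles = biallelic_probability([0.5, 0.5])
```

Under these assumptions, adding target sites can only raise the per-allele probability, which is the intuition behind using multiple sgRNAs per gene; in-frame indels that preserve function would lower the real loss-of-function rate relative to this estimate.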

  4. Molecular characterization of oral squamous cell carcinoma using targeted next-generation sequencing.

    PubMed

    Er, Tze-Kiong; Wang, Yen-Yun; Chen, Chih-Chieh; Herreros-Villanueva, Marta; Liu, Ta-Chih; Yuan, Shyng-Shiou F

    2015-10-01

    Many genetic factors play an important role in the development of oral squamous cell carcinoma. The aim of this study was to assess the mutational profile in oral squamous cell carcinoma using formalin-fixed, paraffin-embedded tumors from a Taiwanese population by performing targeted sequencing of 26 cancer-associated genes that are frequently mutated in solid tumors. Next-generation sequencing was performed in 50 formalin-fixed, paraffin-embedded tumor specimens obtained from patients with oral squamous cell carcinoma. Genetic alterations in the 26 cancer-associated genes were detected using a deep sequencing (>1000X) approach. TP53, PIK3CA, MET, APC, CDH1, and FBXW7 were the most frequently mutated genes. Most remarkably, TP53 mutations and PIK3CA mutations, which were found in 68% and 18% of tumors, respectively, were more prevalent in a Taiwanese population. Mutations in other genes, including MET (4%), APC (4%), CDH1 (2%), and FBXW7 (2%), were also identified in our population. In summary, our study shows the feasibility of performing targeted sequencing using formalin-fixed, paraffin-embedded samples. Additionally, this study also reports the mutational landscape of oral squamous cell carcinoma in the Taiwanese population. We believe that this study will shed new light on fundamental aspects in understanding the molecular pathogenesis of oral squamous cell carcinoma and may aid in the development of new targeted therapies. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Water mass dynamics shape Ross Sea protist communities in mesopelagic and bathypelagic layers

    NASA Astrophysics Data System (ADS)

    Zoccarato, Luca; Pallavicini, Alberto; Cerino, Federica; Fonda Umani, Serena; Celussi, Mauro

    2016-12-01

    Deep-sea environments host the largest pool of microbes and represent the last largely unexplored and poorly known ecosystems on Earth. The Ross Sea is characterized by unique oceanographic dynamics and harbors several water masses deeply involved in cooling and ventilation of deep oceans. In this study the V9 region of the 18S rDNA was targeted and sequenced with the Ion Torrent high-throughput sequencing technology to unveil differences in protist communities (>2 μm) correlated with biogeochemical properties of the water masses. The analyzed samples were significantly different in terms of environmental parameters and community composition outlining significant structuring effects of temperature and salinity. Overall, Alveolata (especially Dinophyta), Stramenopiles and Excavata groups dominated mesopelagic and bathypelagic layers, and protist communities were shaped according to the biogeochemistry of the water masses (advection effect and mixing events). Newly-formed High Salinity Shelf Water (HSSW) was characterized by high relative abundance of phototrophic organisms that bloom at the surface during the austral summer. Oxygen-depleted Circumpolar Deep Water (CDW) showed higher abundance of Excavata, common bacterivores in deep water masses. At the shelf-break, Antarctic Bottom Water (AABW), formed by the entrainment of shelf waters in CDW, maintained the eukaryotic genetic signature typical of both parental water masses.

  6. DNA-Free Genetically Edited Grapevine and Apple Protoplast Using CRISPR/Cas9 Ribonucleoproteins.

    PubMed

    Malnoy, Mickael; Viola, Roberto; Jung, Min-Hee; Koo, Ok-Jae; Kim, Seokjoong; Kim, Jin-Soo; Velasco, Riccardo; Nagamangala Kanchiswamy, Chidananda

    2016-01-01

    The combined availability of whole genome sequences and genome editing tools is set to revolutionize the field of fruit biotechnology by enabling the introduction of targeted genetic changes with unprecedented control and accuracy, both to explore emergent phenotypes and to introduce new functionalities. Although plasmid-mediated delivery of genome editing components to plant cells is very efficient, it also presents some drawbacks, such as possible random integration of plasmid sequences in the host genome. Additionally, it may well be intercepted by current process-based GMO regulations, complicating the path to commercialization of improved varieties. Here, we explore direct delivery of purified CRISPR/Cas9 ribonucleoproteins (RNPs) to the protoplasts of fruit crop plants, the grape cultivar Chardonnay and the apple cultivar Golden Delicious, for efficient targeted mutagenesis. We targeted MLO-7, a susceptibility gene, in order to increase resistance to powdery mildew in the grape cultivar, and DIPM-1, DIPM-2, and DIPM-4 in the apple to increase resistance to fire blight disease. Furthermore, efficient protoplast transformation and the molar ratio of Cas9 and sgRNAs were optimized for each grape and apple cultivar. The rate of targeted insertions and deletions was analyzed using targeted deep sequencing. Our results demonstrate that direct delivery of CRISPR/Cas9 RNPs to the protoplast system enables targeted gene editing and paves the way to the generation of DNA-free genome edited grapevine and apple plants.

  7. Deep Sequencing to Identify the Causes of Viral Encephalitis

    PubMed Central

    Chan, Benjamin K.; Wilson, Theodore; Fischer, Kael F.; Kriesel, John D.

    2014-01-01

    Deep sequencing allows for a rapid, accurate characterization of microbial DNA and RNA sequences in many types of samples. Deep sequencing (also called next generation sequencing or NGS) is being developed to assist with the diagnosis of a wide variety of infectious diseases. In this study, seven frozen brain samples from deceased subjects with recent encephalitis were investigated. RNA from each sample was extracted, randomly reverse transcribed and sequenced. The sequence analysis was performed in a blinded fashion and confirmed with pathogen-specific PCR. This analysis successfully identified measles virus sequences in two brain samples and herpes simplex virus type-1 sequences in three brain samples. No pathogen was identified in the other two brain specimens. These results were concordant with pathogen-specific PCR and partially concordant with prior neuropathological examinations, demonstrating that deep sequencing can accurately identify viral infections in frozen brain tissue. PMID:24699691

  8. Identification of a novel MYO7A mutation in Usher syndrome type 1.

    PubMed

    Cheng, Ling; Yu, Hongsong; Jiang, Yan; He, Juan; Pu, Sisi; Li, Xin; Zhang, Li

    2018-01-05

    Usher syndrome (USH) is an autosomal recessive disease characterized by deafness and retinitis pigmentosa. In view of the high phenotypic and genetic heterogeneity in USH, performing genetic screening with traditional methods is impractical. In the present study, we carried out targeted next-generation sequencing (NGS) to uncover the underlying gene in an USH family (2 USH patients and 15 unaffected relatives). One hundred and thirty-five genes associated with inherited retinal degeneration were selected for deep exome sequencing. Subsequently, variant analysis, Sanger validation and segregation tests were utilized to identify the disease-causing mutations in this family. All affected individuals had a classic USH type I (USH1) phenotype which included deafness, vestibular dysfunction and retinitis pigmentosa. Targeted NGS and Sanger sequencing validation suggested that USH1 patients carried an unreported splice site mutation, c.5168+1G>A, as a compound heterozygous mutation with c.6070C>T (p.R2024X) in the MYO7A gene. A functional study revealed decreased expression of the MYO7A gene in the individuals carrying heterozygous mutations. In conclusion, targeted next-generation sequencing provided a comprehensive and efficient diagnosis for USH1. This study revealed the genetic defects in the MYO7A gene and expanded the spectrum of clinical phenotypes associated with USH1 mutations.

  9. A Phylogenomic Perspective on the Radiation of Ray-Finned Fishes Based upon Targeted Sequencing of Ultraconserved Elements (UCEs)

    PubMed Central

    Sorenson, Laurie; Santini, Francesco

    2013-01-01

    Ray-finned fishes constitute the dominant radiation of vertebrates with over 32,000 species. Although molecular phylogenetics has begun to disentangle major evolutionary relationships within this vast section of the Tree of Life, there is no widely available approach for efficiently collecting phylogenomic data within fishes, leaving much of the enormous potential of massively parallel sequencing technologies for resolving major radiations in ray-finned fishes unrealized. Here, we provide a genomic perspective on longstanding questions regarding the diversification of major groups of ray-finned fishes through targeted enrichment of ultraconserved nuclear DNA elements (UCEs) and their flanking sequence. Our workflow efficiently and economically generates data sets that are orders of magnitude larger than those produced by traditional approaches and is well-suited to working with museum specimens. Analysis of the UCE data set recovers a well-supported phylogeny at both shallow and deep time-scales that supports a monophyletic relationship between Amia and Lepisosteus (Holostei) and reveals elopomorphs and then osteoglossomorphs to be the earliest diverging teleost lineages. Our approach additionally reveals that sequence capture of UCE regions and their flanking sequence offers enormous potential for resolving phylogenetic relationships within ray-finned fishes. PMID:23824177

  10. Brief Report: Cryopyrin-Associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation.

    PubMed

    Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M; Walts, Avram D; Hoffmann, Patrycja; Remmers, Elaine F; Kastner, Daniel L; Ombrello, Amanda K

    2015-09-01

    To identify the cause of disease in an adult patient presenting with recent-onset fevers, chills, urticaria, fatigue, and profound myalgia, who was found to be negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient's whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3-16.8% in monocytes and 15.2-18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, in buccal cells, and in the patient's cultured fibroblasts. Our findings indicate the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively parallel sequencing in clinical diagnosis. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.

  11. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    PubMed

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH afflicted patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon20 of gene USH2A in the F1 family. Two compound heterozygous mutations, IVS47 + 1G > A and c.13156A > T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A > T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  12. Transcriptomic analysis and mutational status of IDH1 in paired primary-recurrent intrahepatic cholangiocarcinoma.

    PubMed

    Peraldo-Neia, C; Ostano, P; Cavalloni, G; Pignochino, Y; Sangiolo, D; De Cecco, L; Marchesi, E; Ribero, D; Scarpa, A; De Rose, A M; Giuliani, A; Calise, F; Raggi, C; Invernizzi, P; Aglietta, M; Chiorino, G; Leone, F

    2018-06-05

    Effective targeted therapies for intrahepatic cholangiocarcinoma (ICC) have not been identified so far. One of the reasons may be the genetic evolution from primary (PR) to recurrent (REC) tumors. We aim to identify peculiar characteristics and to select potential targets specific for recurrent tumors. Eighteen ICC paired PR and REC tumors were collected from 5 Italian Centers. Eleven pairs were analyzed for gene expression profiling and 16 for mutational status of IDH1. For one pair, deep mutational analysis by Next Generation Sequencing was also carried out. An independent cohort of patients was used for validation. Two-class paired comparison yielded 315 differentially expressed genes between REC and PR tumors. Up-regulated genes in RECs are involved in RNA/DNA processing, cell cycle, epithelial to mesenchymal transition (EMT), resistance to apoptosis, and cytoskeleton remodeling. Down-regulated genes participate in epithelial cell differentiation, proteolysis, apoptosis, immune response, and inflammatory processes. A 24-gene signature is able to discriminate RECs from PRs in an independent cohort; FANCG is statistically associated with survival in the chol-TCGA dataset. IDH1 was mutated in the RECs of five patients; 4 of them displayed the mutation only in RECs. Deep sequencing performed in one patient confirmed the IDH1 mutation in REC. RECs are enriched for genes involved in EMT, resistance to apoptosis, and cytoskeleton remodeling. Key players of these pathways might be considered druggable targets in RECs. IDH1 is mutated in 30% of RECs, becoming both a marker of progression and a target for therapy.

  13. Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma

    PubMed Central

    Wrzeszczynski, Kazimierz O.; Frank, Mayu O.; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A.; Moore Vogel, Julia L.; Bruce, Jeffrey N.; Lassman, Andrew B.; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V.; Zody, Michael C.; Jobanputra, Vaidehi; Royyuru, Ajay K.

    2017-01-01

    Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. Results: More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. Conclusions: The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. ClinicalTrials.gov identifier: NCT02725684. PMID:28740869

  14. Deep sequencing of small RNA libraries from human prostate epithelial and stromal cells reveal distinct pattern of microRNAs primarily predicted to target growth factors.

    PubMed

    Singh, Savita; Zheng, Yun; Jagadeeswaran, Guru; Ebron, Jey Sabith; Sikand, Kavleen; Gupta, Sanjay; Sunker, Ramanjulu; Shukla, Girish C

    2016-02-28

    Complex epithelial and stromal cell interactions are required during the development and progression of prostate cancer. Regulatory small non-coding microRNAs (miRNAs) participate in the spatiotemporal regulation of messenger RNA (mRNA) and regulation of translation affecting a large number of genes involved in prostate carcinogenesis. In this study, through deep-sequencing of size fractionated small RNA libraries we profiled the miRNAs of prostate epithelial (PrEC) and stromal (PrSC) cells. Over 50 million reads were obtained for PrEC in which 860,468 were unique sequences. Similarly, nearly 76 million reads for PrSC were obtained in which over 1 million were unique reads. Expression of many miRNAs of broadly conserved and poorly conserved miRNA families were identified. Sixteen highly expressed miRNAs with significant change in expression in PrSC than PrEC were further analyzed in silico. ConsensusPathDB showed the target genes of these miRNAs were significantly involved in adherence junction, cell adhesion, EGRF, TGF-β and androgen signaling. Let-7 family of tumor-suppressor miRNAs expression was highly pervasive in both, PrEC and PrSC cells. In addition, we have also identified several miRNAs that are unique to PrEC or PrSC cells and their predicted putative targets are a group of transcription factors. This study provides perspective on the miRNA expression in PrEC and PrSC, and reveals a global trend in miRNA interactome. We conclude that the most abundant miRNAs are potential regulators of development and differentiation of the prostate gland by targeting a set of growth factors. Additionally, high level expression of the most members of let-7 family miRNAs suggests their role in the fine tuning of the growth and proliferation of prostate epithelial and stromal cells. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  15. microRNA expression profiling in fetal single ventricle malformation identified by deep sequencing.

    PubMed

    Yu, Zhang-Bin; Han, Shu-Ping; Bai, Yun-Fei; Zhu, Chun; Pan, Ya; Guo, Xi-Rong

    2012-01-01

    microRNAs (miRNAs) have emerged as key regulators in many biological processes, particularly cardiac growth and development, although the specific miRNA expression profile associated with this process remains to be elucidated. This study aimed to characterize the cellular microRNA profile involved in the development of congenital heart malformation, through the investigation of single ventricle (SV) defects. Comprehensive miRNA profiling in human fetal SV cardiac tissue was performed by deep sequencing. Differential expression of 48 miRNAs was revealed by sequencing by oligonucleotide ligation and detection (SOLiD) analysis. Of these, 38 were down-regulated and 10 were up-regulated in differentiated SV cardiac tissue, compared to control cardiac tissue. This was confirmed by real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis. Predicted target genes of the 48 differentially expressed miRNAs were analyzed by gene ontology and categorized according to cellular process, regulation of biological process and metabolic process. Pathway-Express analysis identified the WNT and mTOR signaling pathways as the most significant processes putatively affected by the differential expression of these miRNAs. The candidate genes involved in cardiac development were identified as potential targets for these differentially expressed microRNAs and the collaborative network of microRNAs and cardiac development related-mRNAs was constructed. These data provide the basis for future investigation of the mechanism of the occurrence and development of fetal SV malformations.

  16. Deep Sequencing Insights in Therapeutic shRNA Processing and siRNA Target Cleavage Precision.

    PubMed

    Denise, Hubert; Moschos, Sterghios A; Sidders, Benjamin; Burden, Frances; Perkins, Hannah; Carter, Nikki; Stroud, Tim; Kennedy, Michael; Fancy, Sally-Ann; Lapthorn, Cris; Lavender, Helen; Kinloch, Ross; Suhy, David; Corbau, Romu

    2014-02-04

    TT-034 (PF-05095808) is a recombinant adeno-associated virus serotype 8 (AAV8) agent expressing three short hairpin RNA (shRNA) pro-drugs that target the hepatitis C virus (HCV) RNA genome. The cytosolic enzyme Dicer cleaves each shRNA into multiple, potentially active small interfering RNA (siRNA) drugs. Using next-generation sequencing (NGS) to identify and characterize active shRNA maturation products, we observed that each TT-034-encoded shRNA could be processed into as many as 95 separate siRNA strands. Few of these appeared active as determined by Sanger 5' RNA Ligase-Mediated Rapid Amplification of cDNA Ends (5-RACE) and through synthetic shRNA and siRNA analogue studies. Moreover, NGS scrutiny applied on 5-RACE products (RACE-seq) suggested that synthetic siRNAs could direct cleavage in not one, but up to five separate positions on targeted RNA, in a sequence-dependent manner. These data support an on-target mechanism of action for TT-034 without cytotoxicity and question the accepted precision of substrate processing by the key RNA interference (RNAi) enzymes Dicer and siRNA-induced silencing complex (siRISC). Molecular Therapy-Nucleic Acids (2014) 3, e145; doi:10.1038/mtna.2013.73; published online 4 February 2014.

  17. Detection of Somatic Mutations in Gastroenteropancreatic Neuroendocrine Tumors Using Targeted Deep Sequencing.

    PubMed

    Backman, Samuel; Norlén, Olov; Eriksson, Barbro; Skogseid, Britt; Stålberg, Peter; Crona, Joakim

    2017-02-01

    Mutations affecting the mechanistic target of rapamycin (MTOR) signalling pathway are frequent in human cancer and have been identified in up to 15% of pancreatic neuroendocrine tumours (NETs). Grade A evidence supports the efficacy of MTOR inhibition with everolimus in pancreatic NETs. Although a significant proportion of patients experience disease stabilization, only a minority will show objective tumour responses. It has been proposed that genomic mutations resulting in activation of MTOR signalling could be used to predict sensitivity to everolimus. Patients with NETs that underwent treatment with everolimus at our Institution were identified and those with available tumour tissue were selected for further analysis. Targeted next-generation sequencing (NGS) was used to re-sequence 22 genes that were selected on the basis of documented involvement in the MTOR signalling pathway or in the tumourigenesis of gastroenteropancreatic NETs. Radiological responses were documented using Response Evaluation Criteria in Solid Tumours. Six patients were identified; one had a partial response and four had stable disease. Sequencing of tumour tissue resulted in a median sequence depth of 667.1 (range=404-1301) with 1-fold coverage of 95.9-96.5% and 10-fold coverage of 87.6-92.2%. A total of 494 genetic variants were discovered, four of which were identified as pathogenic. All pathogenic variants were validated using Sanger sequencing and were found exclusively in menin 1 (MEN1) and death domain associated protein (DAXX) genes. No mutations in the MTOR pathway-related genes were observed. Targeted NGS is a feasible method with high diagnostic yield for genetic characterization of pancreatic NETs. A potential association between mutations in NETs and response to everolimus should be investigated by future studies. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
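
    The depth and "k-fold coverage" figures in this abstract are summaries of per-base read depth across the targeted region. As a rough sketch, and assuming that "k-fold coverage" means the fraction of target bases covered by at least k reads, they could be computed as follows. The function name and the depth values are hypothetical and are not data from the study.

```python
from statistics import median

def coverage_summary(depths, thresholds=(1, 10)):
    """Summarise per-base read depths over a targeted region.

    Returns the median depth and, for each threshold k, the fraction of
    target bases covered by at least k reads.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    breadth = {k: sum(d >= k for d in depths) / n for k in thresholds}
    return {"median_depth": median(depths), "breadth": breadth}

# Hypothetical per-base depths for a 10-base target (illustrative only).
example_depths = [0, 5, 12, 700, 650, 900, 15, 9, 1200, 400]
summary = coverage_summary(example_depths)
```

    In practice the per-base depths would come from an alignment tool rather than a hand-written list.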

  18. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    PubMed

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. This study provides a framework for understanding the molecular mechanisms of drought resistance in mulberry.

  19. The Pediatric Cancer Genome Project

    PubMed Central

    Downing, James R; Wilson, Richard K; Zhang, Jinghui; Mardis, Elaine R; Pui, Ching-Hon; Ding, Li; Ley, Timothy J; Evans, William E

    2013-01-01

    The St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (PCGP) is participating in the international effort to identify somatic mutations that drive cancer. These cancer genome sequencing efforts will not only yield an unparalleled view of the altered signaling pathways in cancer but should also identify new targets against which novel therapeutics can be developed. Although these projects are still deep in the phase of generating primary DNA sequence data, important results are emerging and valuable community resources are being generated that should catalyze future cancer research. We describe here the rationale for conducting the PCGP, present some of the early results of this project and discuss the major lessons learned and how these will affect the application of genomic sequencing in the clinic. PMID:22641210

  20. Deep-branching Novel Lineages and High Diversity of Haptophytes in the Skagerrak (Norway) Uncovered by 454 Pyrosequencing

    PubMed Central

    Egge, Elianne S; Eikrem, Wenche; Edvardsen, Bente

    2015-01-01

    Microalgae in the division Haptophyta may be difficult to identify to species by microscopy because they are small and fragile. Here, we used high-throughput sequencing to explore the diversity of haptophytes in outer Oslofjorden, Skagerrak, and supplemented this with electron microscopy. Nano- and picoplanktonic subsurface samples were collected monthly for 2 yr, and the haptophytes were targeted by amplification of RNA/cDNA with Haptophyta-specific 18S ribosomal DNA V4 primers. Pyrosequencing revealed higher species richness of haptophytes than previously observed in the Skagerrak by microscopy. From ca. 400,000 reads we obtained 156 haptophyte operational taxonomic units (OTUs) after rigorous filtering and 99.5% clustering. The majority (84%) of the OTUs matched environmental sequences not linked to a morphological species, most of which were affiliated with the order Prymnesiales. Phylogenetic analyses including Oslofjorden OTUs and available cultured and environmental haptophyte sequences showed that several of the OTUs matched sequences forming deep-branching lineages, potentially representing novel haptophyte classes. Pyrosequencing also retrieved cultured species not previously reported by microscopy in the Skagerrak. Electron microscopy revealed species not yet genetically characterised and some potentially novel taxa. This study contributes to linking genotype to phenotype within this ubiquitous and ecologically important protist group, and reveals great, unknown diversity. PMID:25099994

  1. The small RNA profile in latex from Hevea brasiliensis trees is affected by tapping panel dryness.

    PubMed

    Gébelin, Virginie; Leclercq, Julie; Kuswanhadi; Argout, Xavier; Chaidamsari, Tetty; Hu, Songnian; Tang, Chaorong; Sarah, Gautier; Yang, Meng; Montoro, Pascal

    2013-10-01

    Natural rubber is harvested by tapping Hevea brasiliensis (Willd. ex A. Juss.) Müll. Arg. Harvesting stress can lead to tapping panel dryness (TPD). MicroRNAs (miRNAs) are induced by abiotic stress and regulate gene expression by targeting the cleavage or translational inhibition of target messenger RNAs. This study set out to sequence miRNAs expressed in latex cells and to identify TPD-related putative targets. Deep sequencing of small RNAs was carried out on latex from trees affected by TPD using Solexa technology. The most abundant small RNA class size was 21 nucleotides for TPD trees compared with 24 nucleotides in healthy trees. By combining the LeARN pipeline, data from the Plant MicroRNA database and Hevea EST sequences, we identified 19 additional conserved and four putative species-specific miRNA families not found in previous studies on rubber. The relative transcript abundance of the Hbpre-MIR159b gene increased with TPD. This study revealed a small RNA-specific signature of TPD-affected trees. Both RNA degradation and a shift in miRNA biogenesis are suggested to explain the general decline in small RNAs and, particularly, in miRNAs.

  2. Discovery radiomics via evolutionary deep radiomic sequencer discovery for pathologically proven lung cancer detection.

    PubMed

    Shafiee, Mohammad Javad; Chung, Audrey G; Khalvati, Farzad; Haider, Masoom A; Wong, Alexander

    2017-10-01

    While lung cancer is the second most diagnosed form of cancer in men and women, a sufficiently early diagnosis can be pivotal in patient survival rates. Imaging-based, or radiomics-driven, detection methods have been developed to aid diagnosticians, but largely rely on hand-crafted features that may not fully encapsulate the differences between cancerous and healthy tissue. Recently, the concept of discovery radiomics was introduced, where custom abstract features are discovered from readily available imaging data. We propose an evolutionary deep radiomic sequencer discovery approach based on evolutionary deep intelligence. Motivated by patient privacy concerns and the idea of operational artificial intelligence, the evolutionary deep radiomic sequencer discovery approach organically evolves increasingly more efficient deep radiomic sequencers that produce significantly more compact yet similarly descriptive radiomic sequences over multiple generations. As a result, this framework improves operational efficiency and enables diagnosis to be run locally at the radiologist's computer while maintaining detection accuracy. We evaluated the evolved deep radiomic sequencer (EDRS) discovered via the proposed evolutionary deep radiomic sequencer discovery framework against state-of-the-art radiomics-driven and discovery radiomics methods using clinical lung CT data with pathologically proven diagnostic data from the LIDC-IDRI dataset. The EDRS shows improved sensitivity (93.42%), specificity (82.39%), and diagnostic accuracy (88.78%) relative to previous radiomics approaches.
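
    Sensitivity, specificity and diagnostic accuracy, as reported above, are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical and do not reproduce the LIDC-IDRI evaluation.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    positives = tp + fn   # cases that truly have the condition
    negatives = tn + fp   # cases that truly do not
    total = positives + negatives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
        "accuracy": (tp + tn) / total,   # overall fraction correct
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=8, fp=5, tn=15, fn=2)
```

    Accuracy depends on the ratio of positive to negative cases in the test set, so it is usually read alongside the other two figures rather than alone.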

  3. Genome-wide characterization of microRNA in foxtail millet (Setaria italica)

    PubMed Central

    2013-01-01

    Background: MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. Results: In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. Conclusions: We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally. PMID:24330712

  4. Genome-wide characterization of microRNA in foxtail millet (Setaria italica).

    PubMed

    Yi, Fei; Xie, Shaojun; Liu, Yuwei; Qi, Xin; Yu, Jingjuan

    2013-12-13

    MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally.

  5. Identification and functional analysis of flowering related microRNAs in common wild rice (Oryza rufipogon Griff.).

    PubMed

    Chen, Zongxiang; Li, Fuli; Yang, Songnan; Dong, Yibo; Yuan, Qianhua; Wang, Feng; Li, Weimin; Jiang, Ying; Jia, Shirong; Pei, Xinwu

    2013-01-01

    MicroRNAs (miRNAs) are a class of non-coding RNAs involved in post-transcriptional control of gene expression, via degradation and/or translational inhibition. Six hundred sixty-one rice miRNAs are known that are important in plant development. However, flowering-related miRNAs have not been characterized in Oryza rufipogon Griff. The study was approved by the supervision department of Guangdong wild rice protection. We analyzed flowering-related miRNAs in O. rufipogon using high-throughput sequencing (deep sequencing) to understand the changes that occurred during rice domestication, and to elucidate their functions in flowering. Three O. rufipogon sRNA libraries, two vegetative stage (CWR-V1 and CWR-V2) and one flowering stage (CWR-F2), were sequenced using Illumina deep sequencing. A total of 20,156,098, 21,531,511 and 20,995,942 high quality sRNA reads were obtained from CWR-V1, CWR-V2 and CWR-F2, respectively, of which 3,448,185, 4,265,048 and 2,833,527 reads matched known miRNAs. We identified 512 known rice miRNAs in 214 miRNA families and predicted 290 new miRNAs. Targeted functional annotation, GO and KEGG pathway analyses predicted that 187 miRNAs regulate expression of flowering-related genes. Differential expression analysis of flowering-related miRNAs showed that expression of 95 miRNAs varied significantly between the libraries; 66 are flowering-related miRNAs, such as oru-miR97, oru-miR117, oru-miR135 and oru-miR137, among others, and 17 are early-flowering-related miRNAs, including osa-miR160f, osa-miR164d, osa-miR167d, osa-miR169a, osa-miR172b and oru-miR4, among others, induced during the floral transition. Real-time PCR revealed the same expression patterns as deep sequencing. miRNA targets were confirmed for cleavage by 5'-RACE in vivo, and were negatively regulated by miRNAs. This is the first investigation of flowering miRNAs in wild rice. The results indicate that variation in miRNAs occurred during rice domestication and lay a foundation for further study of phase change and flowering in O. rufipogon. Complicated regulatory networks mediated by multiple miRNAs regulate the expression of flowering genes that control the induction of flowering.

  6. Predicting RNA-protein binding sites and motifs through combining local and global deep convolutional neural networks.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2018-05-02

    RNA-binding proteins (RBPs) account for 5∼10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and costly. Computational prediction of RBP binding sites using patterns learned from existing annotation knowledge is a faster alternative. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. In this study, we present a computational method, iDeepE, to predict RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for multiple subsequences and the padded sequences to learn high-level features, respectively. Finally, the outputs from local and global CNNs are combined to improve the prediction. iDeepE demonstrates better performance than state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability: https://github.com/xypan1232/iDeepE. Contact: xypan172436@gmail.com or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
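
    The two input preparations described in this abstract (padding to a common length for the global CNN, and splitting into overlapping fixed-length subsequences for the local CNN) can be sketched as below. This is not the iDeepE code. The function names, the "N" padding character, the truncation of over-long sequences and the window and overlap values are all assumptions made for illustration.

```python
def pad_sequence(seq, length, pad_char="N"):
    """Right-pad seq to a fixed length (truncating if longer; an assumption)."""
    return seq[:length].ljust(length, pad_char)

def split_overlapping(seq, window, overlap, pad_char="N"):
    """Split seq into fixed-length windows that share `overlap` bases.

    The final window is padded so that every window has the same length.
    """
    if not 0 <= overlap < window:
        raise ValueError("need 0 <= overlap < window")
    step = window - overlap
    # Stop once the remaining tail is already covered by the previous window.
    starts = range(0, max(len(seq) - overlap, 1), step)
    return [pad_sequence(seq[s:s + window], window, pad_char) for s in starts]

rna = "ACGUACGUACG"                       # hypothetical 11-base sequence
global_input = pad_sequence(rna, 16)      # one padded sequence
local_inputs = split_overlapping(rna, window=4, overlap=2)
```

    Each element of `local_inputs` would then be encoded (for example one-hot) and passed to the local model as a separate channel, which is the part this sketch leaves out.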

  7. A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences.

    PubMed

    Xue, Yun; Liao, Zhengling; Li, Meihang; Luo, Jie; Kuang, Qiuhua; Hu, Xiaohui; Li, Tiechen

    2015-01-01

    Order-preserving submatrices (OPSMs) have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems, as an important unsupervised learning model. Unfortunately, most existing methods are heuristic algorithms that cannot reveal all OPSMs, because the problem is NP-complete. In particular, deep OPSMs, corresponding to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm was adjusted to disclose all common subsequences (ACS) between every two row sequences, so that no deep OPSMs are missed. Then, an improved data structure for the prefix tree was used to store and traverse ACS, and the Apriori principle was employed to efficiently mine the frequent sequential patterns. Finally, experiments were implemented on gene and synthetic datasets. Results demonstrated the effectiveness and efficiency of this method.
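
    The link between OPSMs and sequence mining can be illustrated with a small sketch. It assumes the usual definition, in which a set of rows supports a column pattern if the values in those columns increase in the pattern's order in every row. Each row is rewritten as its column indices sorted by value, and a row supports a pattern when the pattern is a subsequence of that ordering. This only checks support for a given pattern and is not the exact mining algorithm of the paper. The function names, the tie-breaking rule and the matrix are hypothetical.

```python
def column_order(row):
    """Column indices sorted by increasing value (ties broken by index; an assumption)."""
    return sorted(range(len(row)), key=lambda j: (row[j], j))

def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order, not necessarily contiguously."""
    it = iter(sequence)
    return all(item in it for item in pattern)  # `in` consumes the iterator

def supporting_rows(matrix, pattern):
    """Indices of rows whose values increase along the columns in `pattern`."""
    return [i for i, row in enumerate(matrix)
            if is_subsequence(pattern, column_order(row))]

# Hypothetical 3 x 4 expression matrix (rows = genes, columns = conditions).
matrix = [
    [1, 4, 2, 9],
    [3, 8, 5, 6],
    [7, 2, 9, 1],
]
rows_for_long_pattern = supporting_rows(matrix, [0, 2, 1])
rows_for_short_pattern = supporting_rows(matrix, [0, 2])
```

    The longer pattern is supported by fewer rows than the shorter one, which matches the abstract's description of deep OPSMs as long patterns with few supporting sequences.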

  8. Using RNA-seq and targeted nucleases to identify mechanisms of drug resistance in acute myeloid leukemia.

    PubMed

    Rathe, Susan K; Moriarity, Branden S; Stoltenberg, Christopher B; Kurata, Morito; Aumann, Natalie K; Rahrmann, Eric P; Bailey, Natashay J; Melrose, Ellen G; Beckmann, Dominic A; Liska, Chase R; Largaespada, David A

    2014-08-13

    The evolution from microarrays to transcriptome deep-sequencing (RNA-seq) and from RNA interference to gene knockouts using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) and Transcription Activator-Like Effector Nucleases (TALENs) has provided a new experimental partnership for identifying and quantifying the effects of gene changes on drug resistance. Here we describe the results from deep-sequencing of RNA derived from two cytarabine (Ara-C) resistant acute myeloid leukemia (AML) cell lines, and present CRISPR and TALEN based methods for accomplishing complete gene knockout (KO) in AML cells. We found protein modifying loss-of-function mutations in Dck in both Ara-C resistant cell lines. CRISPR and TALEN-based KO of Dck dramatically increased the IC₅₀ of Ara-C and introduction of a DCK overexpression vector into Dck KO clones resulted in a significant increase in Ara-C sensitivity. This effort demonstrates the power of using transcriptome analysis and CRISPR/TALEN-based KOs to identify and verify genes associated with drug resistance.

  9. DeepText2GO: Improving large-scale protein function prediction with deep semantic text representation.

    PubMed

    You, Ronghui; Huang, Xiaodi; Zhu, Shanfeng

    2018-06-06

    As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. The previous studies conclude that sequence homology based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found very helpful for AFP. Other than sequences, alternative information sources such as text, however, may be useful for AFP as well. Instead of using BOW (bag of words) representation in traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Deciphering KRAS and NRAS mutated clone dynamics in MLL-AF4 paediatric leukaemia by ultra deep sequencing analysis.

    PubMed

    Trentin, Luca; Bresolin, Silvia; Giarin, Emanuela; Bardini, Michela; Serafin, Valentina; Accordi, Benedetta; Fais, Franco; Tenca, Claudya; De Lorenzo, Paola; Valsecchi, Maria Grazia; Cazzaniga, Giovanni; Kronnie, Geertruy Te; Basso, Giuseppe

    2016-10-04

    To induce and sustain the leukaemogenic process, MLL-AF4+ leukaemia seems to require very few genetic alterations in addition to the fusion gene itself. Studies of infant and paediatric patients with MLL-AF4+ B cell precursor acute lymphoblastic leukaemia (BCP-ALL) have reported mutations in KRAS and NRAS with incidences ranging from 25 to 50%. Whereas previous studies employed Sanger sequencing, here we used next generation amplicon deep sequencing for in depth evaluation of RAS mutations in 36 paediatric patients at diagnosis of MLL-AF4+ leukaemia. RAS mutations including those in small sub-clones were detected in 63.9% of patients. Furthermore, the mutational analysis of 17 paired samples at diagnosis and relapse revealed complex RAS clone dynamics and showed that almost all of the mutated clones present at relapse originated from clones that were already detectable at diagnosis and survived the initial therapy. Finally, we showed that mutated patients were indeed characterized by a RAS related signature at both transcriptional and protein levels and that targeting of the RAS pathway could be beneficial for treatment of MLL-AF4+ BCP-ALL clones carrying somatic RAS mutations.

  11. Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

    PubMed

    Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

    2016-01-01

    Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences, revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study, to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.

  12. Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

    PubMed

    Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity test (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selected SNPs with an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with invasive prenatal paternity testing using an STR multiplex system. Our study proved that maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.
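
    As a rough illustration of the final combination step, the sketch below multiplies per-locus paternity indices into a combined paternity index (CPI) and converts it to a posterior probability. The per-locus values and the 0.5 prior are hypothetical, and the abstract does not say how the study's pipeline derived per-SNP indices from plasma sequencing reads, so this shows only the conventional product rule, not the authors' method.

```python
import math

def combined_paternity_index(locus_pis):
    """Multiply per-locus paternity indices (PIs) into a combined index (CPI).

    All PIs are assumed to be strictly positive.
    """
    # Summing logs keeps intermediate values small when many SNPs are combined.
    log_cpi = sum(math.log(pi) for pi in locus_pis)
    return math.exp(log_cpi)

def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from the CPI and a prior (assumed 0.5)."""
    return (cpi * prior) / (cpi * prior + (1.0 - prior))

# Hypothetical per-locus values, for illustration only.
example_pis = [2.0, 1.5, 0.8, 3.0]
cpi = combined_paternity_index(example_pis)
posterior = probability_of_paternity(cpi)
```

    A locus with a PI below 1 (such as 0.8 here) lowers the combined index, which is why panels with many informative SNPs are needed before the product becomes decisive.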

  13. TCRmodel: high resolution modeling of T cell receptors from sequence.

    PubMed

    Gowthaman, Ragul; Pierce, Brian G

    2018-05-22

    T cell receptors (TCRs), along with antibodies, are responsible for specific antigen recognition in the adaptive immune response, and millions of unique TCRs are estimated to be present in each individual. Understanding the structural basis of TCR targeting has implications in vaccine design, autoimmunity, as well as T cell therapies for cancer. Given advances in deep sequencing leading to immune repertoire-level TCR sequence data, fast and accurate modeling methods are needed to elucidate shared and unique 3D structural features of these molecules which lead to their antigen targeting and cross-reactivity. We developed a new algorithm in the program Rosetta to model TCRs from sequence, and implemented this functionality in a web server, TCRmodel. This web server provides an easy to use interface, and models are generated quickly that users can investigate in the browser and download. Benchmarking of this method using a set of nonredundant recently released TCR crystal structures shows that models are accurate and compare favorably to models from another available modeling method. This server enables the community to obtain insights into TCRs of interest, and can be combined with methods to model and design TCR recognition of antigens. The TCRmodel server is available at: http://tcrmodel.ibbr.umd.edu/.

  14. High-Throughput Sequencing of Arabidopsis microRNAs: Evidence for Frequent Birth and Death of MIRNA Genes

    PubMed Central

    Fahlgren, Noah; Howell, Miya D.; Kasschau, Kristin D.; Chapman, Elisabeth J.; Sullivan, Christopher M.; Cumbie, Jason S.; Givan, Scott A.; Law, Theresa F.; Grant, Sarah R.; Dangl, Jeffery L.; Carrington, James C.

    2007-01-01

    In plants, microRNAs (miRNAs) comprise one of two classes of small RNAs that function primarily as negative regulators at the posttranscriptional level. Several MIRNA genes in the plant kingdom are ancient, with conservation extending between angiosperms and the mosses, whereas many others are more recently evolved. Here, we use deep sequencing and computational methods to identify, profile and analyze non-conserved MIRNA genes in Arabidopsis thaliana. 48 non-conserved MIRNA families, nearly all of which were represented by single genes, were identified. Sequence similarity analyses of miRNA precursor foldback arms revealed evidence for recent evolutionary origin of 16 MIRNA loci through inverted duplication events from protein-coding gene sequences. Interestingly, these recently evolved MIRNA genes have taken distinct paths. Whereas some non-conserved miRNAs interact with and regulate target transcripts from gene families that donated parental sequences, others have drifted to the point of non-interaction with parental gene family transcripts. Some young MIRNA loci clearly originated from one gene family but form miRNAs that target transcripts in another family. We suggest that MIRNA genes are undergoing relatively frequent birth and death, with only a subset being stabilized by integration into regulatory networks. PMID:17299599

  15. [Personalized urooncology based on molecular uropathology: what is the future?].

    PubMed

    Dahl, E; Haller, F

    2013-07-01

    Targeted therapies and biomarker validation are key drivers in the advancement of personalized oncology, which is a growing topic in all clinical areas. Compared with other specialties, such as pulmonology and gynecology, development in urology has so far been slow but has recently gained increasing momentum. A basis for this is the growing, and in future accelerating, application of new knowledge derived from molecular biology in the field of uropathology. The rapid gain of knowledge is driven by a whole new class of analytical methods, such as massively parallel sequencing (deep sequencing or next generation sequencing), which enables analysis of virtually a new universe of potential biomarkers. This article describes the emerging paradigm shift in molecular pathological diagnostics of urological tumors using the example of prostate cancer.

  16. Noninvasive genome sampling in chimpanzees.

    PubMed

    Kohn, Michael H

    2010-12-01

    The inevitable has happened: genomic technologies have been added to our noninvasive genetic sampling repertoire. In this issue of Molecular Ecology, Perry et al. (2010) demonstrate how DNA extraction from chimpanzee faeces, followed by a series of steps to enrich for target loci, can be coupled with next-generation sequencing. These authors collected sequence and single-nucleotide polymorphism (SNP) data at more than 600 genomic loci (chromosome 21 and the X) and the complete mitochondrial DNA. By design, each locus was 'deep sequenced' to enable SNP identification. To demonstrate the reliability of their data, the work included samples from six captive chimps, which allowed for a comparison between presumably genuine SNPs obtained from blood and potentially flawed SNPs deduced from faeces. Thus, with this method, anyone with the resources, skills and ambition to do genome sequencing of wild, elusive, or protected mammals can enjoy all of the benefits of noninvasive sampling. © 2010 Blackwell Publishing Ltd.

  17. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics.

    PubMed

    Wang, Chen; Han, Jian; Liu, Chonghuai; Kibet, Korir Nicholas; Kayesh, Emrul; Shangguan, Lingfei; Li, Xiaoying; Fang, Jinggui

    2012-03-29

    MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. 
Deep sequencing of short RNAs from Amur grape flowers and berries identified 72 new potential miRNAs and 34 known but non-conserved miRNAs, indicating that specific miRNAs exist in Amur grape. These results show that a number of regulatory miRNAs exist in Amur grape and play an important role in Amur grape growth, development, and response to abiotic or biotic stress.

  18. Quasispecies Analyses of the HIV-1 Near-full-length Genome With Illumina MiSeq

    PubMed Central

    Ode, Hirotaka; Matsuda, Masakazu; Matsuoka, Kazuhiro; Hachiya, Atsuko; Hattori, Junko; Kito, Yumiko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

    2015-01-01

    Human immunodeficiency virus type-1 (HIV-1) exhibits high between-host genetic diversity and within-host heterogeneity, recognized as quasispecies. Because HIV-1 quasispecies fluctuate in terms of multiple factors, such as antiretroviral exposure and host immunity, analyzing the HIV-1 genome is critical for selecting effective antiretroviral therapy and understanding within-host viral coevolution mechanisms. Here, to obtain HIV-1 genome sequence information that includes minority variants, we sought to develop a method for evaluating quasispecies throughout the HIV-1 near-full-length genome using the Illumina MiSeq benchtop deep sequencer. To ensure the reliability of minority mutation detection, we applied an analysis method of sequence read mapping onto a consensus sequence derived from de novo assembly followed by iterative mapping and subsequent unique error correction. Deep sequencing analyses of an HIV-1 clone showed that the analysis method reduced erroneous base prevalence below 1% in each sequence position and discarded only < 1% of all collected nucleotides, maximizing the usage of the collected genome sequences. Further, we designed primer sets to amplify the HIV-1 near-full-length genome from clinical plasma samples. Deep sequencing of 92 samples in combination with the primer sets and our analysis method provided sufficient coverage to identify >1%-frequency sequences throughout the genome. When we evaluated sequences of pol genes from 18 treatment-naïve patients' samples, the deep sequencing results were in agreement with Sanger sequencing and identified numerous additional minority mutations. The results suggest that our deep sequencing method would be suitable for identifying within-host viral population dynamics throughout the genome. PMID:26617593
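
    The >1% frequency criterion mentioned above can be illustrated with a minimal sketch that tallies base calls at one genome position and reports non-consensus bases above a cutoff. The function name, the `min_depth` guard, and the example column are assumptions made for illustration; the abstract's iterative mapping and error-correction steps are not reproduced here.

```python
from collections import Counter

def minority_variants(column_bases, min_freq=0.01, min_depth=100):
    """Return {base: frequency} for non-consensus bases above min_freq.

    column_bases: iterable of base calls covering one genome position.
    min_depth is a hypothetical guard, not a value taken from the abstract.
    """
    counts = Counter(column_bases)
    depth = sum(counts.values())
    if depth < min_depth:
        return {}  # too few reads to trust a 1% call
    consensus, _ = counts.most_common(1)[0]
    return {
        base: n / depth
        for base, n in counts.items()
        if base != consensus and n / depth > min_freq
    }

# Hypothetical pileup column: 970 A, 25 G, 5 T.
column = "A" * 970 + "G" * 25 + "T" * 5
variants = minority_variants(column)
```

    In this made-up column the G allele (2.5%) is reported while the T allele (0.5%) falls under the cutoff, which is the kind of call that only becomes reliable once per-position error is pushed below the threshold.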

  19. Virus Identification in Unknown Tropical Febrile Illness Cases Using Deep Sequencing

    PubMed Central

    Balmaseda, Angel; Harris, Eva; DeRisi, Joseph L.

    2012-01-01

    Dengue virus is an emerging infectious agent that infects an estimated 50–100 million people annually worldwide, yet current diagnostic practices cannot detect an etiologic pathogen in ∼40% of dengue-like illnesses. Metagenomic approaches to pathogen detection, such as viral microarrays and deep sequencing, are promising tools to address emerging and non-diagnosable disease challenges. In this study, we used the Virochip microarray and deep sequencing to characterize the spectrum of viruses present in human sera from 123 Nicaraguan patients presenting with dengue-like symptoms but testing negative for dengue virus. We utilized a barcoding strategy to simultaneously deep sequence multiple serum specimens, generating on average over 1 million reads per sample. We then implemented a stepwise bioinformatic filtering pipeline to remove the majority of human and low-quality sequences to improve the speed and accuracy of subsequent unbiased database searches. By deep sequencing, we were able to detect virus sequence in 37% (45/123) of previously negative cases. These included 13 cases with Human Herpesvirus 6 sequences. Other samples contained sequences with similarity to sequences from viruses in the Herpesviridae, Flaviviridae, Circoviridae, Anelloviridae, Asfarviridae, and Parvoviridae families. In some cases, the putative viral sequences were virtually identical to known viruses, and in others they diverged, suggesting that they may derive from novel viruses. These results demonstrate the utility of unbiased metagenomic approaches in the detection of known and divergent viruses in the study of tropical febrile illness. PMID:22347512

  20. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
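
    To make the idea of a normalized coverage file concrete, the sketch below scales per-bin read counts to counts per million (CPM) so that libraries of different depth become comparable. CPM is used here only as one common scheme; the function name and the example counts are hypothetical, and the abstract does not specify which normalization the tools apply.

```python
def normalize_cpm(bin_counts):
    """Scale per-bin read counts to counts per million reads (CPM)."""
    total = sum(bin_counts)
    if total == 0:
        return [0.0 for _ in bin_counts]  # empty library: nothing to scale
    return [c * 1_000_000 / total for c in bin_counts]

# Two hypothetical libraries with the same profile but 10x different depth.
sample_a = normalize_cpm([10, 30, 60])
sample_b = normalize_cpm([100, 300, 600])
```

    After scaling, both libraries yield the same per-bin values, so differences seen in a heatmap or summary plot reflect signal shape rather than sequencing depth.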

  1. Deep sequencing identification of miRNAs in pigeon ovaries illuminated with monochromatic light.

    PubMed

    Wang, Ying; Yang, Hai-Ming; Cao, Wei; Li, Yang-Bai; Wang, Zhi-Yue

    2018-06-08

    The use of light of different wavelengths has grown popular in the poultry industry. An optimum wavelength is believed to improve pigeon egg production, but little is known about the role of microRNAs (miRNAs) in the effects of monochromatic light on pigeon ovarian function. Herein, we harvested ovaries from pigeons reared under monochromatic light of different wavelengths and performed deep sequencing on various tissues using an Illumina Solexa high-throughput instrument. We obtained 66,148,548, 67,873,805, and 71,661,771 clean reads from ovaries of pigeons reared under red light (RL), blue light (BL), and white light (WL), respectively. We identified 1917 known miRNAs in nine libraries, of which 524 were novel. Three and five differentially expressed miRNAs were identified in BL vs. WL and RL vs. WL groups, respectively. Quantitative reverse transcription PCR was used to validate differentially expressed miRNAs (miR-200, miR-122, and miR-205b). In addition, 5824 target genes were annotated as differentially expressed miRNAs, most of which are involved in reproductive pathways including oestrogen signalling, cell cycle, and oocyte maturation. Notably, ovarian miR-205b expression was significantly negatively correlated with its target 11β-hydroxysteroid dehydrogenase type 1 (HSD11B1). miRNA-mRNA network analysis suggests that miR-205b targeting of HSD11B1 plays a key role in the effects of monochromatic light on pigeon egg production. These findings indicate that monochromatic light shortens the oviposition interval of pigeons, which may be useful for egg production and pigeon breeding.

  2. Deep Sequencing Reveals Direct Targets of Gammaherpesvirus-Induced mRNA Decay and Suggests That Multiple Mechanisms Govern Cellular Transcript Escape

    PubMed Central

    Clyde, Karen; Glaunsinger, Britt A.

    2011-01-01

    One characteristic of lytic infection with gammaherpesviruses, including Kaposi's sarcoma-associated herpesvirus (KSHV), Epstein-Barr virus (EBV) and murine herpesvirus 68 (MHV68), is the dramatic suppression of cellular gene expression in a process known as host shutoff. The alkaline exonuclease proteins (KSHV SOX, MHV-68 muSOX and EBV BGLF5) have been shown to induce shutoff by destabilizing cellular mRNAs. Here we extend previous analyses of cellular mRNA abundance during lytic infection to characterize the effects of SOX and muSOX, in the absence of other viral genes, utilizing deep sequencing technology (RNA-seq). Consistent with previous observations during lytic infection, the majority of transcripts are downregulated in cells expressing either SOX or muSOX, with muSOX acting as a more potent shutoff factor than SOX. Moreover, most cellular messages fall into the same expression class in both SOX- and muSOX-expressing cells, indicating that both factors target similar pools of mRNAs. More abundant mRNAs are more efficiently downregulated, suggesting a concentration effect in transcript targeting. However, even among highly expressed genes there are mRNAs that escape host shutoff. Further characterization of select escapees reveals multiple mechanisms by which cellular genes can evade downregulation. While some mRNAs are directly refractory to SOX, the steady state levels of others remain unchanged, presumably as a consequence of downstream effects on mRNA biogenesis. Collectively, these studies lay the framework for dissecting the mechanisms underlying the susceptibility of mRNA to destruction during lytic gammaherpesvirus infection. PMID:21573023

  3. The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line.

    PubMed

    Suzuki, Harukazu; Forrest, Alistair R R; van Nimwegen, Erik; Daub, Carsten O; Balwierz, Piotr J; Irvine, Katharine M; Lassmann, Timo; Ravasi, Timothy; Hasegawa, Yuki; de Hoon, Michiel J L; Katayama, Shintaro; Schroder, Kate; Carninci, Piero; Tomaru, Yasuhiro; Kanamori-Katayama, Mutsumi; Kubosaki, Atsutaka; Akalin, Altuna; Ando, Yoshinari; Arner, Erik; Asada, Maki; Asahara, Hiroshi; Bailey, Timothy; Bajic, Vladimir B; Bauer, Denis; Beckhouse, Anthony G; Bertin, Nicolas; Björkegren, Johan; Brombacher, Frank; Bulger, Erika; Chalk, Alistair M; Chiba, Joe; Cloonan, Nicole; Dawe, Adam; Dostie, Josee; Engström, Pär G; Essack, Magbubah; Faulkner, Geoffrey J; Fink, J Lynn; Fredman, David; Fujimori, Ko; Furuno, Masaaki; Gojobori, Takashi; Gough, Julian; Grimmond, Sean M; Gustafsson, Mika; Hashimoto, Megumi; Hashimoto, Takehiro; Hatakeyama, Mariko; Heinzel, Susanne; Hide, Winston; Hofmann, Oliver; Hörnquist, Michael; Huminiecki, Lukasz; Ikeo, Kazuho; Imamoto, Naoko; Inoue, Satoshi; Inoue, Yusuke; Ishihara, Ryoko; Iwayanagi, Takao; Jacobsen, Anders; Kaur, Mandeep; Kawaji, Hideya; Kerr, Markus C; Kimura, Ryuichiro; Kimura, Syuhei; Kimura, Yasumasa; Kitano, Hiroaki; Koga, Hisashi; Kojima, Toshio; Kondo, Shinji; Konno, Takeshi; Krogh, Anders; Kruger, Adele; Kumar, Ajit; Lenhard, Boris; Lennartsson, Andreas; Lindow, Morten; Lizio, Marina; Macpherson, Cameron; Maeda, Norihiro; Maher, Christopher A; Maqungo, Monique; Mar, Jessica; Matigian, Nicholas A; Matsuda, Hideo; Mattick, John S; Meier, Stuart; Miyamoto, Sei; Miyamoto-Sato, Etsuko; Nakabayashi, Kazuhiko; Nakachi, Yutaka; Nakano, Mika; Nygaard, Sanne; Okayama, Toshitsugu; Okazaki, Yasushi; Okuda-Yabukami, Haruka; Orlando, Valerio; Otomo, Jun; Pachkov, Mikhail; Petrovsky, Nikolai; Plessy, Charles; Quackenbush, John; Radovanovic, Aleksandar; Rehli, Michael; Saito, Rintaro; Sandelin, Albin; Schmeier, Sebastian; Schönbach, Christian; Schwartz, Ariel S; Semple, Colin A; Sera, Miho; Severin, Jessica; Shirahige, Katsuhiko; Simons, 
Cas; St Laurent, George; Suzuki, Masanori; Suzuki, Takahiro; Sweet, Matthew J; Taft, Ryan J; Takeda, Shizu; Takenaka, Yoichi; Tan, Kai; Taylor, Martin S; Teasdale, Rohan D; Tegnér, Jesper; Teichmann, Sarah; Valen, Eivind; Wahlestedt, Claes; Waki, Kazunori; Waterhouse, Andrew; Wells, Christine A; Winther, Ole; Wu, Linda; Yamaguchi, Kazumi; Yanagawa, Hiroshi; Yasuda, Jun; Zavolan, Mihaela; Hume, David A; Arakawa, Takahiro; Fukuda, Shiro; Imamura, Kengo; Kai, Chikatoshi; Kaiho, Ai; Kawashima, Tsugumi; Kawazu, Chika; Kitazume, Yayoi; Kojima, Miki; Miura, Hisashi; Murakami, Kayoko; Murata, Mitsuyoshi; Ninomiya, Noriko; Nishiyori, Hiromi; Noma, Shohei; Ogawa, Chihiro; Sano, Takuma; Simon, Christophe; Tagami, Michihira; Takahashi, Yukari; Kawai, Jun; Hayashizaki, Yoshihide

    2009-05-01

    Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.

  4. Differential Expression of microRNAs in the Ovaries from Letrozole-Induced Rat Model of Polycystic Ovary Syndrome.

    PubMed

    Li, Dandan; Li, Chunjin; Xu, Ying; Xu, Duo; Li, Hongjiao; Gao, Liwei; Chen, Shuxiong; Fu, Lulu; Xu, Xin; Liu, Yongzheng; Zhang, Xueying; Zhang, Jingshun; Ming, Hao; Zheng, Lianwen

    2016-04-01

    Polycystic ovary syndrome (PCOS) is a complex and heterogeneous endocrine disorder. To understand the pathogenesis of PCOS, we established rat models of PCOS induced by letrozole and employed deep sequencing to screen the differential expression of microRNAs (miRNAs) in PCOS rats and control rats. We examined vaginal smears and detected ovarian pathological alterations and hormone level changes in PCOS rats. Deep sequencing showed that a total of 129 miRNAs were differentially expressed in the ovaries from the letrozole-induced rat model compared with the control, including 49 upregulated and 80 downregulated miRNAs. Furthermore, the differential expression of miR-201-5p, miR-34b-5p, miR-141-3p, and miR-200a-3p was confirmed by real-time polymerase chain reaction. Bioinformatic analysis revealed that these four miRNAs were predicted to target a large set of genes with different functions. Pathway analysis supported that the miRNAs regulate oocyte meiosis, mitogen-activated protein kinase (MAPK) signaling, phosphoinositide 3-kinase/Akt (PI3K-Akt) signaling, Rap1 signaling, and Notch signaling. These data indicate that miRNAs are differentially expressed in the rat PCOS model and that the differentially expressed miRNAs are involved in the etiology and pathophysiology of PCOS. Our findings will help identify miRNAs as novel diagnostic markers and therapeutic targets for PCOS.

  5. Culture-Independent Identification of Periodontitis-Associated Porphyromonas and Tannerella Populations by Targeted Molecular Analysis

    PubMed Central

    de Lillo, A.; Booth, V.; Kyriacou, L.; Weightman, A. J.; Wade, W. G.

    2004-01-01

    Periodontitis is the commonest bacterial disease of humans and is the major cause of adult tooth loss. About half of the oral microflora is unculturable; and 16S rRNA PCR, cloning, and sequencing techniques have demonstrated the high level of species richness of the oral microflora. In the present study, a PCR primer set specific for the genera Porphyromonas and Tannerella was designed and used to analyze the bacterial populations in subgingival plaque samples from inflamed shallow and deep sites in subjects with periodontitis and shallow sites in age- and sex-matched controls. A total of 308 clones were sequenced and found to belong to one of six Porphyromonas or Tannerella species or phylotypes, one of which, Porphyromonas P3, was novel. Tannerella forsythensis was found in significantly higher proportions in patients than in controls. Porphyromonas catoniae and Tannerella phylotype BU063 appeared to be associated with shallow sites. Targeted culture-independent molecular ecology studies have a valuable role to play in the identification of bacterial targets for further investigations of the pathogenesis of bacterial infections. PMID:15583276

  6. Deep Bleeder Acoustic Coagulation (DBAC)-part II: in vivo testing of a research prototype system.

    PubMed

    Sekins, K Michael; Barnes, Stephen R; Fan, Liexiang; Hopple, Jerry D; Hsu, Stephen J; Kook, John; Lee, Chi-Yin; Maleke, Caroline; Zeng, Xiaozheng Jenny; Moreau-Gobard, Romain; Ahiekpor-Dravi, Alexis; Funka-Lea, Gareth; Eaton, John; Wong, Keith; Keneman, Scott; Mitchell, Stuart B; Dunmire, Barbrina; Kucewicz, John C; Clubb, Fred J; Miller, Matthew W; Crum, Lawrence A

    2015-01-01

    Deep Bleeder Acoustic Coagulation (DBAC) is an ultrasound image-guided high-intensity focused ultrasound (HIFU) method proposed to automatically detect and localize (D&L) and treat deep, bleeding, combat wounds in the limbs of soldiers. A prototype DBAC system consisting of an applicator and control unit was developed for testing on animals. To enhance control, and thus safety, of the ultimate human DBAC autonomous product system, a thermal coagulation strategy that minimized cavitation, boiling, and non-linear behaviors was used. The in vivo DBAC applicator design had four therapy tiles (Tx) and two 3D (volume) imaging probes (Ix) and was configured to be compatible with a porcine limb bleeder model developed in this research. The DBAC applicator was evaluated under quantitative test conditions (e.g., bleeder depths, flow rates, treatment time limits, and dose exposure time limits) in an in vivo study (final exam) comprising 12 bleeder treatments in three swine. To quantify blood flow rates, the "bleeder" targets were intact arterial branches, i.e., the superficial femoral artery (SFA) and a deep femoral artery (DFA). D&L identified, characterized, and targeted bleeders. The therapy sequence selected Tx arrays and determined the acoustic power and Tx beam steering, focus, and scan patterns. The user interface commands consisted of two buttons: "Start D&L" and "Start Therapy." Targeting accuracy was assessed by necropsy and histologic exams and efficacy (vessel coagulative occlusion) by angiography and histology. The D&L process (Part I article, J Ther Ultrasound, 2015 (this issue)) executed fully in all cases in under 5 min and targeting evaluation showed 11 of 12 thermal lesions centered on the correct vessel subsection, with minimal damage to adjacent structures. The automated therapy sequence also executed properly, with select manual steps. 
Because the dose exposure time limit (t_dose ≤ 30 s) was associated with nonefficacious treatment, 60-s dosing and dual-dosing were also pursued. Thrombogenic evidence (blood clotting) and collagen denaturation (vessel shrinkage) were found in necropsy and histologically in all targeted SFAs. Acute SFA reductions in blood flow (20-30%) were achieved in one subject, and one partial and one complete vessel occlusion were confirmed angiographically. The complete occlusion case was achieved with a dual dose (90 s total exposure) with focal intensity ≈500 W/cm² (spatial average, temporal average). While not meeting all in vivo objectives, the overall performance of the DBAC applicator was positive. In particular, D&L automation workflow was verified during each of the tests, with processing times well under specified (10 min) limits, and all bleeder branches were detected and localized. Further, gross necropsy and tissue examination confirmed that the HIFU thermal lesions were coincident with the target vessel locations in over 90% of the multi-array dosing treatments. The SFA/DFA bleeder models selected, and the protocols used, were the most suitable practical model options for the given DBAC anatomical and bleeder requirements. The animal models were imperfect in some challenging aspects, including requiring tissue-mimicking material (TMM) standoffs to achieve deep target depths, thereby introducing device-tissue motion, with resultant imaging artifacts. The model "bleeders" involved intact vessels, which are subject to less efficient heating and coagulation cascade behaviors than true puncture injuries.

  7. Low-Latency Telerobotic Sample Return and Biomolecular Sequencing for Deep Space Gateway

    NASA Astrophysics Data System (ADS)

    Lupisella, M.; Bleacher, J.; Lewis, R.; Dworkin, J.; Wright, M.; Burton, A.; Rubins, K.; Wallace, S.; Stahl, S.; John, K.; Archer, D.; Niles, P.; Regberg, A.; Smith, D.; Race, M.; Chiu, C.; Russell, J.; Rampe, E.; Bywaters, K.

    2018-02-01

    Low-latency telerobotics, crew-assisted sample return, and biomolecular sequencing can be used to acquire and analyze lunar farside and/or Apollo landing site samples. Sequencing can also be used to monitor and study Deep Space Gateway environment and crew health.

  8. AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

    PubMed

    Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

    2016-03-01

    Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies. © 2015 John Wiley & Sons Ltd.
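
    The three-step pipeline named in this abstract (read demultiplexing, unique sequence clustering, erroneous sequence filtering) can be illustrated with a toy sketch. This is not the AMPLISAS implementation: the function names, the prefix-barcode layout and the `min_fraction` cut-off are hypothetical, and real amplicon tools use far more careful clustering and error models.

```python
from collections import Counter


def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode.

    `barcodes` maps barcode string -> sample name (hypothetical layout).
    The barcode is stripped; reads with no known barcode are dropped.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def cluster_unique(seqs):
    """Step (ii): collapse identical sequences and count their depth."""
    return Counter(seqs)


def filter_low_frequency(counts, min_fraction=0.1):
    """Step (iii): keep variants at or above a per-sample frequency cut-off.

    The cut-off value is an assumption chosen for illustration only.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n for seq, n in counts.items() if n / total >= min_fraction}


if __name__ == "__main__":
    reads = ["AAGGTT"] * 8 + ["AAGGTA"] + ["CCGGTT"] * 5 + ["TTTTTT"]
    samples = demultiplex(reads, {"AA": "s1", "CC": "s2"})
    for name, seqs in samples.items():
        print(name, filter_low_frequency(cluster_unique(seqs), min_fraction=0.2))
```

    Keeping the three steps as separate functions mirrors the pipeline structure described above and makes the filtering threshold easy to vary per experiment.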

  9. Deep sequencing of small RNA repertoires in mice reveals metabolic disorders-associated hepatic miRNAs.

    PubMed

    Liang, Tingming; Liu, Chang; Ye, Zhenchao

    2013-01-01

    Obesity and associated metabolic disorders contribute importantly to the metabolic syndrome. On the other hand, microRNAs (miRNAs) are a class of small non-coding RNAs that repress target gene expression by inducing mRNA degradation and/or translation repression. Dysregulation of specific miRNAs in obesity may influence energy metabolism and cause insulin resistance, which leads to dyslipidemia, steatosis hepatis and type 2 diabetes. In the present study, we comprehensively analyzed and validated dysregulated miRNAs in ob/ob mouse liver, as well as miRNA groups based on miRNA gene cluster and gene family, by using deep sequencing miRNA datasets. We found that over 13.8% of the total analyzed miRNAs were dysregulated, of which 37 miRNA species showed significantly differential expression. Further RT-qPCR analysis of selected miRNAs validated the expression patterns observed in deep sequencing. Interestingly, we found that miRNA gene clusters and families always showed consistent dysregulation patterns in ob/ob mouse liver, although they had various enrichment levels. Functional enrichment analysis revealed the versatile physiological roles (over six signal pathways and five human diseases) of these miRNAs. Biological studies indicated that overexpression of miR-126 or inhibition of miR-24 in AML-12 cells attenuated free fatty acids-induced fat accumulation. Taken together, our data strongly suggest that obesity and metabolic disturbance are tightly associated with functional miRNAs. We also identified hepatic miRNA candidates serving as potential biomarkers for the diagnosis of the metabolic syndrome.

  10. Fungal communities from the calcareous deep-sea sediments in the Southwest India Ridge revealed by Illumina sequencing technology.

    PubMed

    Zhang, Likui; Kang, Manyu; Huang, Yangchao; Yang, Lixiang

    2016-05-01

    The diversity and ecological significance of bacteria and archaea in deep-sea environments have been thoroughly investigated, but eukaryotic microorganisms in these areas, such as fungi, are poorly understood. To elucidate fungal diversity in calcareous deep-sea sediments in the Southwest India Ridge (SWIR), the internal transcribed spacer (ITS) regions of rRNA genes from two sediment metagenomic DNA samples were amplified and sequenced using the Illumina sequencing platform. The results revealed that 58-63 % and 36-42 % of the ITS sequences (97 % similarity) belonged to Basidiomycota and Ascomycota, respectively. These findings suggest that Basidiomycota and Ascomycota are the predominant fungal phyla in the two samples. We also found that Agaricomycetes, Leotiomycetes, and Pezizomycetes were the major fungal classes in the two samples. At the species level, Thelephoraceae sp. and Phialocephala fortinii were major fungal species in the two samples. Despite the low relative abundance, unidentified fungal sequences were also observed in the two samples. Furthermore, we found that there were slight differences in fungal diversity between the two sediment samples, although both were collected from the SWIR. Thus, our results demonstrate that calcareous deep-sea sediments in the SWIR harbor diverse fungi, which augment the fungal groups in deep-sea sediments. This is the first report of fungal communities in calcareous deep-sea sediments in the SWIR revealed by Illumina sequencing.

  11. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

    PubMed Central

    Margulies, Elliott H.; Cooper, Gregory M.; Asimenos, George; Thomas, Daryl J.; Dewey, Colin N.; Siepel, Adam; Birney, Ewan; Keefe, Damian; Schwartz, Ariel S.; Hou, Minmei; Taylor, James; Nikolaev, Sergey; Montoya-Burgos, Juan I.; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Brown, James B.; Bickel, Peter; Holmes, Ian; Mullikin, James C.; Ureta-Vidal, Abel; Paten, Benedict; Stone, Eric A.; Rosenbloom, Kate R.; Kent, W. James; Bouffard, Gerard G.; Guan, Xiaobin; Hansen, Nancy F.; Idol, Jacquelyn R.; Maduro, Valerie V.B.; Maskeri, Baishali; McDowell, Jennifer C.; Park, Morgan; Thomas, Pamela J.; Young, Alice C.; Blakesley, Robert W.; Muzny, Donna M.; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Jiang, Huaiyang; Weinstock, George M.; Gibbs, Richard A.; Graves, Tina; Fulton, Robert; Mardis, Elaine R.; Wilson, Richard K.; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B.; Chang, Jean L.; Lindblad-Toh, Kerstin; Lander, Eric S.; Hinrichs, Angie; Trumbower, Heather; Clawson, Hiram; Zweig, Ann; Kuhn, Robert M.; Barber, Galt; Harte, Rachel; Karolchik, Donna; Field, Matthew A.; Moore, Richard A.; Matthewson, Carrie A.; Schein, Jacqueline E.; Marra, Marco A.; Antonarakis, Stylianos E.; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross; Haussler, David; Miller, Webb; Pachter, Lior; Green, Eric D.; Sidow, Arend

    2007-01-01

    A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization. PMID:17567995

  12. Position-specific automated processing of V3 env ultra-deep pyrosequencing data for predicting HIV-1 tropism

    PubMed Central

    Jeanne, Nicolas; Saliou, Adrien; Carcenac, Romain; Lefebvre, Caroline; Dubois, Martine; Cazabat, Michelle; Nicot, Florence; Loiseau, Claire; Raymond, Stéphanie; Izopet, Jacques; Delobel, Pierre

    2015-01-01

    HIV-1 coreceptor usage must be accurately determined before starting CCR5 antagonist-based treatment, as the presence of undetected minor CXCR4-using variants can cause subsequent virological failure. Ultra-deep pyrosequencing of HIV-1 V3 env allows detection of low levels of CXCR4-using variants that current genotypic approaches miss. However, processing the mass of sequence data and identifying true minor variants while excluding artifactual sequences generated during amplification and ultra-deep pyrosequencing are rate-limiting. Arbitrary fixed cut-offs below which minor variants are discarded are currently used, but the errors generated during ultra-deep pyrosequencing are sequence-dependent rather than random. We have developed an automated processing of HIV-1 V3 env ultra-deep pyrosequencing data that uses biological filters to discard artifactual or non-functional V3 sequences, followed by statistical filters to determine position-specific sensitivity thresholds rather than arbitrary fixed cut-offs. It retains authentic sequences with point mutations at V3 positions of interest and discards artifactual ones with accurate sensitivity thresholds. PMID:26585833
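
    The contrast between a fixed cut-off and position-specific thresholds can be made concrete with a small sketch. The abstract does not describe the statistical filter itself, so the binomial error model, the function names, the `alpha` level and the example error rates below are all assumptions for illustration, not the authors' method.

```python
def min_variant_count(depth, error_rate, alpha=0.001):
    """Smallest count k with P(X >= k) < alpha for X ~ Binomial(depth, error_rate).

    A variant seen at least k times at this position is unlikely (under the
    assumed binomial error model) to be explained by sequencing error alone.
    Requires 0 <= error_rate < 1.
    """
    pmf = (1.0 - error_rate) ** depth  # P(X = 0)
    cdf = 0.0                          # P(X < k) at the top of each iteration
    for k in range(depth + 1):
        if 1.0 - cdf < alpha:
            return k
        cdf += pmf
        # Recurrence: P(X = k + 1) from P(X = k), avoids huge binomial coefficients.
        pmf *= (depth - k) / (k + 1) * error_rate / (1.0 - error_rate)
    return depth + 1  # no count within the depth reaches significance


def position_thresholds(depth, error_rates, alpha=0.001):
    """Map each position to its own minimum variant count.

    `error_rates` maps position -> estimated per-read error rate
    (hypothetical values, e.g. from a clonal control run).
    """
    return {pos: min_variant_count(depth, rate, alpha)
            for pos, rate in error_rates.items()}


if __name__ == "__main__":
    # Error-prone positions get a stricter threshold than clean ones.
    print(position_thresholds(1000, {11: 0.001, 25: 0.005}))
```

    Under this model a position with a higher background error rate requires more supporting reads before a minor variant is accepted, which is the behavior a single fixed cut-off cannot provide.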

  13. Position-specific automated processing of V3 env ultra-deep pyrosequencing data for predicting HIV-1 tropism.

    PubMed

    Jeanne, Nicolas; Saliou, Adrien; Carcenac, Romain; Lefebvre, Caroline; Dubois, Martine; Cazabat, Michelle; Nicot, Florence; Loiseau, Claire; Raymond, Stéphanie; Izopet, Jacques; Delobel, Pierre

    2015-11-20

    HIV-1 coreceptor usage must be accurately determined before starting CCR5 antagonist-based treatment, as the presence of undetected minor CXCR4-using variants can cause subsequent virological failure. Ultra-deep pyrosequencing of HIV-1 V3 env allows detection of low levels of CXCR4-using variants that current genotypic approaches miss. However, processing the mass of sequence data and identifying true minor variants while excluding artifactual sequences generated during amplification and ultra-deep pyrosequencing are rate-limiting. Arbitrary fixed cut-offs below which minor variants are discarded are currently used, but the errors generated during ultra-deep pyrosequencing are sequence-dependent rather than random. We have developed an automated processing of HIV-1 V3 env ultra-deep pyrosequencing data that uses biological filters to discard artifactual or non-functional V3 sequences, followed by statistical filters to determine position-specific sensitivity thresholds rather than arbitrary fixed cut-offs. It retains authentic sequences with point mutations at V3 positions of interest and discards artifactual ones with accurate sensitivity thresholds.

  14. Identification and Functional Analysis of Flowering Related microRNAs in Common Wild Rice (Oryza rufipogon Griff.)

    PubMed Central

    Dong, Yibo; Yuan, Qianhua; Wang, Feng; Li, Weimin; Jiang, Ying; Jia, Shirong; Pei, XinWu

    2013-01-01

    Background MicroRNAs (miRNAs) are a class of non-coding RNAs involved in post-transcriptional control of gene expression, via degradation and/or translational inhibition. Six hundred sixty-one rice miRNAs that are important in plant development are known. However, flowering-related miRNAs have not been characterized in Oryza rufipogon Griff. The study was approved by the supervision department of Guangdong wild rice protection. We analyzed flowering-related miRNAs in O. rufipogon using high-throughput sequencing (deep sequencing) to understand the changes that occurred during rice domestication, and to elucidate their functions in flowering. Results Three O. rufipogon sRNA libraries, two vegetative stage (CWR-V1 and CWR-V2) and one flowering stage (CWR-F2), were sequenced using Illumina deep sequencing. A total of 20,156,098, 21,531,511 and 20,995,942 high quality sRNA reads were obtained from CWR-V1, CWR-V2 and CWR-F2, respectively, of which 3,448,185, 4,265,048 and 2,833,527 reads matched known miRNAs. We identified 512 known rice miRNAs in 214 miRNA families and predicted 290 new miRNAs. Targeted functional annotation, GO and KEGG pathway analyses predicted that 187 miRNAs regulate expression of flowering-related genes. Differential expression analysis of flowering-related miRNAs showed that expression of 95 miRNAs varied significantly between the libraries; 66 are flowering-related miRNAs, such as oru-miR97, oru-miR117, oru-miR135, oru-miR137, and others; 17 are early-flowering-related miRNAs, including osa-miR160f, osa-miR164d, osa-miR167d, osa-miR169a, osa-miR172b, oru-miR4, and others, induced during the floral transition. Real-time PCR revealed the same expression patterns as deep sequencing. miRNA targets were confirmed for cleavage by 5′-RACE in vivo, and were negatively regulated by miRNAs. Conclusions This is the first investigation of flowering miRNAs in wild rice. 
The result indicates that variation in miRNAs occurred during rice domestication and lays a foundation for further study of phase change and flowering in O. rufipogon. Complicated regulatory networks mediated by multiple miRNAs regulate the expression of flowering genes that control the induction of flowering. PMID:24386120

  15. Accurate identification of RNA editing sites from primitive sequence with deep neural networks.

    PubMed

    Ouyang, Zhangyi; Liu, Feng; Zhao, Chenghui; Ren, Chao; An, Gaole; Mei, Chuan; Bo, Xiaochen; Shu, Wenjie

    2018-04-16

    RNA editing is a post-transcriptional RNA sequence alteration. Current methods have identified editing sites and facilitated research but require sufficient genomic annotations and prior-knowledge-based filtering steps, resulting in a cumbersome, time-consuming identification process. Moreover, these methods have limited generalizability and applicability in species with insufficient genomic annotations or in conditions of limited prior knowledge. We developed DeepRed, a deep learning-based method that identifies RNA editing from primitive RNA sequences without prior-knowledge-based filtering steps or genomic annotations. DeepRed achieved 98.1% and 97.9% area under the curve (AUC) in training and test sets, respectively. We further validated DeepRed using experimentally verified U87 cell RNA-seq data, achieving 97.9% positive predictive value (PPV). We demonstrated that DeepRed offers better prediction accuracy and computational efficiency than current methods with large-scale, mass RNA-seq data. We used DeepRed to assess the impact of multiple factors on editing identification with RNA-seq data from the Association of Biomolecular Resource Facilities and Sequencing Quality Control projects. We explored developmental RNA editing pattern changes during human early embryogenesis and evolutionary patterns in Drosophila species and the primate lineage using DeepRed. Our work illustrates DeepRed's state-of-the-art performance; it may decipher the hidden principles behind RNA editing, making editing detection convenient and effective.

  16. Sequence capture by hybridization to explore modern and ancient genomic diversity in model and nonmodel organisms

    PubMed Central

    Gasc, Cyrielle; Peyretaillade, Eric

    2016-01-01

    The recent expansion of next-generation sequencing has significantly improved biological research. Nevertheless, deep exploration of genomes or metagenomic samples remains difficult because of the sequencing depth and the associated costs required. Therefore, different partitioning strategies have been developed to sequence informative subsets of studied genomes. Among these strategies, hybridization capture has proven to be an innovative and efficient tool for targeting and enriching specific biomarkers in complex DNA mixtures. It has been successfully applied in numerous areas of biology, such as exome resequencing for the identification of mutations underlying Mendelian or complex diseases and cancers, and its usefulness has been demonstrated in the agronomic field through the linking of genetic variants to agricultural phenotypic traits of interest. Moreover, hybridization capture has provided access to underexplored, but relevant fractions of genomes through its ability to enrich defined targets and their flanking regions. Finally, on the basis of restricted genomic information, this method has also allowed the expansion of knowledge of nonreference species and ancient genomes and provided a better understanding of metagenomic samples. In this review, we present the major advances and discoveries permitted by hybridization capture and highlight the potency of this approach in all areas of biology. PMID:27105841

  17. Sequence capture by hybridization to explore modern and ancient genomic diversity in model and nonmodel organisms.

    PubMed

    Gasc, Cyrielle; Peyretaillade, Eric; Peyret, Pierre

    2016-06-02

    The recent expansion of next-generation sequencing has significantly improved biological research. Nevertheless, deep exploration of genomes or metagenomic samples remains difficult because of the sequencing depth and the associated costs required. Therefore, different partitioning strategies have been developed to sequence informative subsets of studied genomes. Among these strategies, hybridization capture has proven to be an innovative and efficient tool for targeting and enriching specific biomarkers in complex DNA mixtures. It has been successfully applied in numerous areas of biology, such as exome resequencing for the identification of mutations underlying Mendelian or complex diseases and cancers, and its usefulness has been demonstrated in the agronomic field through the linking of genetic variants to agricultural phenotypic traits of interest. Moreover, hybridization capture has provided access to underexplored, but relevant fractions of genomes through its ability to enrich defined targets and their flanking regions. Finally, on the basis of restricted genomic information, this method has also allowed the expansion of knowledge of nonreference species and ancient genomes and provided a better understanding of metagenomic samples. In this review, we present the major advances and discoveries permitted by hybridization capture and highlight the potency of this approach in all areas of biology. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Uncovering leaf rust responsive miRNAs in wheat (Triticum aestivum L.) using high-throughput sequencing and prediction of their targets through degradome analysis.

    PubMed

    Kumar, Dhananjay; Dutta, Summi; Singh, Dharmendra; Prabhu, Kumble Vinod; Kumar, Manish; Mukhopadhyay, Kunal

    2017-01-01

    Deep sequencing identified 497 conserved and 559 novel miRNAs in wheat, while degradome analysis revealed 701 target genes. qRT-PCR demonstrated differential expression of miRNAs during stages of leaf rust progression. Bread wheat (Triticum aestivum L.) is an important cereal food crop feeding 30 % of the world population. Rust epidemics are a major threat to wheat production. This study was targeted towards identification and functional characterization of micro(mi)RNAs and their target genes in wheat in response to leaf rust ingression. High-throughput sequencing was used for transcriptome-wide identification of miRNAs and their expression profiling in response to leaf rust using mock and pathogen-inoculated resistant and susceptible near-isogenic wheat plants. A total of 1056 mature miRNAs were identified, of which 497 miRNAs were conserved and 559 miRNAs were novel. The pathogen-inoculated resistant plants manifested more miRNAs compared with the pathogen-infected susceptible plants. The miRNA counts increased in the susceptible isoline due to leaf rust; conversely, the counts decreased in the resistant isoline in response to pathogenesis, illustrating precise spatial tuning of miRNAs during compatible and incompatible interaction. Stem-loop quantitative real-time PCR was used to profile 10 highly differentially expressed miRNAs obtained from high-throughput sequencing data. The spatio-temporal profiling validated the differential expression of miRNAs between the isolines as well as in response to pathogen infection. Degradome analysis provided 701 predicted target genes associated with defense response, signal transduction, development, metabolism, and transcriptional regulation. The obtained results indicate that wheat isolines employ diverse arrays of miRNAs that modulate their target genes during compatible and incompatible interaction. 
Our findings contribute to increasing knowledge of the roles of microRNAs in wheat-leaf rust interactions and could help in rust resistance breeding programs.

  19. A deep learning method for lincRNA detection using auto-encoder algorithm.

    PubMed

    Yu, Ning; Yu, Zeng; Pan, Yi

    2017-12-06

    The RNA sequencing technique (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by the new deep learning methods. By scanning the newly found data set from RNA-seq, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, the relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two pieces of evidence provide the rationale for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in biological processes as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with a support vector machine and a traditional neural network. In addition, it is validated through the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has broad capability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for the lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data. 
Driven by those newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application of deep learning techniques to identifying lincRNA transcription sequences.
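
    The abstract mentions "different encoding schemes" for DNA input without naming them. As an assumed example of one common scheme, the sketch below one-hot encodes a sequence into the flat numeric vector a neural network such as an auto-encoder could consume; the function name and the all-zero treatment of ambiguous bases are illustrative choices, not details from the paper.

```python
def one_hot(seq, alphabet="ACGT"):
    """Encode a DNA string as a flat list with len(alphabet) values per base.

    Each base becomes a block with a single 1.0 at its alphabet index.
    Bases outside the alphabet (e.g. N) become an all-zero block.
    """
    vector = []
    for base in seq.upper():
        vector.extend(1.0 if base == letter else 0.0 for letter in alphabet)
    return vector


if __name__ == "__main__":
    print(one_hot("ACn"))
```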

  20. Deep drilling into the Chesapeake Bay impact structure

    USGS Publications Warehouse

    Gohn, G.S.; Koeberl, C.; Miller, K.G.; Reimold, W.U.; Browning, J.V.; Cockell, C.S.; Horton, J. Wright; Kenkmann, T.; Kulpecz, A.A.; Powars, D.S.; Sanford, W.E.; Voytek, M.A.

    2008-01-01

    Samples from a 1.76-kilometer-deep corehole drilled near the center of the late Eocene Chesapeake Bay impact structure (Virginia, USA) reveal its geologic, hydrologic, and biologic history. We conducted stratigraphic and petrologic analyses of the cores to elucidate the timing and results of impact-melt creation and distribution, transient-cavity collapse, and ocean-water resurge. Comparison of post-impact sedimentary sequences inside and outside the structure indicates that compaction of the crater fill influenced long-term sedimentation patterns in the mid-Atlantic region. Salty connate water of the target remains in the crater fill today, where it poses a potential threat to the regional groundwater resource. Observed depth variations in microbial abundance indicate a complex history of impact-related thermal sterilization and habitat modification, and subsequent post-impact repopulation.

  1. Deep drilling into the Chesapeake Bay impact structure.

    PubMed

    Gohn, G S; Koeberl, C; Miller, K G; Reimold, W U; Browning, J V; Cockell, C S; Horton, J W; Kenkmann, T; Kulpecz, A A; Powars, D S; Sanford, W E; Voytek, M A

    2008-06-27

    Samples from a 1.76-kilometer-deep corehole drilled near the center of the late Eocene Chesapeake Bay impact structure (Virginia, USA) reveal its geologic, hydrologic, and biologic history. We conducted stratigraphic and petrologic analyses of the cores to elucidate the timing and results of impact-melt creation and distribution, transient-cavity collapse, and ocean-water resurge. Comparison of post-impact sedimentary sequences inside and outside the structure indicates that compaction of the crater fill influenced long-term sedimentation patterns in the mid-Atlantic region. Salty connate water of the target remains in the crater fill today, where it poses a potential threat to the regional groundwater resource. Observed depth variations in microbial abundance indicate a complex history of impact-related thermal sterilization and habitat modification, and subsequent post-impact repopulation.

  2. A new method for enhancer prediction based on deep belief network.

    PubMed

    Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong

    2017-10-16

    Studies have shown that enhancers are significant regulatory elements that play crucial roles in gene expression regulation. Since enhancers act independently of their orientation and distance relative to their target genes, accurately predicting distal enhancers is challenging. In recent years, with the development of high-throughput ChIP-seq technologies, several computational techniques have emerged to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, comprising DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.
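
    GC content, which this abstract reports as a relevant feature, is simple to compute from sequence alone. The sketch below is a generic illustration, not code from EnhancerDBN; the function name and the decision to ignore non-ACGT characters are assumptions.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in `seq`.

    Ambiguous characters such as N are ignored; an input with no
    unambiguous bases yields 0.0.
    """
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))
```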

  3. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    USDA-ARS's Scientific Manuscript database

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RN...

  4. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are a class of functional non-coding small RNAs 19-25 nucleotides in length. Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species; it is used as an excellent breeding parent for grapevine and has elicited growing interest in wine production. To date, a relatively large number of grapevine miRNAs (vv-miRNAs) have been reported from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr., a wild grapevine species. Results A small RNA library from Amur grape was constructed, and Solexa technology was used to perform deep sequencing of the library, followed by bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines was confirmed by real-time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent with those from deep-sequenced sRNA libraries of the combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNPs in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted, and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. 
Conclusions Deep sequencing of short RNAs from Amur grape flowers and berries identified 72 new potential miRNAs and 34 known but non-conserved miRNAs, indicating that specific miRNAs exist in Amur grape. These results show that a number of regulatory miRNAs exist in Amur grape and play an important role in Amur grape growth, development, and response to abiotic or biotic stress. PMID:22455456

  5. Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes.

    PubMed

    Burkholder, William F; Newell, Evan W; Poidinger, Michael; Chen, Swaine; Fink, Katja

    2017-01-01

    The inaugural workshop "Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes" was held in Singapore on 13-14 October 2016. The aim of the workshop was to discuss the latest trends in using high-throughput sequencing, bioinformatics, and allied technologies to analyze immune and pathogen repertoires and their interplay within the host, bringing together key international players in the field and Singapore-based researchers and clinician-scientists. The focus was in particular on the application of these technologies for the improvement of patient diagnosis, prognosis and treatment, and for other broad public health outcomes. The presentations by scientists and clinicians showed the potential of deep sequencing technology to capture the coevolution of adaptive immunity and pathogens. For clinical applications, some key challenges remain, such as the long turnaround time and relatively high cost of deep sequencing for pathogen identification and characterization and the lack of international standardization in immune repertoire analysis.

  6. Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes

    PubMed Central

    Burkholder, William F.; Newell, Evan W.; Poidinger, Michael; Chen, Swaine; Fink, Katja

    2017-01-01

    The inaugural workshop “Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes” was held in Singapore on 13–14 October 2016. The aim of the workshop was to discuss the latest trends in using high-throughput sequencing, bioinformatics, and allied technologies to analyze immune and pathogen repertoires and their interplay within the host, bringing together key international players in the field and Singapore-based researchers and clinician-scientists. The focus was in particular on the application of these technologies for the improvement of patient diagnosis, prognosis and treatment, and for other broad public health outcomes. The presentations by scientists and clinicians showed the potential of deep sequencing technology to capture the coevolution of adaptive immunity and pathogens. For clinical applications, some key challenges remain, such as the long turnaround time and relatively high cost of deep sequencing for pathogen identification and characterization and the lack of international standardization in immune repertoire analysis. PMID:28620372

  7. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    PubMed

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  8. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing daily. Stevia rebaudiana Bertoni is an important perennial herb that accumulates high concentrations of diterpene steviol glycosides, which contribute to its intense sweetness with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in the biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of the availability of genome sequence data. Results To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. We also identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has been described previously in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes that regulate essential plant metabolic and signaling pathways. Conclusions This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species.
Our results provided information on stevia miRNAs and their targets building a foundation for future studies to understand their roles in key stevia traits. PMID:23116282
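
    The read statistics above (30,472,534 reads collapsing to 2,509,190 distinct sequences, roughly 12 reads per distinct sequence) reflect a common preprocessing step in which identical small RNA reads are merged and counted. A minimal Python sketch of that step follows; the function name `collapse_reads` and the toy reads are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter


def collapse_reads(reads):
    """Merge identical reads and count how often each distinct sequence occurs.

    Hypothetical helper for illustration only; real pipelines also trim
    adapters and filter by length and quality before this step.
    """
    cleaned = (read.strip().upper() for read in reads)
    return Counter(seq for seq in cleaned if seq)


if __name__ == "__main__":
    # Toy reads (made up): two copies of one sequence, one of another, one empty.
    toy_reads = [
        "UGACAGAAGAGAGUGAGCAC",
        "ugacagaagagagugagcac",
        "UCGGACCAGGCUUCAUUCCCC",
        "",
    ]
    counts = collapse_reads(toy_reads)
    total = sum(counts.values())
    print(f"{total} reads, {len(counts)} distinct sequences")
```

    The per-sequence counts are what later steps would compare against known miRNAs; the ratio of total reads to distinct sequences gives a quick sense of library redundancy.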

  9. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni.

    PubMed

    Mandhan, Vibha; Kaur, Jagdeep; Singh, Kashmir

    2012-11-01

    MicroRNAs (miRNAs) constitute a family of small RNAs (sRNAs) that regulate gene expression and play an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana and Oryza sativa, and the volume of the miRNA database, miRBase, has been increasing daily. Stevia rebaudiana Bertoni is an important perennial herb that accumulates high concentrations of diterpene steviol glycosides, which contribute to its intense sweetness with no calorific value. Several studies have been carried out to understand the molecular mechanisms involved in the biosynthesis of these glycosides; however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of the availability of genome sequence data. To identify miRNAs in S. rebaudiana, an sRNA library was constructed and sequenced using the Illumina Genome Analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from the sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. We also identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. None of the novel sequences has been described previously in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNAs encoding enzymes that regulate essential plant metabolic and signaling pathways. This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs, indicating that specific miRNAs exist in stevia species.
Our results provided information on stevia miRNAs and their targets building a foundation for future studies to understand their roles in key stevia traits.

  10. Deep sequencing of the Trypanosoma cruzi GP63 surface proteases reveals diversity and diversifying selection among chronic and congenital Chagas disease patients.

    PubMed

    Llewellyn, Martin S; Messenger, Louisa A; Luquetti, Alejandro O; Garcia, Lineth; Torrico, Faustino; Tavares, Suelene B N; Cheaib, Bachar; Derome, Nicolas; Delepine, Marc; Baulard, Céline; Deleuze, Jean-Francois; Sauer, Sascha; Miles, Michael A

    2015-04-01

    Chagas disease results from infection with the diploid protozoan parasite Trypanosoma cruzi. T. cruzi is highly genetically diverse, and multiclonal infections in individual hosts are common, but little studied. In this study, we explore T. cruzi infection multiclonality in the context of age, sex and clinical profile among a cohort of chronic patients, as well as paired congenital cases from Cochabamba, Bolivia and Goias, Brazil using amplicon deep sequencing technology. A 450bp fragment of the trypomastigote TcGP63I surface protease gene was amplified and sequenced across 70 chronic and 22 congenital cases on the Illumina MiSeq platform. In addition, a second, mitochondrial target--ND5--was sequenced across the same cohort of cases. Several million reads were generated, and sequencing read depths were normalized within patient cohorts (Goias chronic, n = 43; Goias congenital, n = 2; Bolivia chronic, n = 27; Bolivia congenital, n = 20). Among chronic cases, analyses of variance indicated no clear correlation between intra-host sequence diversity and age, sex or symptoms, while principal coordinate analyses showed no clustering by symptoms between patients. Between congenital pairs, we found evidence for the transmission of multiple sequence types from mother to infant, as well as widespread instances of novel genotypes in infants. Finally, non-synonymous to synonymous (dn:ds) nucleotide substitution ratios among sequences of TcGP63Ia and TcGP63Ib subfamilies within each cohort provided powerful evidence of strong diversifying selection at this locus. Our results shed light on the diversity of parasite DTUs within each patient, as well as the extent to which parasite strains pass between mother and foetus in congenital cases.
Although we were unable to find any evidence that parasite diversity accumulates with age in our study cohorts, putative diversifying selection within members of the TcGP63I gene family suggests a link between genetic diversity within this gene family and survival in the mammalian host.

  11. DeepMirTar: a deep-learning approach for predicting human miRNA targets.

    PubMed

    Wen, Ming; Cong, Peisheng; Zhang, Zhimin; Lu, Hongmei; Li, Tonghua

    2018-06-01

    MicroRNAs (miRNAs) are small noncoding RNAs that function in RNA silencing and post-transcriptional regulation of gene expression by targeting messenger RNAs (mRNAs). Because the underlying mechanisms associated with miRNA binding to mRNA are not fully understood, a major challenge of miRNA studies involves the identification of miRNA-target sites on mRNA. In silico prediction of miRNA-target sites can expedite costly and time-consuming experimental work by providing the most promising miRNA-target-site candidates. In this study, we reported the design and implementation of DeepMirTar, a deep-learning-based approach for accurately predicting human miRNA targets at the site level. The predicted miRNA-target sites are those having canonical or non-canonical seed, and features, including high-level expert-designed, low-level expert-designed, and raw-data-level, were used to represent the miRNA-target site. Comparison with other state-of-the-art machine-learning methods and existing miRNA-target-prediction tools indicated that DeepMirTar improved overall predictive performance. DeepMirTar is freely available at https://github.com/Bjoux2/DeepMirTar_SdA. Contact: lith@tongji.edu.cn, hongmeilu@csu.edu.cn. Supplementary data are available at Bioinformatics online.

  12. Cultivating the Deep Subsurface Microbiome

    NASA Astrophysics Data System (ADS)

    Casar, C. P.; Osburn, M. R.; Flynn, T. M.; Masterson, A.; Kruger, B.

    2017-12-01

    Subterranean ecosystems are poorly understood because many microbes detected in metagenomic surveys are only distantly related to characterized isolates. Cultivating microorganisms from the deep subsurface is challenging due to its inaccessibility and potential for contamination. The Deep Mine Microbial Observatory (DeMMO) in Lead, SD however, offers access to deep microbial life via pristine fracture fluids in bedrock to a depth of 1478 m. The metabolic landscape of DeMMO was previously characterized via thermodynamic modeling coupled with genomic data, illustrating the potential for microbial inhabitants of DeMMO to utilize mineral substrates as energy sources. Here, we employ field and lab based cultivation approaches with pure minerals to link phylogeny to metabolism at DeMMO. Fracture fluids were directed through reactors filled with Fe3O4, Fe2O3, FeS2, MnO2, and FeCO3 at two sites (610 m and 1478 m) for 2 months prior to harvesting for subsequent analyses. We examined mineralogical, geochemical, and microbiological composition of the reactors via DNA sequencing, microscopy, lipid biomarker characterization, and bulk C and N isotope ratios to determine the influence of mineralogy on biofilm community development. Pre-characterized mineral chips were imaged via SEM to assay microbial growth; preliminary results suggest MnO2, Fe3O4, and Fe2O3 were most conducive to colonization. Solid materials from reactors were used as inoculum for batch cultivation experiments. Media designed to mimic fracture fluid chemistry was supplemented with mineral substrates targeting metal reducers. DNA sequences and microscopy of iron oxide-rich biofilms and fracture fluids suggest iron oxidation is a major energy source at redox transition zones where anaerobic fluids meet more oxidizing conditions. We utilized these biofilms and fluids as inoculum in gradient cultivation experiments targeting microaerophilic iron oxidizers. 
Cultivation of microbes endemic to DeMMO, a system locally dominated by unclassified and candidate phyla, has the potential to yield novel subsurface organisms with unique physiologies. We intend to further utilize subsurface isolates to probe the effects of geochemical perturbations on biosignatures in future studies, thus broadening our understanding of subterranean ecosystems.

  13. RNase L targets distinct sites in influenza A virus RNAs.

    PubMed

    Cooper, Daphne A; Banerjee, Shuvojit; Chakrabarti, Arindam; García-Sastre, Adolfo; Hesselberth, Jay R; Silverman, Robert H; Barton, David J

    2015-03-01

    Influenza A virus (IAV) infections are influenced by type 1 interferon-mediated antiviral defenses and by viral countermeasures to these defenses. When IAV NS1 protein is disabled, RNase L restricts virus replication; however, the RNAs targeted for cleavage by RNase L under these conditions have not been defined. In this study, we used deep-sequencing methods to identify RNase L cleavage sites within host and viral RNAs from IAV PR8ΔNS1-infected A549 cells. Short hairpin RNA knockdown of RNase L allowed us to distinguish between RNase L-dependent and RNase L-independent cleavage sites. RNase L-dependent cleavage sites were evident at discrete locations in IAV RNA segments (both positive and negative strands). Cleavage in PB2, PB1, and PA genomic RNAs suggests that viral RNPs are susceptible to cleavage by RNase L. Prominent amounts of cleavage mapped to specific regions within IAV RNAs, including some areas of increased synonymous-site conservation. Among cellular RNAs, RNase L-dependent cleavage was most frequent at precise locations in rRNAs. Our data show that RNase L targets specific sites in both host and viral RNAs to restrict influenza virus replication when NS1 protein is disabled. RNase L is a critical component of interferon-regulated and double-stranded-RNA-activated antiviral host responses. We sought to determine how RNase L exerts its antiviral activity during influenza virus infection. We enhanced the antiviral activity of RNase L by disabling a viral protein, NS1, that inhibits the activation of RNase L. Then, using deep-sequencing methods, we identified the host and viral RNAs targeted by RNase L. We found that RNase L cleaved viral RNAs and rRNAs at very precise locations. The direct cleavage of IAV RNAs by RNase L highlights an intimate battle between viral RNAs and an antiviral endonuclease. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  14. DNA Replication Profiling Using Deep Sequencing.

    PubMed

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  15. High-Throughput Sequencing Reveals Differential Expression of miRNAs in Intestine from Sea Cucumber during Aestivation

    PubMed Central

    Chen, Muyan; Zhang, Xiumei; Liu, Jianning; Storey, Kenneth B.

    2013-01-01

    The regulatory role of miRNA in gene expression is an emerging topic in the control of hypometabolism. Sea cucumber aestivation is a complicated physiological process that includes obvious hypometabolism as evidenced by a decrease in the rates of oxygen consumption and ammonia nitrogen excretion, as well as a serious degeneration of the intestine into a very tiny filament. To determine whether miRNAs play regulatory roles in this process, the present study analyzed profiles of miRNA expression in the intestine of the sea cucumber (Apostichopus japonicus), using Solexa deep sequencing technology. We identified 308 sea cucumber miRNAs, including 18 novel miRNAs specific to sea cucumber. Animals sampled during deep aestivation (DA), after at least 15 days of continuous torpor, were compared with animals from a non-aestivation (NA) state (animals that had passed through aestivation and returned to the active state). We identified 42 differentially expressed miRNAs [RPM (reads per million) >10, |FC| (|fold change|) ≥1, FDR (false discovery rate) <0.01] during aestivation, which were validated by two other miRNA profiling methods: miRNA microarray and real-time PCR. Among the most prominent miRNA species, miR-200-3p, miR-2004, miR-2010, miR-22, miR-252a, miR-252a-3p and miR-92 were significantly over-expressed during deep aestivation compared with non-aestivation animals. Preliminary analyses of their putative target genes and GO analysis suggest that these miRNAs could play important roles in global transcriptional depression and cell differentiation during aestivation. High-throughput sequencing data and microarray data have been submitted to GEO database. PMID:24143179
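
    The selection criteria quoted above (RPM >10, |FC| ≥1, FDR <0.01) can be read as a three-part filter applied to each miRNA. The Python sketch below illustrates one way such a filter might look. It rests on several assumptions that the abstract does not state: that FC is a log2 ratio, that the RPM threshold applies to the more highly expressed of the two conditions, that a pseudocount of 1 is added before taking the ratio, and that FDR values have already been computed elsewhere. All function names are hypothetical.

```python
import math


def rpm(count, total_reads):
    """Reads per million: scale a raw read count by library size."""
    return count * 1_000_000 / total_reads


def log2_fold_change(rpm_da, rpm_na, pseudocount=1.0):
    """log2 ratio of DA to NA expression (pseudocount is an assumption)."""
    return math.log2((rpm_da + pseudocount) / (rpm_na + pseudocount))


def is_differential(rpm_da, rpm_na, fdr,
                    min_rpm=10.0, min_abs_fc=1.0, max_fdr=0.01):
    """Apply the three thresholds; how they combine is assumed, not sourced."""
    expressed = max(rpm_da, rpm_na) > min_rpm
    fc = log2_fold_change(rpm_da, rpm_na)
    return expressed and abs(fc) >= min_abs_fc and fdr < max_fdr


if __name__ == "__main__":
    # Made-up values for a single miRNA, for illustration only.
    print(is_differential(rpm_da=40.0, rpm_na=9.0, fdr=0.001))
```

    Requiring all three conditions at once guards against calling a miRNA differential when it is barely expressed, changes only slightly, or shows a change that is not statistically supported.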

  16. Ancient genomics

    PubMed Central

    Der Sarkissian, Clio; Allentoft, Morten E.; Ávila-Arcos, María C.; Barnett, Ross; Campos, Paula F.; Cappellini, Enrico; Ermini, Luca; Fernández, Ruth; da Fonseca, Rute; Ginolhac, Aurélien; Hansen, Anders J.; Jónsson, Hákon; Korneliussen, Thorfinn; Margaryan, Ashot; Martin, Michael D.; Moreno-Mayar, J. Víctor; Raghavan, Maanasa; Rasmussen, Morten; Velasco, Marcela Sandoval; Schroeder, Hannes; Schubert, Mikkel; Seguin-Orlando, Andaine; Wales, Nathan; Gilbert, M. Thomas P.; Willerslev, Eske; Orlando, Ludovic

    2015-01-01

    The past decade has witnessed a revolution in ancient DNA (aDNA) research. Although the field's focus was previously limited to mitochondrial DNA and a few nuclear markers, whole genome sequences from the deep past can now be retrieved. This breakthrough is tightly connected to the massive sequence throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans, archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when testing specific hypotheses related to the past. PMID:25487338

  17. Transcriptome sequencing analysis of novel sRNAs of Kineococcus radiotolerans in response to ionizing radiation.

    PubMed

    Chen, Zhouwei; Li, Lufeng; Shan, Zhan; Huang, Hannian; Chen, Huan; Ding, Xianfeng; Guo, Jiangfeng; Liu, Lili

    2016-11-01

    Kineococcus radiotolerans is a Gram-positive, radio-resistant bacterium isolated from a radioactive environment. The small noncoding RNAs (sRNAs) in bacteria are reported to play roles in the immediate response to stress and/or the recovery from stress. The analysis of K. radiotolerans transcriptome sequencing results can identify these sRNAs in a genome-wide detection, using RNA sequencing (RNA-seq) by the deep sequencing technique. In this study, the raw data of radiation-exposed samples (RS) and control samples (CS) were acquired separately from the sequencing platform. There were 217 common sRNA candidates in the two samples screened in the genome-wide scale by bioinformatics analysis. There were 43 differentially expressed sRNA candidates, including 28 up-regulated and 15 down-regulated ones. The down-regulated sRNAs were selected for the sRNA target prediction, of which 12 sRNAs that may modulate the genes related to the transcription regulation and DNA repair were considered as the candidates involved in the radio-resistance regulation system. Copyright © 2016 Elsevier GmbH. All rights reserved.

  18. DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data.

    PubMed

    Arango-Argoty, Gustavo; Garner, Emily; Pruden, Amy; Heath, Lenwood S; Vikesland, Peter; Zhang, Liqing

    2018-02-01

    Growing concerns about increasing rates of antibiotic resistance call for expanded and comprehensive global monitoring. Advancing methods for monitoring of environmental media (e.g., wastewater, agricultural waste, food, and water) is especially needed for identifying potential resources of novel antibiotic resistance genes (ARGs), hot spots for gene exchange, and as pathways for the spread of ARGs and human exposure. Next-generation sequencing now enables direct access and profiling of the total metagenomic DNA pool, where ARGs are typically identified or predicted based on the "best hits" of sequence searches against existing databases. Unfortunately, this approach produces a high rate of false negatives. To address such limitations, we propose here a deep learning approach, taking into account a dissimilarity matrix created using all known categories of ARGs. Two deep learning models, DeepARG-SS and DeepARG-LS, were constructed for short read sequences and full gene length sequences, respectively. Evaluation of the deep learning models over 30 antibiotic resistance categories demonstrates that the DeepARG models can predict ARGs with both high precision (> 0.97) and recall (> 0.90). The models displayed an advantage over the typical best hit approach, yielding consistently lower false negative rates and thus higher overall recall (> 0.9). As more data become available for under-represented ARG categories, the DeepARG models' performance can be expected to be further enhanced due to the nature of the underlying neural networks. Our newly developed ARG database, DeepARG-DB, encompasses ARGs predicted with a high degree of confidence and extensive manual inspection, greatly expanding current ARG repositories. The deep learning models developed here offer more accurate antimicrobial resistance annotation relative to current bioinformatics practice. DeepARG does not require strict cutoffs, which enables identification of a much broader diversity of ARGs. 
The DeepARG models and database are available as a command line version and as a Web service at http://bench.cs.vt.edu/deeparg .
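
    The abstract above ties a lower false negative rate to higher recall. That relationship follows directly from the standard definitions, which the short Python sketch below spells out. The counts in the usage example are made up for illustration and are not results from the DeepARG evaluation; the function names are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def false_negative_rate(tp, fn):
    """Fraction of true ARGs that were missed; equals 1 - recall."""
    return fn / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Illustrative counts only.
    p, r = precision_recall(tp=90, fp=2, fn=10)
    print(f"precision={p:.3f} recall={r:.3f} FNR={false_negative_rate(90, 10):.3f}")
```

    Because recall and the false negative rate share a denominator and sum to 1, any method that misses fewer true ARGs necessarily scores higher on recall.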

  19. The “curved lead pathway” method to enable a single lead to reach any two intracranial targets

    NASA Astrophysics Data System (ADS)

    Ding, Chen-Yu; Yu, Liang-Hong; Lin, Yuan-Xiang; Chen, Fan; Lin, Zhang-Ya; Kang, De-Zhi

    2017-01-01

    Deep brain stimulation is an effective way to treat movement disorders, and a powerful research tool for exploring brain functions. This report proposes a “curved lead pathway” method for lead implantation, such that a single lead can reach in sequence to any two intracranial targets. A new type of stereotaxic system for implanting a curved lead to the brain of human/primates was designed, the auxiliary device needed for this method to be used in rat/mouse was fabricated and verified in rat, and the Excel algorithm used for automatically calculating the necessary parameters was implemented. This “curved lead pathway” method of lead implantation may complement the current method, make lead implantation for multiple targets more convenient, and expand the experimental techniques of brain function research.

  20. Endogenous System Microbes as Treatment Process ...

    EPA Pesticide Factsheets

    Monitoring the efficacy of treatment strategies to remove pathogens in decentralized systems remains a challenge. Evaluating log reduction targets by measuring pathogen levels is hampered by their sporadic and low occurrence rates. Fecal indicator bacteria are used in centralized systems to indicate the presence of fecal pathogens, but are ineffective decentralized treatment process indicators as they generally occur at levels too low to assess log reduction targets. System challenge testing by spiking with high loads of fecal indicator organisms, like MS2 coliphage, has limitations, especially for large systems. Microbes that are endogenous to the decentralized system, occur in high abundances and mimic removal rates of bacterial, viral and/or parasitic protozoan pathogens during treatment could serve as alternative treatment process indicators to verify log reduction targets. To identify abundant microbes in wastewater, the bacterial and viral communities were examined using deep sequencing. Building infrastructure-associated bacteria, like Zoogloea, were observed as dominant members of the bacterial community in graywater. In blackwater, bacteriophage of the order Caudovirales constituted the majority of contiguous sequences from the viral community. This study identifies candidate treatment process indicators in decentralized systems that could be used to verify log removal during treatment. The association of the presence of treatment process indic

  1. Repurposing CRISPR/Cas9 for in situ functional assays.

    PubMed

    Malina, Abba; Mills, John R; Cencic, Regina; Yan, Yifei; Fraser, James; Schippers, Laura M; Paquet, Marilène; Dostie, Josée; Pelletier, Jerry

    2013-12-01

    RNAi combined with next-generation sequencing has proven to be a powerful and cost-effective genetic screening platform in mammalian cells. Still, this technology has its limitations and is incompatible with in situ mutagenesis screens on a genome-wide scale. Using p53 as a proof-of-principle target, we readapted the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR associated 9) genome-editing system to demonstrate the feasibility of this methodology for targeted gene disruption positive selection assays. By using novel "all-in-one" lentiviral and retroviral delivery vectors heterologously expressing both a codon-optimized Cas9 and its synthetic guide RNA (sgRNA), we show robust selection for the CRISPR-modified Trp53 locus following drug treatment. Furthermore, by linking Cas9 expression to GFP fluorescence, we use an "all-in-one" system to track disrupted Trp53 in chemoresistant lymphomas in the Eμ-myc mouse model. Deep sequencing analysis of the tumor-derived endogenous Cas9-modified Trp53 locus revealed a wide spectrum of mutants that were enriched with seemingly limited off-target effects. Taken together, these results establish Cas9 genome editing as a powerful and practical approach for positive in situ genetic screens.

  2. Repurposing CRISPR/Cas9 for in situ functional assays

    PubMed Central

    Malina, Abba; Mills, John R.; Cencic, Regina; Yan, Yifei; Fraser, James; Schippers, Laura M.; Paquet, Marilène; Dostie, Josée; Pelletier, Jerry

    2013-01-01

    RNAi combined with next-generation sequencing has proven to be a powerful and cost-effective genetic screening platform in mammalian cells. Still, this technology has its limitations and is incompatible with in situ mutagenesis screens on a genome-wide scale. Using p53 as a proof-of-principle target, we readapted the CRISPR (clustered regularly interspaced short palindromic repeats)/Cas9 (CRISPR associated 9) genome-editing system to demonstrate the feasibility of this methodology for targeted gene disruption positive selection assays. By using novel “all-in-one” lentiviral and retroviral delivery vectors heterologously expressing both a codon-optimized Cas9 and its synthetic guide RNA (sgRNA), we show robust selection for the CRISPR-modified Trp53 locus following drug treatment. Furthermore, by linking Cas9 expression to GFP fluorescence, we use an “all-in-one” system to track disrupted Trp53 in chemoresistant lymphomas in the Eμ-myc mouse model. Deep sequencing analysis of the tumor-derived endogenous Cas9-modified Trp53 locus revealed a wide spectrum of mutants that were enriched with seemingly limited off-target effects. Taken together, these results establish Cas9 genome editing as a powerful and practical approach for positive in situ genetic screens. PMID:24298059

  3. Understanding the complex evolution of rapidly mutating viruses with deep sequencing: Beyond the analysis of viral diversity.

    PubMed

    Leung, Preston; Eltahla, Auda A; Lloyd, Andrew R; Bull, Rowena A; Luciani, Fabio

    2017-07-15

    With the advent of affordable deep sequencing technologies, detection of low frequency variants within genetically diverse viral populations can now be achieved with unprecedented depth and efficiency. The high-resolution data provided by next generation sequencing technologies is currently recognised as the gold standard in estimation of viral diversity. In the analysis of rapidly mutating viruses, longitudinal deep sequencing datasets from viral genomes during individual infection episodes, as well as at the epidemiological level during outbreaks, now allow for more sophisticated analyses such as statistical estimates of the impact of complex mutation patterns on the evolution of the viral populations both within and between hosts. These analyses are revealing more accurate descriptions of the evolutionary dynamics that underpin the rapid adaptation of these viruses to the host response, and to drug therapies. This review assesses recent developments in methods and provides informative research examples using deep sequencing data generated from rapidly mutating viruses infecting humans, particularly hepatitis C virus (HCV), human immunodeficiency virus (HIV), Ebola virus and influenza virus, to understand the evolution of viral genomes and to explore the relationship between viral mutations and the host adaptive immune response. Finally, we discuss limitations in current technologies, and future directions that take advantage of publicly available large deep sequencing datasets. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. DEEP MOTIF DASHBOARD: VISUALIZING AND UNDERSTANDING GENOMIC SEQUENCES USING DEEP NEURAL NETWORKS.

    PubMed

    Lanchantin, Jack; Singh, Ritambhara; Wang, Beilun; Qi, Yanjun

    2017-01-01

    Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and give insights as to why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard) which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method is finding a test sequence's saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that CNN-RNN makes predictions by modeling both motifs and the dependencies among them.
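
    The saliency-map idea in this abstract (first-order derivatives of the prediction with respect to each input nucleotide) can be illustrated with a toy model. The sketch below is not the DeMo Dashboard code: it assumes a hypothetical one-layer logistic model over a one-hot encoded sequence, with made-up weights, and scores each position by the absolute gradient at the observed nucleotide, which is one common convention rather than necessarily the authors' exact definition.

```python
import math

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def predict(x, w, b=0.0):
    """Toy model (assumption): sigmoid of a weighted sum over the one-hot input."""
    z = b + sum(wi * xi for w_row, x_row in zip(w, x) for wi, xi in zip(w_row, x_row))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(x, w, b=0.0):
    """Per-position saliency: |d(prediction)/d(input)| at the observed nucleotide.

    For this logistic model the derivative with respect to input x[i][j]
    is p * (1 - p) * w[i][j], so no automatic differentiation is needed.
    """
    p = predict(x, w, b)
    scale = p * (1.0 - p)
    return [
        abs(sum(scale * wi * xi for wi, xi in zip(w_row, x_row)))
        for w_row, x_row in zip(w, x)
    ]


# Hypothetical weights for a 4-nucleotide input (rows: positions, columns: A, C, G, T).
seq = "ACGT"
weights = [
    [0.5, -0.2, 0.0, 0.1],
    [0.0, 1.5, -0.3, 0.2],
    [-0.4, 0.0, 0.1, 0.0],
    [0.3, 0.2, 0.0, -2.0],
]
scores = saliency(one_hot(seq), weights)
top = max(range(len(seq)), key=scores.__getitem__)
print("most influential position:", top, seq[top])
```

    For a real convolutional or recurrent network the gradient would come from the framework's automatic differentiation rather than a closed-form expression, but the per-nucleotide reading of the result is the same.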

  5. Improved detection of CXCR4-using HIV by V3 genotyping: application of population-based and "deep" sequencing to plasma RNA and proviral DNA.

    PubMed

    Swenson, Luke C; Moores, Andrew; Low, Andrew J; Thielen, Alexander; Dong, Winnie; Woods, Conan; Jensen, Mark A; Wynhoven, Brian; Chan, Dennison; Glascock, Christopher; Harrigan, P Richard

    2010-08-01

    Tropism testing should rule out CXCR4-using HIV before treatment with CCR5 antagonists. Currently, the recombinant phenotypic Trofile assay (Monogram) is most widely utilized; however, genotypic tests may represent alternative methods. Independent triplicate amplifications of the HIV gp120 V3 region were made from either plasma HIV RNA or proviral DNA. These underwent standard, population-based sequencing with an ABI3730 (RNA n = 63; DNA n = 40), or "deep" sequencing with a Roche/454 Genome Sequencer-FLX (RNA n = 12; DNA n = 12). Position-specific scoring matrices (PSSMX4/R5) (-6.96 cutoff) and geno2pheno[coreceptor] (5% false-positive rate) inferred tropism from V3 sequence. These methods were then independently validated with a separate, blinded dataset (n = 278) of screening samples from the maraviroc MOTIVATE trials. Standard sequencing of HIV RNA with PSSM yielded 69% sensitivity and 91% specificity, relative to Trofile. The validation dataset gave 75% sensitivity and 83% specificity. Proviral DNA plus PSSM gave 77% sensitivity and 71% specificity. "Deep" sequencing of HIV RNA detected >2% inferred-CXCR4-using virus in 8/8 samples called non-R5 by Trofile, and <2% in 4/4 samples called R5. Triplicate analyses of V3 standard sequence data detect greater proportions of CXCR4-using samples than previously achieved. Sequencing proviral DNA and "deep" V3 sequencing may also be useful tools for assessing tropism.
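
    As a worked illustration of the sensitivity and specificity figures quoted above, the sketch below computes both from a 2x2 comparison against a reference assay. The function name is hypothetical; the counts are the deep-sequencing results stated in the abstract (8/8 Trofile non-R5 samples above the 2% threshold, 4/4 Trofile R5 samples below it), treating "non-R5 by Trofile" as the positive class.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) relative to a reference assay.

    tp/fn: reference-positive samples called positive/negative by the test.
    tn/fp: reference-negative samples called negative/positive by the test.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Deep sequencing of HIV RNA versus Trofile, counts taken from the abstract.
sens, spec = sensitivity_specificity(tp=8, fn=0, tn=4, fp=0)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

    With only 12 samples these proportions carry wide uncertainty, which is consistent with the abstract's cautious wording that deep sequencing "may also be" useful.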

  6. The RDE-10/RDE-11 complex triggers RNAi-induced mRNA degradation by association with target mRNA in C. elegans

    PubMed Central

    Yang, Huan; Zhang, Ying; Vallandingham, Jim; Li, Hua; Florens, Laurence; Mak, Ho Yi

    2012-01-01

    The molecular mechanisms for target mRNA degradation in Caenorhabditis elegans undergoing RNAi are not fully understood. Using a combination of genetic, proteomic, and biochemical approaches, we report a divergent RDE-10/RDE-11 complex that is required for RNAi in C. elegans. Genetic analysis indicates that the RDE-10/RDE-11 complex acts in parallel to nuclear RNAi. Association of the complex with target mRNA is dependent on RDE-1 but not RRF-1, suggesting that target mRNA recognition depends on primary but not secondary siRNA. Furthermore, RDE-11 is required for mRNA degradation subsequent to target engagement. Deep sequencing reveals a fivefold decrease in secondary siRNA abundance in rde-10 and rde-11 mutant animals, while primary siRNA and microRNA biogenesis is normal. Therefore, the RDE-10/RDE-11 complex is critical for amplifying the exogenous RNAi response. Our work uncovers an essential output of the RNAi pathway in C. elegans. PMID:22508728

  7. The RDE-10/RDE-11 complex triggers RNAi-induced mRNA degradation by association with target mRNA in C. elegans.

    PubMed

    Yang, Huan; Zhang, Ying; Vallandingham, Jim; Li, Hua; Florens, Laurence; Mak, Ho Yi

    2012-04-15

    The molecular mechanisms for target mRNA degradation in Caenorhabditis elegans undergoing RNAi are not fully understood. Using a combination of genetic, proteomic, and biochemical approaches, we report a divergent RDE-10/RDE-11 complex that is required for RNAi in C. elegans. Genetic analysis indicates that the RDE-10/RDE-11 complex acts in parallel to nuclear RNAi. Association of the complex with target mRNA is dependent on RDE-1 but not RRF-1, suggesting that target mRNA recognition depends on primary but not secondary siRNA. Furthermore, RDE-11 is required for mRNA degradation subsequent to target engagement. Deep sequencing reveals a fivefold decrease in secondary siRNA abundance in rde-10 and rde-11 mutant animals, while primary siRNA and microRNA biogenesis is normal. Therefore, the RDE-10/RDE-11 complex is critical for amplifying the exogenous RNAi response. Our work uncovers an essential output of the RNAi pathway in C. elegans.

  8. Apple miRNAs and tasiRNAs with novel regulatory networks

    PubMed Central

    2012-01-01

    Background: MicroRNAs (miRNAs) and their regulatory functions have been extensively characterized in model species but whether apple has evolved similar or unique regulatory features remains unknown. Results: We performed deep small RNA-seq and identified 23 conserved, 10 less-conserved and 42 apple-specific miRNAs or families with distinct expression patterns. The identified miRNAs target 118 genes representing a wide range of enzymatic and regulatory activities. Apple also conserves two TAS gene families with similar but unique trans-acting small interfering RNA (tasiRNA) biogenesis profiles and target specificities. Importantly, we found that miR159, miR828 and miR858 can collectively target up to 81 MYB genes potentially involved in diverse aspects of plant growth and development. These miRNA target sites are differentially conserved among MYBs, which is largely influenced by the location and conservation of the encoded amino acid residues in MYB factors. Finally, we found that 10 of the 19 miR828-targeted MYBs undergo small interfering RNA (siRNA) biogenesis at the 3' cleaved, highly divergent transcript regions, generating over 100 sequence-distinct siRNAs that potentially target over 70 diverse genes as confirmed by degradome analysis. Conclusions: Our work identified and characterized apple miRNAs, their expression patterns, targets and regulatory functions. We also discovered that three miRNAs and the ensuing siRNAs exploit both conserved and divergent sequence features of MYB genes to initiate distinct regulatory networks targeting a multitude of genes inside and outside the MYB family. PMID:22704043

  9. Whole-exome sequencing reveals the spectrum of gene mutations and the clonal evolution patterns in paediatric acute myeloid leukaemia.

    PubMed

    Shiba, Norio; Yoshida, Kenichi; Shiraishi, Yuichi; Okuno, Yusuke; Yamato, Genki; Hara, Yusuke; Nagata, Yasunobu; Chiba, Kenichi; Tanaka, Hiroko; Terui, Kiminori; Kato, Motohiro; Park, Myoung-Ja; Ohki, Kentaro; Shimada, Akira; Takita, Junko; Tomizawa, Daisuke; Kudo, Kazuko; Arakawa, Hirokazu; Adachi, Souichi; Taga, Takashi; Tawa, Akio; Ito, Etsuro; Horibe, Keizo; Sanada, Masashi; Miyano, Satoru; Ogawa, Seishi; Hayashi, Yasuhide

    2016-11-01

    Acute myeloid leukaemia (AML) is a molecularly and clinically heterogeneous disease. Targeted sequencing efforts have identified several mutations with diagnostic and prognostic values in KIT, NPM1, CEBPA and FLT3 in both adult and paediatric AML. In addition, massively parallel sequencing enabled the discovery of recurrent mutations (i.e. IDH1/2 and DNMT3A) in adult AML. In this study, whole-exome sequencing (WES) of 22 paediatric AML patients revealed mutations in components of the cohesin complex (RAD21 and SMC3), BCORL1 and ASXL2 in addition to previously known gene mutations. We also revealed intratumoural heterogeneities in many patients, implicating multiple clonal evolution events in the development of AML. Furthermore, targeted deep sequencing in 182 paediatric AML patients identified three major categories of recurrently mutated genes: cohesin complex genes [STAG2, RAD21 and SMC3 in 17 patients (8·3%)], epigenetic regulators [ASXL1/ASXL2 in 17 patients (8·3%), BCOR/BCORL1 in 7 patients (3·4%)] and signalling molecules. We also performed WES in four patients with relapsed AML. Relapsed AML evolved from one of the subclones at the initial phase and was accompanied by many additional mutations, including common driver mutations that were absent or existed only with lower allele frequency in the diagnostic samples, indicating a multistep process causing leukaemia recurrence. © 2016 John Wiley & Sons Ltd.

  10. Insilico profiling of microRNAs in Korean ginseng (Panax ginseng Meyer)

    PubMed Central

    Mathiyalagan, Ramya; Subramaniyam, Sathiyamoorthy; Natarajan, Sathishkumar; Kim, Yeon Ju; Sun, Myung Suk; Kim, Se Young; Kim, Yu-Jin; Yang, Deok Chun

    2013-01-01

    MicroRNAs (miRNAs) are a class of recently discovered non-coding small RNA molecules, on average approximately 21 nucleotides in length, which underlie numerous important biological roles in gene regulation in various organisms. The miRNA database (release 18) has 18,226 miRNAs, which have been deposited from different species. Although miRNAs have been identified and validated in many plant species, no studies have been reported on discovering miRNAs in Panax ginseng Meyer, a plant traditionally used in oriental medicine and also known as Korean ginseng. It has triterpene ginseng saponins called ginsenosides, which are responsible for its various pharmacological activities. Predicting conserved miRNAs by homology-based analysis with available expressed sequence tag (EST) sequences can be powerful if the species lacks whole genome sequence information. In this study, using the EST-based computational approach, 69 conserved miRNAs belonging to 44 miRNA families were identified in Korean ginseng. The digital gene expression patterns of predicted conserved miRNAs were analyzed by deep sequencing using small RNA sequences of flower buds, leaves, and lateral roots. We found that many of the identified miRNAs showed tissue-specific expression. Using the in silico method, 346 potential targets were identified for the predicted 69 conserved miRNAs by searching the ginseng EST database, and the predicted targets were mainly involved in secondary metabolic processes, responses to biotic and abiotic stress, and transcription regulator activities, as well as a variety of other metabolic processes. PMID:23717176

  11. GeneImp: Fast Imputation to Large Reference Panels Using Genotype Likelihoods from Ultralow Coverage Sequencing

    PubMed Central

    Spiliopoulou, Athina; Colombo, Marco; Orchard, Peter; Agakov, Felix; McKeigue, Paul

    2017-01-01

    We address the task of genotype imputation to a dense reference panel given genotype likelihoods computed from ultralow coverage sequencing as inputs. In this setting, the data have a high level of missingness or uncertainty, and are thus more amenable to a probabilistic representation. Most existing imputation algorithms are not well suited for this situation, as they rely on prephasing for computational efficiency, and, without definite genotype calls, the prephasing task becomes computationally expensive. We describe GeneImp, a program for genotype imputation that does not require prephasing and is computationally tractable for whole-genome imputation. GeneImp does not explicitly model recombination; instead, it capitalizes on the existence of large reference panels (comprising thousands of reference haplotypes) and assumes that the reference haplotypes can adequately represent the target haplotypes over short regions unaltered. We validate GeneImp based on data from ultralow coverage sequencing (0.5×), and compare its performance to the most recent version of BEAGLE that can perform this task. We show that GeneImp achieves imputation quality very close to that of BEAGLE, using one to two orders of magnitude less time, without an increase in memory complexity. Therefore, GeneImp is the first practical choice for whole-genome imputation to a dense reference panel when prephasing cannot be applied, for instance, in datasets produced via ultralow coverage sequencing. A related future application for GeneImp is whole-genome imputation based on the off-target reads from deep whole-exome sequencing. PMID:28348060

  12. Deep sequencing and proteomic analysis of the microRNA-induced silencing complex in human red blood cells.

    PubMed

    Azzouzi, Imane; Moest, Hansjoerg; Wollscheid, Bernd; Schmugge, Markus; Eekels, Julia J M; Speer, Oliver

    2015-05-01

    During maturation, erythropoietic cells extrude their nuclei but retain their ability to respond to oxidant stress by tightly regulating protein translation. Several studies have reported microRNA-mediated regulation of translation during terminal stages of erythropoiesis, even after enucleation. In the present study, we performed a detailed examination of the endogenous microRNA machinery in human red blood cells using a combination of deep sequencing analysis of microRNAs and proteomic analysis of the microRNA-induced silencing complex. Among the 197 different microRNAs detected, miR-451a was the most abundant, representing more than 60% of all read sequences. In addition, miR-451a and its known target, 14-3-3ζ mRNA, were bound to the microRNA-induced silencing complex, implying their direct interaction in red blood cells. The proteomic characterization of endogenous Argonaute 2-associated microRNA-induced silencing complex revealed 26 cofactor candidates. Among these cofactors, we identified several RNA-binding proteins, as well as motor proteins and vesicular trafficking proteins. Our results demonstrate that red blood cells contain complex microRNA machinery, which might enable immature red blood cells to control protein translation independent of de novo nuclei information. Copyright © 2015 ISEH - International Society for Experimental Hematology. Published by Elsevier Inc. All rights reserved.

  13. Origin of olivine at Copernicus

    NASA Technical Reports Server (NTRS)

    Pieters, C. M.; Wilhelms, D. E.

    1985-01-01

    The central peaks of Copernicus are among the few lunar areas where near-infrared telescopic reflectance spectra indicate extensive exposures of olivine. Other parts of Copernicus crater and ejecta, which were derived from highland units in the upper parts of the target site, contain only low-Ca pyroxene as a mafic mineral. The exposure of compositionally distinct layers including the presence of extensive olivine may result from penetration to an anomalously deep layer of the crust or to the lunar mantle. It is suggested that the Procellarum basin and the younger, superposed Insularum basin have provided access to these normally deep-seated crustal or mantle materials by thinning the upper crustal material early in lunar history. The occurrences of olivine in portions of the compositionally heterogeneous Aristarchus Region, in a related geologic setting, may be due to the same sequence of early events.

  14. Enhanced sensitivity for detection of low-level germline mosaic RB1 mutations in sporadic retinoblastoma cases using deep semiconductor sequencing.

    PubMed

    Chen, Zhao; Moran, Kimberly; Richards-Yutz, Jennifer; Toorens, Erik; Gerhart, Daniel; Ganguly, Tapan; Shields, Carol L; Ganguly, Arupa

    2014-03-01

    Sporadic retinoblastoma (RB) is caused by de novo mutations in the RB1 gene. Often, these mutations are present as mosaic mutations that cannot be detected by Sanger sequencing. Next-generation deep sequencing allows unambiguous detection of the mosaic mutations in lymphocyte DNA. Deep sequencing of the RB1 gene on lymphocyte DNA from 20 bilateral and 70 unilateral RB cases was performed, where Sanger sequencing excluded the presence of mutations. The individual exons of the RB1 gene from each sample were amplified, pooled, ligated to barcoded adapters, and sequenced using semiconductor sequencing on an Ion Torrent Personal Genome Machine. Six low-level mosaic mutations were identified in bilateral RB and four in unilateral RB cases. The incidence of low-level mosaic mutation was estimated to be 30% and 6%, respectively, in sporadic bilateral and unilateral RB cases, previously classified as mutation negative. The frequency of point mutations detectable in lymphocyte DNA increased from 96% to 97% for bilateral RB and from 13% to 18% for unilateral RB. The use of deep sequencing technology increased the sensitivity of the detection of low-level germline mosaic mutations in the RB1 gene. This finding has significant implications for improved clinical diagnosis, genetic counseling, surveillance, and management of RB. © 2013 WILEY PERIODICALS, INC.
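
    The incidence estimates in this abstract follow directly from the reported counts: six mosaic mutations among 20 bilateral cases and four among 70 unilateral cases. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def incidence_percent(mosaic_cases, total_cases):
    """Percentage of previously mutation-negative cases with a low-level mosaic mutation."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return 100.0 * mosaic_cases / total_cases


# Counts as reported in the abstract.
bilateral = incidence_percent(6, 20)
unilateral = incidence_percent(4, 70)
print(f"bilateral: {bilateral:.1f}%, unilateral: {unilateral:.1f}%")
```

    The unilateral figure is 4/70, or about 5.7%, which the abstract rounds to 6%.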

  15. Deep Sequencing-guided Design of a High Affinity Dual Specificity Antibody to Target Two Angiogenic Factors in Neovascular Age-related Macular Degeneration

    PubMed Central

    Koenig, Patrick; Lee, Chingwei V.; Sanowar, Sarah; Wu, Ping; Stinson, Jeremy; Harris, Seth F.; Fuh, Germaine

    2015-01-01

    The development of dual targeting antibodies promises therapies with improved efficacy over mono-specific antibodies. Here, we engineered a Two-in-One VEGF/angiopoietin 2 antibody with dual action Fab (DAF) as a potential therapeutic for neovascular age-related macular degeneration. Crystal structures of the VEGF/angiopoietin 2 DAF in complex with its two antigens showed highly overlapping binding sites. To achieve sufficient affinity of the DAF to block both angiogenic factors, we turned to deep mutational scanning in the complementarity determining regions (CDRs). By mutating all three CDRs of each antibody chain simultaneously, we were able to identify not only affinity-improving single mutations but also mutation pairs from different CDRs that synergistically improve both binding functions. Furthermore, insights into the cooperativity between mutations allowed us to identify fold-stabilizing mutations in the CDRs. The data obtained from deep mutational scanning reveal that the majority of the 52 CDR residues are utilized differently for the two antigen-binding functions and permit, for the first time, the engineering of several DAF variants with sub-nanomolar affinity against two structurally unrelated antigens. The improved variants show similar blocking activity of receptor binding as the high affinity mono-specific antibodies against these two proteins, demonstrating the feasibility of generating a dual specificity binding surface with comparable properties to individual high affinity mono-specific antibodies. PMID:26088137

  16. Identifying vaccine targets for anti-leishmanial vaccine development

    PubMed Central

    Sundar, Shyam; Singh, Bhawana

    2014-01-01

    Leishmaniasis is a neglected tropical disease spread by an arthropod vector. It remains a significant health problem with an incidence of 0.2–0.4 million VL and 0.7–1.2 million CL cases each year. The current therapeutic regimens for leishmaniasis have limitations, and after recovery from infection the host becomes immune to subsequent infection; together, these factors support the feasibility of a vaccine for leishmaniasis. Publication of the genome sequence of Leishmania has paved a new way to understand the pathogenesis and host immunological status, thereby providing deep insight in the field of vaccine research. This review is an effort to study the antigenic targets in Leishmania to develop an anti-leishmanial vaccine. PMID:24606556

  17. PIK3CA-associated developmental disorders exhibit distinct classes of mutations with variable expression and tissue distribution

    PubMed Central

    Timms, Andrew E.; Conti, Valerio; Girisha, Katta M.; Martin, Beth; Olds, Carissa; Collins, Sarah; Park, Kaylee; Carter, Melissa; Krägeloh-Mann, Inge; Chitayat, David; Parikh, Aditi Shah; Bradshaw, Rachael; Torti, Erin; Braddock, Stephen; Burke, Leah; Ghedia, Sondhya; Stephan, Mark; Stewart, Fiona; Prasad, Chitra; Napier, Melanie; Saitta, Sulagna; Straussberg, Rachel; Gabbett, Michael; O’Connor, Bridget C.; Yin, Lim Jiin; Lai, Angeline Hwei Meeng; Martin, Nicole; McKinnon, Margaret; Addor, Marie-Claude; Schwartz, Charles E.; Lanoel, Agustina; Conway, Robert L.; Devriendt, Koenraad; Tatton-Brown, Katrina; Pierpont, Mary Ella; Painter, Michael; Worgan, Lisa; Reggin, James; Hennekam, Raoul; Pritchard, Colin C.; Aracena, Mariana; Gripp, Karen W.; Cordisco, Maria; Van Esch, Hilde; Garavelli, Livia; Curry, Cynthia; Goriely, Anne; Kayserilli, Hulya; Shendure, Jay; Graham, John; Guerrini, Renzo; Dobyns, William B.

    2016-01-01

    Mosaicism is increasingly recognized as a cause of developmental disorders with the advent of next-generation sequencing (NGS). Mosaic mutations of PIK3CA have been associated with the widest spectrum of phenotypes associated with overgrowth and vascular malformations. We performed targeted NGS using 2 independent deep-coverage methods that utilize molecular inversion probes and amplicon sequencing in a cohort of 241 samples from 181 individuals with brain and/or body overgrowth. We identified PIK3CA mutations in 60 individuals. Several other individuals (n = 12) were identified separately to have mutations in PIK3CA by clinical targeted-panel testing (n = 6), whole-exome sequencing (n = 5), or Sanger sequencing (n = 1). Based on the clinical and molecular features, this cohort segregated into three distinct groups: (a) severe focal overgrowth due to low-level but highly activating (hotspot) mutations, (b) predominantly brain overgrowth and less severe somatic overgrowth due to less-activating mutations, and (c) intermediate phenotypes (capillary malformations with overgrowth) with intermediately activating mutations. Sixteen of 29 PIK3CA mutations were novel. We also identified constitutional PIK3CA mutations in 10 patients. Our molecular data, combined with review of the literature, show that PIK3CA-related overgrowth disorders comprise a discontinuous spectrum of disorders that correlate with the severity and distribution of mutations. PMID:27631024

  18. Genomic perspectives of spider silk genes through target capture sequencing: Conservation of stabilization mechanisms and homology-based structural models of spidroin terminal regions.

    PubMed

    Collin, Matthew A; Clarke, Thomas H; Ayoub, Nadia A; Hayashi, Cheryl Y

    2018-07-01

    A powerful system for studying protein aggregation, particularly rapid self-assembly, is spider silk. Spider silks are proteinaceous and silk proteins are synthesized and stored within silk glands as liquid dope. As needed, liquid dope is near-instantaneously transformed into solid fibers or viscous adhesives. The dominant constituents of silks are spidroins (spider fibroins) and their terminal domains are vital for the tight control of silk self-assembly. To better understand spidroin termini, we used target capture and deep sequencing to identify spidroin gene sequences from six species representing the araneoid families of Araneidae, Nephilidae, and Theridiidae. We obtained 145 terminal regions, of which 103 are newly annotated here, as well as novel variants within nine diverse spidroin types. Our comparative analyses demonstrated the conservation of acidic, basic, and cysteine amino acid residues across spidroin types that had been proposed to be important for monomer stability, dimer formation, and self-assembly from a limited sampling of spidroins. Computational, protein homology modeling revealed areas of spidroin terminal regions that are highly conserved in three-dimensions despite sequence divergence across spidroin types. Analyses of our dense sampling of terminal regions suggest that most spidroins share stabilization mechanisms, dimer formation, and tertiary structure, despite producing functionally distinct materials. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence

    PubMed Central

    Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik

    2016-01-01

    RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41 nucleotide long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203,202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195

  20. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

    PubMed Central

    Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.

    2016-01-01

    Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149

  1. Draft Genome Sequence of Deep-Sea Alteromonas sp. Strain V450 Isolated from the Marine Sponge Leiodermatium sp.

    PubMed Central

    Barrett, Nolan H.; McCarthy, Peter J.

    2017-01-01

    The proteobacterium Alteromonas sp. strain V450 was isolated from the Atlantic deep-sea sponge Leiodermatium sp. Here, we report the draft genome sequence of this strain, with a genome size of approx. 4.39 Mb and a G+C content of 44.01%. The results will aid deep-sea microbial ecology, evolution, and sponge-microbe association studies. PMID:28153886

  2. The salt-responsive transcriptome of chickpea roots and nodules via deepSuperSAGE

    PubMed Central

    2011-01-01

    Background: The combination of high-throughput transcript profiling and next-generation sequencing technologies is a prerequisite for genome-wide comprehensive transcriptome analysis. Our recent innovation of deepSuperSAGE is based on an advanced SuperSAGE protocol and its combination with massively parallel pyrosequencing on Roche's 454 sequencing platform. As a demonstration of the power of this combination, we have chosen the salt stress transcriptomes of roots and nodules of the third most important legume crop chickpea (Cicer arietinum L.). While our report is more technology-oriented, it nevertheless addresses a major world-wide problem for crops generally: high salinity. Together with low temperatures and water stress, high salinity is responsible for crop losses of millions of tons of various legume (and other) crops. Continuously deteriorating environmental conditions will combine with salinity stress to further compromise crop yields. As a good example for such stress-exposed crop plants, we started to characterize salt stress responses of chickpeas on the transcriptome level. Results: We used deepSuperSAGE to detect early global transcriptome changes in salt-stressed chickpea. The salt stress responses of 86,919 transcripts representing 17,918 unique 26 bp deepSuperSAGE tags (UniTags) from roots of the salt-tolerant variety INRAT-93 two hours after treatment with 25 mM NaCl were characterized. Additionally, the expression of 57,281 transcripts representing 13,115 UniTags was monitored in nodules of the same plants. From a total of 144,200 analyzed 26 bp tags in roots and nodules together, 21,401 unique transcripts were identified. Of these, only 363 and 106 specific transcripts, respectively, were commonly up- or down-regulated (>3.0-fold) under salt stress in both organs, indicating a differential organ-specific response to stress.
    Building on recent pioneering work on massive cDNA sequencing in chickpea, more than 9,400 UniTags could be linked to UniProt entries. Additionally, gene ontology (GO) category over-representation analysis made it possible to filter out enriched biological processes among the differentially expressed UniTags. Subsequently, the gathered information was further cross-checked with stress-related pathways. From several filtered pathways, here we focus exemplarily on transcripts associated with the generation and scavenging of reactive oxygen species (ROS), as well as on transcripts involved in Na+ homeostasis. Although both processes are already very well characterized in other plants, the information generated in the present work is of high value. Information on expression profiles and sequence similarity for several hundreds of transcripts of potential interest is now available. Conclusions: This report demonstrates that the combination of the high-throughput transcriptome profiling technology SuperSAGE with one of the next-generation sequencing platforms allows deep insights into the first molecular reactions of a plant exposed to salinity. Cross validation with recent reports enriched the information about the salt stress dynamics of more than 9,000 chickpea ESTs, and enlarged their pool of alternative transcript isoforms. As an example for the high resolution of the employed technology that we term deepSuperSAGE, we demonstrate that ROS-scavenging and -generating pathways undergo strong global transcriptome changes in chickpea roots and nodules as early as 2 hours after onset of moderate salt stress (25 mM NaCl). Additionally, a set of more than 15 candidate transcripts is proposed to be potential components of the salt overly sensitive (SOS) pathway in chickpea. Newly identified transcript isoforms are potential targets for breeding novel cultivars with high salinity tolerance. We demonstrate that these targets can be integrated into breeding schemes by micro-arrays and RT-PCR assays downstream of the generation of 26 bp tags by SuperSAGE. PMID:21320317

  3. The salt-responsive transcriptome of chickpea roots and nodules via deepSuperSAGE.

    PubMed

    Molina, Carlos; Zaman-Allah, Mainassara; Khan, Faheema; Fatnassi, Nadia; Horres, Ralf; Rotter, Björn; Steinhauer, Diana; Amenc, Laurie; Drevon, Jean-Jacques; Winter, Peter; Kahl, Günter

    2011-02-14

    The combination of high-throughput transcript profiling and next-generation sequencing technologies is a prerequisite for genome-wide comprehensive transcriptome analysis. Our recent innovation of deepSuperSAGE is based on an advanced SuperSAGE protocol and its combination with massively parallel pyrosequencing on Roche's 454 sequencing platform. As a demonstration of the power of this combination, we have chosen the salt stress transcriptomes of roots and nodules of the third most important legume crop chickpea (Cicer arietinum L.). While our report is more technology-oriented, it nevertheless addresses a major world-wide problem for crops generally: high salinity. Together with low temperatures and water stress, high salinity is responsible for crop losses of millions of tons of various legume (and other) crops. Continuously deteriorating environmental conditions will combine with salinity stress to further compromise crop yields. As a good example for such stress-exposed crop plants, we started to characterize salt stress responses of chickpeas on the transcriptome level. We used deepSuperSAGE to detect early global transcriptome changes in salt-stressed chickpea. The salt stress responses of 86,919 transcripts representing 17,918 unique 26 bp deepSuperSAGE tags (UniTags) from roots of the salt-tolerant variety INRAT-93 two hours after treatment with 25 mM NaCl were characterized. Additionally, the expression of 57,281 transcripts representing 13,115 UniTags was monitored in nodules of the same plants. From a total of 144,200 analyzed 26 bp tags in roots and nodules together, 21,401 unique transcripts were identified. 
Of these, only 363 and 106 specific transcripts, respectively, were commonly up- or down-regulated (>3.0-fold) under salt stress in both organs, witnessing a differential organ-specific response to stress. Profiting from recent pioneer works on massive cDNA sequencing in chickpea, more than 9,400 UniTags could be linked to UniProt entries. Additionally, gene ontology (GO) category over-representation analysis made it possible to filter out enriched biological processes among the differentially expressed UniTags. Subsequently, the gathered information was further cross-checked with stress-related pathways. From several filtered pathways, we focus here, as examples, on transcripts associated with the generation and scavenging of reactive oxygen species (ROS), as well as on transcripts involved in Na+ homeostasis. Although both processes are already very well characterized in other plants, the information generated in the present work is of high value. Information on expression profiles and sequence similarity for several hundred transcripts of potential interest is now available. This report demonstrates that the combination of the high-throughput transcriptome profiling technology SuperSAGE with one of the next-generation sequencing platforms allows deep insights into the first molecular reactions of a plant exposed to salinity. Cross validation with recent reports enriched the information about the salt stress dynamics of more than 9,000 chickpea ESTs, and enlarged their pool of alternative transcript isoforms. As an example of the high resolution of the employed technology that we coin deepSuperSAGE, we demonstrate that ROS-scavenging and -generating pathways undergo strong global transcriptome changes in chickpea roots and nodules already 2 hours after onset of moderate salt stress (25 mM NaCl). Additionally, a set of more than 15 candidate transcripts is proposed as potential components of the salt overly sensitive (SOS) pathway in chickpea. 
Newly identified transcript isoforms are potential targets for breeding novel cultivars with high salinity tolerance. We demonstrate that these targets can be integrated into breeding schemes by micro-arrays and RT-PCR assays downstream of the generation of 26 bp tags by SuperSAGE.

  4. Deep Sequencing Analysis of miRNA Expression in Breast Muscle of Fast-Growing and Slow-Growing Broilers

    PubMed Central

    Ouyang, Hongjia; He, Xiaomei; Li, Guihuan; Xu, Haiping; Jia, Xinzheng; Nie, Qinghua; Zhang, Xiquan

    2015-01-01

    Growth performance is an important economic trait in chicken. MicroRNAs (miRNAs) have been shown to play important roles in various biological processes, but their functions in chicken growth are not yet clear. To investigate the function of miRNAs in chicken growth, breast muscle tissues of the two-tail samples (highest and lowest body weight) from Recessive White Rock (WRR) and Xinghua Chickens (XH) were subjected to high-throughput small RNA deep sequencing. In this study, a total of 921 miRNAs were identified, including 733 known mature miRNAs and 188 novel miRNAs. There were 200, 279, 257 and 297 differentially expressed miRNAs in the comparisons of the WRRh vs. WRRl, WRRh vs. XHh, WRRl vs. XHl, and XHh vs. XHl groups, respectively. A total of 22 highly differentially expressed miRNAs (fold change > 2 or < 0.5; p-value < 0.05; q-value < 0.01) that were also abundantly expressed (read counts > 1000) were found in our comparisons. In the two within-breed analyses (WRRh vs. WRRl, and XHh vs. XHl), we found 80 common differentially expressed miRNAs, while 110 common miRNAs were found in WRRh vs. XHh and WRRl vs. XHl. Furthermore, 26 common miRNAs were identified among all four comparisons. Four differentially expressed miRNAs (miR-223, miR-16, miR-205a and miR-222b-5p) were validated by quantitative real-time RT-PCR (qRT-PCR). Regulatory networks of interactions among miRNAs and their targets were constructed using integrative miRNA target-prediction and network-analysis. Growth hormone receptor (GHR) was confirmed as a target of miR-146b-3p by dual-luciferase assay and qPCR, indicating that miR-34c, miR-223, miR-146b-3p, miR-21 and miR-205a are key growth-related target genes in the network. These miRNAs are proposed as candidate miRNAs for future studies concerning miRNA-target function on regulation of chicken growth. PMID:26193261
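
    The selection thresholds quoted in this abstract (fold change > 2 or < 0.5, p-value < 0.05, q-value < 0.01, read counts > 1000) can be read as a simple row filter. The following is a minimal sketch under that reading; the `MirnaResult` record, the function name and the example rows are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class MirnaResult:
    name: str           # miRNA identifier
    fold_change: float  # expression ratio between the two groups compared
    p_value: float
    q_value: float
    read_count: int     # read count used as the abundance measure


def is_highly_differential(r, min_reads=1000):
    """Apply the thresholds quoted in the abstract to one result row."""
    changed = r.fold_change > 2 or r.fold_change < 0.5
    significant = r.p_value < 0.05 and r.q_value < 0.01
    abundant = r.read_count > min_reads
    return changed and significant and abundant


# Hypothetical rows for illustration only; not data from the study.
rows = [
    MirnaResult("miR-A", 3.1, 0.001, 0.004, 5200),
    MirnaResult("miR-B", 1.4, 0.001, 0.004, 8000),  # fold change too small
    MirnaResult("miR-C", 0.3, 0.020, 0.008, 1500),
    MirnaResult("miR-D", 4.0, 0.001, 0.004, 300),   # too few reads
]
selected = [r.name for r in rows if is_highly_differential(r)]
print(selected)
```

    The abstract does not say how read counts were normalized or how q-values were computed, so those inputs are taken as given here.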

  5. Deep Sequencing Analysis of miRNA Expression in Breast Muscle of Fast-Growing and Slow-Growing Broilers.

    PubMed

    Ouyang, Hongjia; He, Xiaomei; Li, Guihuan; Xu, Haiping; Jia, Xinzheng; Nie, Qinghua; Zhang, Xiquan

    2015-07-17

    Growth performance is an important economic trait in chicken. MicroRNAs (miRNAs) have been shown to play important roles in various biological processes, but their functions in chicken growth are not yet clear. To investigate the function of miRNAs in chicken growth, breast muscle tissues of the two-tail samples (highest and lowest body weight) from Recessive White Rock (WRR) and Xinghua Chickens (XH) were subjected to high-throughput small RNA deep sequencing. In this study, a total of 921 miRNAs were identified, including 733 known mature miRNAs and 188 novel miRNAs. There were 200, 279, 257 and 297 differentially expressed miRNAs in the comparisons of the WRRh vs. WRRl, WRRh vs. XHh, WRRl vs. XHl, and XHh vs. XHl groups, respectively. A total of 22 highly differentially expressed miRNAs (fold change > 2 or < 0.5; p-value < 0.05; q-value < 0.01) that were also abundantly expressed (read counts > 1000) were found in our comparisons. In the two within-breed analyses (WRRh vs. WRRl, and XHh vs. XHl), we found 80 common differentially expressed miRNAs, while 110 common miRNAs were found in WRRh vs. XHh and WRRl vs. XHl. Furthermore, 26 common miRNAs were identified among all four comparisons. Four differentially expressed miRNAs (miR-223, miR-16, miR-205a and miR-222b-5p) were validated by quantitative real-time RT-PCR (qRT-PCR). Regulatory networks of interactions among miRNAs and their targets were constructed using integrative miRNA target-prediction and network-analysis. Growth hormone receptor (GHR) was confirmed as a target of miR-146b-3p by dual-luciferase assay and qPCR, indicating that miR-34c, miR-223, miR-146b-3p, miR-21 and miR-205a are key growth-related target genes in the network. These miRNAs are proposed as candidate miRNAs for future studies concerning miRNA-target function on regulation of chicken growth.

  6. Within-Host Variations of Human Papillomavirus Reveal APOBEC Signature Mutagenesis in the Viral Genome.

    PubMed

    Hirose, Yusuke; Onuki, Mamiko; Tenjimbayashi, Yuri; Mori, Seiichiro; Ishii, Yoshiyuki; Takeuchi, Takamasa; Tasaka, Nobutaka; Satoh, Toyomi; Morisada, Tohru; Iwata, Takashi; Miyamoto, Shingo; Matsumoto, Koji; Sekizawa, Akihiko; Kukimoto, Iwao

    2018-06-15

    Persistent infection with oncogenic human papillomaviruses (HPVs) causes cervical cancer, accompanied by the accumulation of somatic mutations into the host genome. There are concomitant genetic changes in the HPV genome during viral infection; however, their relevance to cervical carcinogenesis is poorly understood. Here, we explored within-host genetic diversity of HPV by performing deep-sequencing analyses of viral whole-genome sequences in clinical specimens. The whole genomes of HPV types 16, 52, and 58 were amplified by type-specific PCR from total cellular DNA of cervical exfoliated cells collected from patients with cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) and were deep sequenced. After constructing a reference viral genome sequence for each specimen, nucleotide positions showing changes with >0.5% frequencies compared to the reference sequence were determined for individual samples. In total, 1,052 positions of nucleotide variations were detected in HPV genomes from 151 samples (CIN1, n = 56; CIN2/3, n = 68; ICC, n = 27), with various numbers per sample. Overall, C-to-T and C-to-A substitutions were the dominant changes observed across all histological grades. While C-to-T transitions were predominantly detected in CIN1, their prevalence was decreased in CIN2/3 and fell below that of C-to-A transversions in ICC. Analysis of the trinucleotide context encompassing substituted bases revealed that TpCpN, a preferred target sequence for cellular APOBEC cytosine deaminases, was a primary site for C-to-T substitutions in the HPV genome. These results strongly imply that the APOBEC proteins are drivers of HPV genome mutation, particularly in CIN1 lesions. IMPORTANCE HPVs exhibit surprisingly high levels of genetic diversity, including a large repertoire of minor genomic variants in each viral genotype. 
Here, by conducting deep-sequencing analyses, we show for the first time a comprehensive snapshot of the within-host genetic diversity of high-risk HPVs during cervical carcinogenesis. Quasispecies harboring minor nucleotide variations in viral whole-genome sequences were extensively observed across different grades of CIN and cervical cancer. Among the within-host variations, C-to-T transitions, a characteristic change mediated by cellular APOBEC cytosine deaminases, were predominantly detected throughout the whole viral genome, most strikingly in low-grade CIN lesions. The results strongly suggest that within-host variations of the HPV genome are primarily generated through the interaction with host cell DNA-editing enzymes and that such within-host variability is an evolutionary source of the genetic diversity of HPVs. Copyright © 2018 American Society for Microbiology.
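
    The trinucleotide-context analysis described in this abstract can be illustrated with a short sketch: for each substitution, take the base before and after the changed position, report the change on the strand where the mutated base is a pyrimidine, and flag C-to-T changes in a TpCpN context. This is only an assumed outline, not the authors' pipeline; the function names and the toy reference sequence are hypothetical, and the genome is treated as linear for simplicity.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def trinucleotide_context(reference, pos, alt):
    """Return (context, substitution) on the pyrimidine strand, or None.

    `pos` is a 0-based index into `reference`. Variants at the first or
    last base, and entries where alt equals the reference, are skipped.
    """
    if pos < 1 or pos > len(reference) - 2:
        return None
    ref = reference[pos]
    if ref == alt:
        return None
    context = reference[pos - 1:pos + 2]
    if ref in "AG":
        # Report purine changes on the opposite strand (e.g. G>A becomes C>T).
        context = revcomp(context)
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return context, f"{ref}>{alt}"


def is_apobec_like(context, substitution):
    """True for a C-to-T change whose 5' neighbour is T (TpCpN)."""
    return substitution == "C>T" and context[:2] == "TC"


# Toy reference and variants for illustration only.
reference = "ATCGGATTCAGGAAT"
variants = [(2, "T"), (8, "T"), (4, "A"), (10, "A")]
calls = [trinucleotide_context(reference, pos, alt) for pos, alt in variants]
flags = [is_apobec_like(*call) for call in calls if call is not None]
print(calls, flags)
```

    A real analysis would also apply the frequency cut-off mentioned in the abstract (>0.5% relative to the per-specimen reference) before counting contexts.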

  7. Identification of Prostate Cancer-Specific microDNAs

    DTIC Science & Technology

    2016-02-01

    circular DNA by rolling circle amplification (RCA) and then amplified DNA fragments were subject to deep sequencing. Deep sequencing of the...demonstrate the existence of microDNAs in prostate cancer. We adopted multiple displacement amplification (MDA) with random 2 primers for enriched...prostate cancer cells through multiple displacement amplification and next generation sequencing. [Figure axis label: Relative cell growth (%)]

  8. Sequence-specific bias correction for RNA-seq data using recurrent neural networks.

    PubMed

    Zhang, Yao-Zhong; Yamaguchi, Rui; Imoto, Seiya; Miyano, Satoru

    2017-01-25

    The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by RNNs, and used in the estimation of gene abundance. We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and that RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures.

  9. Deep-Learning-Based Drug-Target Interaction Prediction.

    PubMed

    Wen, Ming; Zhang, Zhimin; Niu, Shaoyu; Sha, Haozhi; Yang, Ruihan; Yun, Yonghuan; Lu, Hongmei

    2017-04-07

    Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights about potential drug-drug interactions and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. In comparisons, DeepDTIs matches or outperforms other state-of-the-art methods. DeepDTIs can further be used to predict whether a new drug targets some existing targets or whether a new target interacts with some existing drugs.

  10. The genetic landscape of paediatric de novo acute myeloid leukaemia as defined by single nucleotide polymorphism array and exon sequencing of 100 candidate genes.

    PubMed

    Olsson, Linda; Zettermark, Sofia; Biloglav, Andrea; Castor, Anders; Behrendtz, Mikael; Forestier, Erik; Paulsson, Kajsa; Johansson, Bertil

    2016-07-01

    Cytogenetic analyses of a consecutive series of 67 paediatric (median age 8 years; range 0-17) de novo acute myeloid leukaemia (AML) patients revealed aberrations in 55 (82%) cases. The most common subgroups were KMT2A rearrangement (29%), normal karyotype (15%), RUNX1-RUNX1T1 (10%), deletions of 5q, 7q and/or 17p (9%), myeloid leukaemia associated with Down syndrome (7%), PML-RARA (7%) and CBFB-MYH11 (5%). Single nucleotide polymorphism array (SNP-A) analysis and exon sequencing of 100 genes, performed in 52 and 40 cases, respectively (39 overlapping), revealed ≥1 aberration in 89%; when adding cytogenetic data, this frequency increased to 98%. Uniparental isodisomies (UPIDs) were detected in 13% and copy number aberrations (CNAs) in 63% (median 2/case); three UPIDs and 22 CNAs were recurrent. Twenty-two genes were targeted by focal CNAs, including AEBP2 and PHF6 deletions and genes involved in AML-associated gene fusions. Deep sequencing identified mutations in 65% of cases (median 1/case). In total, 60 mutations were found in 30 genes, primarily those encoding signalling proteins (47%), transcription factors (25%), or epigenetic modifiers (13%). Twelve genes (BCOR, CEBPA, FLT3, GATA1, KIT, KRAS, NOTCH1, NPM1, NRAS, PTPN11, SMC3 and TP53) were recurrently mutated. We conclude that SNP-A and deep sequencing analyses complement the cytogenetic diagnosis of paediatric AML. © 2016 John Wiley & Sons Ltd.

  11. An Efficient Strategy of Screening for Pathogens in Wild-Caught Ticks and Mosquitoes by Reusing Small RNA Deep Sequencing Data

    PubMed Central

    An, Xiaoping; Fan, Hang; Ma, Maijuan; Anderson, Benjamin D.; Jiang, Jiafu; Liu, Wei; Cao, Wuchun; Tong, Yigang

    2014-01-01

    This paper explored our hypothesis that the sRNA (18∼30 bp) deep sequencing technique can be used as an efficient strategy to identify microorganisms other than viruses, such as prokaryotic and eukaryotic pathogens. In the study, the clean reads derived from the sRNA deep sequencing data of wild-caught ticks and mosquitoes were compared against the NCBI nucleotide collection (non-redundant nt database) using Blastn. The blast results were then analyzed with in-house Python scripts. An empirical formula was proposed to identify the putative pathogens. Results showed that not only viruses but also prokaryotic and eukaryotic species of interest can be screened out and were subsequently confirmed with experiments. Specifically, a novel Rickettsia spp. was indicated to exist in Haemaphysalis longicornis ticks collected in Beijing. Our study demonstrated that the reuse of sRNA deep sequencing data has the potential to trace the origin of pathogens or discover novel agents of emerging/re-emerging infectious diseases. PMID:24618575

  12. A Follow-Up of the Multicenter Collaborative Study on HIV-1 Drug Resistance and Tropism Testing Using 454 Ultra Deep Pyrosequencing

    PubMed Central

    St. John, Elizabeth P.; Simen, Birgitte B.; Turenchalk, Gregory S.; Braverman, Michael S.; Abbate, Isabella; Aerssens, Jeroen; Bouchez, Olivier; Gabriel, Christian; Izopet, Jacques; Meixenberger, Karolin; Di Giallonardo, Francesca; Schlapbach, Ralph; Paredes, Roger; Sakwa, James; Schmitz-Agheguian, Gudrun G.; Thielen, Alexander; Victor, Martin

    2016-01-01

    Background Ultra deep sequencing is of increasing use not only in research but also in diagnostics. For implementation of ultra deep sequencing assays in clinical laboratories for routine diagnostics, intra- and inter-laboratory testing are of the utmost importance. Methods A multicenter study was conducted to validate an updated assay design for 454 Life Sciences’ GS FLX Titanium system targeting protease/reverse transcriptase (RTP) and env (V3) regions to identify HIV-1 drug-resistance mutations and determine co-receptor use with high sensitivity. The study included 30 HIV-1 subtype B and 6 subtype non-B samples with viral titers (VT) of 3,940–447,400 copies/mL, two dilution series (52,129–1,340 and 25,130–734 copies/mL), and triplicate samples. Amplicons spanning PR codons 10–99, RT codons 1–251 and the entire V3 region were generated using barcoded primers. Analysis was performed using the GS Amplicon Variant Analyzer and geno2pheno for tropism. For comparison, population sequencing was performed using the ViroSeq HIV-1 genotyping system. Results The median sequencing depth across the 11 sites was 1,829 reads per position for RTP (IQR 592–3,488) and 2,410 for V3 (IQR 786–3,695). 10 preselected drug resistant variants were measured across sites and showed high inter-laboratory correlation across all sites with data (P<0.001). The triplicate samples of a plasmid mixture confirmed the high inter-laboratory consistency (mean% ± stdev: 4.6 ±0.5, 4.8 ±0.4, 4.9 ±0.3) and revealed good intra-laboratory consistency (mean% range ± stdev range: 4.2–5.2 ± 0.04–0.65). In the two dilutions series, no variants >20% were missed, variants 2–10% were detected at most sites (even at low VT), and variants 1–2% were detected by some sites. All mutations detected by population sequencing were also detected by UDS. 
Conclusions This assay design results in an accurate and reproducible approach to analyze HIV-1 mutant spectra, even at variant frequencies well below those routinely detectable by population sequencing. PMID:26756901

  13. Draft Genome Sequence of Deep-Sea Alteromonas sp. Strain V450 Isolated from the Marine Sponge Leiodermatium sp.

    PubMed

    Wang, Guojun; Barrett, Nolan H; McCarthy, Peter J

    2017-02-02

    The proteobacterium Alteromonas sp. strain V450 was isolated from the Atlantic deep-sea sponge Leiodermatium sp. Here, we report the draft genome sequence of this strain, with a genome size of approx. 4.39 Mb and a G+C content of 44.01%. The results will aid deep-sea microbial ecology, evolution, and sponge-microbe association studies. Copyright © 2017 Wang et al.

  14. miRBase: integrating microRNA annotation and deep-sequencing data.

    PubMed

    Kozomara, Ana; Griffiths-Jones, Sam

    2011-01-01

    miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences. Deep-sequencing technologies have delivered a sharp rise in the rate of novel microRNA discovery. We have mapped reads from short RNA deep-sequencing experiments to microRNAs in miRBase and developed web interfaces to view these mappings. The user can view all read data associated with a given microRNA annotation, filter reads by experiment and count, and search for microRNAs by tissue- and stage-specific expression. These data can be used as a proxy for relative expression levels of microRNA sequences, provide detailed evidence for microRNA annotations and alternative isoforms of mature microRNAs, and allow us to revisit previous annotations. miRBase is available online at: http://www.mirbase.org/.

  15. Hairpin RNA Targeting Multiple Viral Genes Confers Strong Resistance to Rice Black-Streaked Dwarf Virus.

    PubMed

    Wang, Fangquan; Li, Wenqi; Zhu, Jinyan; Fan, Fangjun; Wang, Jun; Zhong, Weigong; Wang, Ming-Bo; Liu, Qing; Zhu, Qian-Hao; Zhou, Tong; Lan, Ying; Zhou, Yijun; Yang, Jie

    2016-05-11

    Rice black-streaked dwarf virus (RBSDV) belongs to the genus Fijivirus in the family Reoviridae and causes severe yield loss in rice-producing areas in Asia. RNA silencing, as a natural defence mechanism against plant viruses, has been successfully exploited for engineering virus resistance in plants, including rice. In this study, we generated transgenic rice lines harbouring a hairpin RNA (hpRNA) construct targeting four RBSDV genes, S1, S2, S6 and S10, encoding the RNA-dependent RNA polymerase, the putative core protein, the RNA silencing suppressor and the outer capsid protein, respectively. Both field nursery and artificial inoculation assays of three generations of the transgenic lines showed that they had strong resistance to RBSDV infection. The RBSDV resistance in the segregating transgenic populations correlated perfectly with the presence of the hpRNA transgene. Furthermore, the hpRNA transgene was expressed in the highly resistant transgenic lines, giving rise to abundant levels of 21-24 nt small interfering RNA (siRNA). Small RNA deep sequencing of the RBSDV-resistant transgenic lines detected siRNAs from all four viral gene sequences in the hpRNA transgene, indicating that the whole chimeric fusion sequence can be efficiently processed by Dicer into siRNAs. Taken together, our results suggest that long hpRNA targeting multiple viral genes can be used to generate stable and durable virus resistance in rice, as well as other plant species.

  16. Transcriptome sequences resolve deep relationships of the grape family.

    PubMed

    Wen, Jun; Xiong, Zhiqiang; Nie, Ze-Long; Mao, Likai; Zhu, Yabing; Kan, Xian-Zhao; Ickert-Bond, Stefanie M; Gerrath, Jean; Zimmer, Elizabeth A; Fang, Xiao-Dong

    2013-01-01

    Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated.

  17. RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach.

    PubMed

    Pan, Xiaoyong; Shen, Hong-Bin

    2017-02-28

    RNAs play key roles in cells through their interactions with proteins known as RNA-binding proteins (RBPs), and their binding motifs are crucial for understanding the post-transcriptional regulation of RNAs. How the RBPs correctly recognize the target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict the RNA-protein binding sites from the rapidly growing multi-resource data (e.g. sequence and structure), their domain-specific features and formats have posed significant computational challenges. One current difficulty is that the cross-source shared common knowledge is at a higher abstraction level beyond the observed data, resulting in a low efficiency of direct integration of observed data across domains. The other difficulty is how to interpret the prediction results. Existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into meaningful binding motifs is a topic worthy of further investigation. In view of these challenges, we propose a deep learning-based framework (iDeep) that uses a novel hybrid convolutional neural network and deep belief network to predict the RBP interaction sites and motifs on RNAs. This new protocol transforms the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. 
To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets, and our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor; and through cross-domain knowledge integration at an abstraction level, it outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture the interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with the experimentally verified results, suggesting iDeep is a promising approach in real-world applications. The iDeep framework not only achieves better performance than the state-of-the-art predictors but also easily captures interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep.

  18. Draft Genome Sequence of Pseudomonas oceani DSM 100277T, a Deep-Sea Bacterium

    PubMed Central

    2018-01-01

    ABSTRACT Pseudomonas oceani DSM 100277T was isolated from deep seawater in the Okinawa Trough at 1390 m. P. oceani belongs to the Pseudomonas pertucinogena group. Here, we report the draft genome sequence of P. oceani, which has an estimated size of 4.1 Mb and exhibits 3,790 coding sequences, with a G+C content of 59.94 mol%. PMID:29650573

  19. Cross-shore and Vertical Distributions of Invertebrate Larvae Using Autonomous Sampling Coupled with Genetic Analysis

    NASA Astrophysics Data System (ADS)

    Govindarajan, A.; Pineda, J.; Purcell, M.; Tradd, K.; Packard, G.; Girard, A.; Dennett, M.; Breier, J. A., Jr.

    2016-02-01

    We present a new method to estimate the distribution of invertebrate larvae relative to environmental variables such as temperature, salinity, and circulation. A large volume in situ filtering system developed for discrete biogeochemical sampling in the deep-sea (the Suspended Particulate Rosette "SUPR" multisampler) was mounted to the autonomous underwater vehicle REMUS 600 for coastal larval and environmental sampling. We describe the results of SUPR-REMUS deployments conducted in Buzzards Bay, Massachusetts (2014) and west of Martha's Vineyard, Massachusetts (2015). We collected discrete samples cross-shore and from surface, middle, and bottom layers of the water column. Samples were preserved for DNA analysis. Our Buzzards Bay deployment targeted barnacle larvae, which are abundant in late winter and early spring. For these samples, we used morphological analysis and DNA barcodes generated by Sanger sequencing to obtain stage and species-specific cross-shore and vertical distributions. We targeted bivalve larvae in our 2015 deployments, and genetic analysis of larvae from these samples is underway. For these samples, we are comparing species barcode data derived from traditional Sanger sequencing of individuals to those obtained from next generation sequencing (NGS) of bulk plankton samples. Our results demonstrate the utility of autonomous sampling combined with DNA barcoding for studying larval distributions and transport dynamics.

  20. Advanced MR Imaging of the Human Nucleus Accumbens--Additional Guiding Tool for Deep Brain Stimulation.

    PubMed

    Lucas-Neto, Lia; Reimão, Sofia; Oliveira, Edson; Rainha-Campos, Alexandre; Sousa, João; Nunes, Rita G; Gonçalves-Ferreira, António; Campos, Jorge G

    2015-07-01

    The human nucleus accumbens (Acc) has become a target for deep brain stimulation (DBS) in some neuropsychiatric disorders. Nonetheless, even with the most recent advances in neuroimaging, it remains difficult to accurately delineate the Acc and closely related subcortical structures by conventional MRI sequences. It is our purpose to perform an MRI study of the human Acc and to determine whether there are reliable anatomical landmarks that enable the precise location and identification of the nucleus and its core/shell division. For the Acc identification and delineation, based on anatomical landmarks, T1WI, T1IR and STIR 3T-MR images were acquired in 10 healthy volunteers. Additionally, 32-direction DTI was obtained for Acc segmentation. Seed masks for the Acc were generated with FreeSurfer and probabilistic tractography was performed using FSL. The probability of connectivity between the seed voxels and distinct brain areas was determined and subjected to k-means clustering analysis, defining 2 different regions. With conventional T1WI, the Acc borders are better defined through its surrounding anatomical structures. The DTI color-coded vector maps and IR sequences add further detail in the Acc identification and delineation. Additionally, using probabilistic tractography it is possible to segment the Acc into a core and shell division and establish its structural connectivity with different brain areas. Advanced MRI techniques allow in vivo delineation and segmentation of the human Acc and represent an additional guiding tool in the precise and safe target definition for DBS. © 2015 International Neuromodulation Society.

  1. Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks

    PubMed Central

    Lanchantin, Jack; Singh, Ritambhara; Wang, Beilun; Qi, Yanjun

    2018-01-01

    Deep neural network (DNN) models have recently obtained state-of-the-art prediction accuracy for the transcription factor binding site (TFBS) classification task. However, it remains unclear how these approaches identify meaningful DNA sequence signals and give insights as to why TFs bind to certain locations. In this paper, we propose a toolkit called the Deep Motif Dashboard (DeMo Dashboard), which provides a suite of visualization strategies to extract motifs, or sequence patterns, from deep neural network models for TFBS classification. We demonstrate how to visualize and understand three important DNN models: convolutional, recurrent, and convolutional-recurrent networks. Our first visualization method is finding a test sequence’s saliency map, which uses first-order derivatives to describe the importance of each nucleotide in making the final prediction. Second, considering that recurrent models make predictions in a temporal manner (from one end of a TFBS sequence to the other), we introduce temporal output scores, indicating the prediction score of a model over time for a sequential input. Lastly, a class-specific visualization strategy finds the optimal input sequence for a given TFBS positive class via stochastic gradient optimization. Our experimental results indicate that a convolutional-recurrent architecture performs the best among the three architectures. The visualization techniques indicate that CNN-RNN makes predictions by modeling both motifs as well as dependencies among them. PMID:27896980
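
    The saliency-map idea in this abstract (first-order derivatives of the prediction with respect to each nucleotide) can be illustrated with a minimal sketch. It assumes a toy linear-logistic model over a one-hot encoded sequence rather than the convolutional or recurrent networks of the paper; the names one_hot, predict, saliency and the WEIGHTS values are hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows (A, C, G, T)."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]


def predict(x, weights, bias):
    """Toy binding score: sigmoid of a position-specific linear model."""
    z = bias + sum(w * v
                   for wrow, xrow in zip(weights, x)
                   for w, v in zip(wrow, xrow))
    return 1.0 / (1.0 + math.exp(-z))


def saliency(seq, weights, bias):
    """Per-nucleotide saliency: |d score / d input| at the observed base."""
    x = one_hot(seq)
    s = predict(x, weights, bias)
    # For this model, d score / d x[i][a] = s * (1 - s) * weights[i][a].
    grad = [[s * (1.0 - s) * w for w in wrow] for wrow in weights]
    # Multiply by the one-hot input to keep only the observed base's gradient.
    return [abs(sum(g * v for g, v in zip(grow, xrow)))
            for grow, xrow in zip(grad, x)]


# Hypothetical weights for a 4-position model that favors "TATA".
WEIGHTS = [
    [0.0, 0.0, 0.0, 2.0],
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.5, 0.0, 0.0, 0.0],
]
scores = saliency("TATA", WEIGHTS, bias=-3.0)
```

    In this toy model the first three positions receive equal saliency and the weakly weighted fourth position receives less; a deep network would obtain the same kind of gradient by backpropagation instead of a closed-form expression.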

  2. Aftershock occurrence rate decay for individual sequences and catalogs

    NASA Astrophysics Data System (ADS)

    Nyffenegger, Paul A.

    One of the earliest observations of the Earth's seismicity is that the rate of aftershock occurrence decays with time according to a power law commonly known as modified Omori-law (MOL) decay. However, the physical reasons for aftershock occurrence and the empirical decay in rate remain unclear despite numerous models that yield similar rate decay behavior. Key problems in relating the observed empirical relationship to the physical conditions of the mainshock and fault are the lack of studies including small magnitude mainshocks and the lack of uniformity between studies. We use simulated aftershock sequences to investigate the factors which influence the maximum likelihood (ML) estimate of the Omori-law p value, the parameter describing aftershock occurrence rate decay, for both individual aftershock sequences and "stacked" or superposed sequences. Generally the ML estimate of p is accurate, but since the ML estimated uncertainty is unaffected by whether the sequence resembles an MOL model, a goodness-of-fit test such as the Anderson-Darling statistic is necessary. While stacking aftershock sequences permits the study of entire catalogs and sequences with small aftershock populations, stacking introduces artifacts. The p value for stacked sequences is approximately equal to the mean of the individual sequence p values. We apply single-link cluster analysis to identify all aftershock sequences from eleven regional seismicity catalogs. We observe two new mathematically predictable empirical relationships for the distribution of aftershock sequence populations. The average properties of aftershock sequences are not correlated with tectonic environment, but aftershock populations and p values do show a depth dependence. The p values show great variability with time, and large values or changes in p sometimes precede major earthquakes. Studies of teleseismic earthquake catalogs over the last twenty years have led seismologists to question seismicity models and aftershock sequence decay for deep sequences. For seven exceptional deep sequences, we conclude that MOL decay adequately describes these sequences, and little difference exists compared to shallow sequences. However, they do include larger aftershock populations compared to most deep sequences. These results imply that p values for deep sequences are larger than those for intermediate depth sequences.
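
    The ML estimation of the Omori-law p value on simulated sequences, as described in this abstract, can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes the standard modified Omori rate K (t + c)^-p on a window [0, t_max], treats c as known, profiles out K analytically, and searches p on a grid. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_mol(n, p, c, t_max, rng):
    """Draw n event times on [0, t_max] with rate proportional to (t + c)**-p.

    Uses inverse-transform sampling; assumes p != 1.
    """
    a = c ** (1.0 - p)
    b = (t_max + c) ** (1.0 - p)
    return [(a + rng.random() * (b - a)) ** (1.0 / (1.0 - p)) - c
            for _ in range(n)]


def profile_loglik(times, p, c, t_max):
    """Poisson-process log-likelihood with the productivity K set to its ML value."""
    n = len(times)
    if abs(p - 1.0) < 1e-9:
        area = math.log((t_max + c) / c)
    else:
        area = ((t_max + c) ** (1.0 - p) - c ** (1.0 - p)) / (1.0 - p)
    k_hat = n / area  # maximizes n*log(K) - K*area for fixed p
    return n * math.log(k_hat) - p * sum(math.log(t + c) for t in times) - n


def estimate_p(times, c, t_max, grid):
    """Grid-search ML estimate of p with c held fixed."""
    return max(grid, key=lambda p: profile_loglik(times, p, c, t_max))


rng = random.Random(0)
times = simulate_mol(2000, 1.2, 0.05, 100.0, rng)
grid = [0.5 + 0.01 * i for i in range(151)]
p_hat = estimate_p(times, 0.05, 100.0, grid)
```

    With a few thousand simulated events the estimate should fall close to the true p used in the simulation. As the abstract notes, a good estimate does not show that the sequence follows an MOL model, so a separate goodness-of-fit test would still be needed.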

  3. Deep Brain Stimulation of the Dentato-Rubro-Thalamic Tract: Outcomes of Direct Targeting for Tremor.

    PubMed

    Fenoy, Albert J; Schiess, Mya C

    2017-07-01

    Targeting the dentato-rubro-thalamic tract (DRTt) has been suggested to be efficacious in deep brain stimulation (DBS) for tremor suppression, both in case reports and post-hoc analyses. This prospective observational study sought to analyze outcomes after directly targeting the DRTt in tremor patients. Twenty consecutively enrolled intention tremor patients obtained pre-operative MRI with diffusion tensor imaging (DTI) sequences. Mean baseline tremor amplitude based on The Essential Tremor Rating Assessment Scale was recorded. The DRTt was drawn for each individual on StealthViz software (Medtronic) using the dentate nucleus as the seed region and the ipsilateral pre-central gyrus as the end region and then directly targeted during surgery. Intraoperative testing confirmed successful tremor control. Post-operative analysis of electrode position relative to the DRTt was performed, as was post-operative assessment of tremor improvement. The mean age of patients was 66.8 years; mean duration of tremor was 16 years. Mean voltage for the L electrode = 3.4 V; R = 2.6 V. Mean distance from the center of the active electrode contact to the DRTt was 0.9 mm on the L, and 0.8 mm on the R. Improvement in arm tremor amplitude from baseline after DBS was significant (P < 0.001). Direct targeting of the DRTt in DBS is an effective strategy for tremor suppression. Accounting for hardware, software, and model limitations, depiction of the DRTt allows for placement of electrode contacts directly within the fiber tract for modulation despite any anatomical variation, which reproducibly resulted in good tremor control. © 2017 International Neuromodulation Society.

  4. Affinity Maturation of a Cyclic Peptide Handle for Therapeutic Antibodies Using Deep Mutational Scanning

    PubMed Central

    van Rosmalen, Martijn; Janssen, Brian M. G.; Hendrikse, Natalie M.; van der Linden, Ardjan J.; Pieters, Pascal A.; Wanders, Dave; de Greef, Tom F. A.; Merkx, Maarten

    2017-01-01

    Meditopes are cyclic peptides that bind in a specific pocket in the antigen-binding fragment of a therapeutic antibody such as cetuximab. Provided their moderate affinity can be enhanced, meditope peptides could be used as specific non-covalent and paratope-independent handles in targeted drug delivery, molecular imaging, and therapeutic drug monitoring. Here we show that the affinity of a recently reported meditope for cetuximab can be substantially enhanced using a combination of yeast display and deep mutational scanning. Deep sequencing was used to construct a fitness landscape of this protein-peptide interaction, and four mutations were identified that together improved the affinity for cetuximab 10-fold to 15 nM. Importantly, the increased affinity translated into enhanced cetuximab-mediated recruitment to EGF receptor-overexpressing cancer cells. Although in silico Rosetta simulations correctly identified positions that were tolerant to mutation, modeling did not accurately predict the affinity-enhancing mutations. The experimental approach reported here should be generally applicable and could be used to develop meditope peptides with low nanomolar affinity for other therapeutic antibodies. PMID:27974464

  5. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields.

    PubMed

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-11

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.
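
    The Q3 accuracy quoted in this abstract is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch follows; the 8-to-3 state reduction table is one common convention and is an assumption here, not necessarily the mapping used by DeepCNF, and the example strings are made up.

```python
# One common reduction of 8-state DSSP labels to 3 states (an assumption here).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",            # helices
    "E": "E", "B": "E",                      # strands and bridges
    "T": "C", "S": "C", "L": "C", "-": "C",  # everything else is coil
}


def reduce_to_3_state(labels8):
    """Map an 8-state label string to a 3-state (H/E/C) string."""
    return "".join(DSSP8_TO_3[label] for label in labels8)


def q3_accuracy(predicted, observed):
    """Percentage of positions where the 3-state labels agree."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


observed3 = reduce_to_3_state("HHHGEEBTSL")  # -> "HHHHEEECCC"
predicted3 = "HHHCEEECCH"                    # hypothetical predictor output
q3 = q3_accuracy(predicted3, observed3)      # 8 of 10 residues agree
```

    Q8 accuracy is computed the same way on the unreduced 8-state labels.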

  6. Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields

    NASA Astrophysics Data System (ADS)

    Wang, Sheng; Peng, Jian; Ma, Jianzhu; Xu, Jinbo

    2016-01-01

    Protein secondary structure (SS) prediction is important for studying protein structure and function. When only the sequence (profile) information is used as input feature, currently the best predictors can obtain ~80% Q3 accuracy, which has not been improved in the past decade. Here we present DeepCNF (Deep Convolutional Neural Fields) for protein SS prediction. DeepCNF is a Deep Learning extension of Conditional Neural Fields (CNF), which is an integration of Conditional Random Fields (CRF) and shallow neural networks. DeepCNF can model not only complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent SS labels, so it is much more powerful than CNF. Experimental results show that DeepCNF can obtain ~84% Q3 accuracy, ~85% SOV score, and ~72% Q8 accuracy, respectively, on the CASP and CAMEO test proteins, greatly outperforming currently popular predictors. As a general framework, DeepCNF can be used to predict other protein structure properties such as contact number, disorder regions, and solvent accessibility.

  7. Comprehensive identification and profiling of host miRNAs in response to Singapore grouper iridovirus (SGIV) infection in grouper (Epinephelus coioides).

    PubMed

    Guo, Chuanyu; Cui, Huachun; Ni, Songwei; Yan, Yang; Qin, Qiwei

    2015-10-01

    microRNAs (miRNAs) are an evolutionarily conserved class of non-coding RNA molecules that participate in various biological processes. The use of high-throughput screening strategies has greatly advanced the investigation and profiling of miRNAs in diverse species. In recent years, grouper (Epinephelus spp.) aquaculture has been severely affected by iridoviral diseases. However, knowledge regarding the host immune responses to viral infection, especially the miRNA-mediated immune regulatory roles, is rather limited. In this study, using a Solexa deep sequencing approach, we identified 116 grouper miRNAs from grouper spleen-derived cells (GS). As expected, these miRNAs shared high sequence similarity with miRNAs identified in zebrafish (Danio rerio), pufferfish (Fugu rubripes), and other higher vertebrates. In the process of Singapore grouper iridovirus (SGIV) infection, 45 and 43 miRNAs with altered expression (>1.5-fold) were identified by miRNA microarray assays in grouper spleen tissues and GS cells, respectively. Furthermore, target prediction revealed 189 putative targets of these grouper miRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Revised annotation of Plutella xylostella microRNAs and their genome-wide target identification.

    PubMed

    Etebari, K; Asgari, S

    2016-12-01

    The diamondback moth, Plutella xylostella, is the most devastating pest of brassica crops worldwide. Although 128 mature microRNAs (miRNAs) have been annotated from this species in miRBase, there is a need to extend and correct the current P. xylostella miRNA repertoire as a result of its recently improved genome assembly and more available small RNA sequence data. We used our new ultra-deep sequence data and bioinformatics to re-annotate the P. xylostella genome for high confidence miRNAs with the correct 5p and 3p arm features. Furthermore, all the P. xylostella annotated genes were also screened to identify potential miRNA binding sites using three target-predicting algorithms. In total, 203 mature miRNAs were annotated, including 33 novel miRNAs. We identified 7691 highly confident binding sites for 160 pxy-miRNAs. The data provided here will facilitate future studies involving functional analyses of P. xylostella miRNAs as a platform to introduce novel approaches for sustainable management of this destructive pest. © 2016 The Royal Entomological Society.

  9. Deep kernel learning method for SAR image target recognition

    NASA Astrophysics Data System (ADS)

    Chen, Xiuyuan; Peng, Xiyuan; Duan, Ran; Li, Junbao

    2017-10-01

    With the development of deep learning, research on image target recognition has made great progress in recent years. Remote sensing detection urgently requires target recognition for military, geographic, and other scientific research. This paper aims to solve the synthetic aperture radar image target recognition problem by combining deep and kernel learning. The model, which has a multilayer multiple kernel structure, is optimized layer by layer with the parameters of Support Vector Machine and a gradient descent algorithm. This new deep kernel learning method improves accuracy and achieves competitive recognition results compared with other learning methods.

  10. Paired Exome Analysis Reveals Clonal Evolution and Potential Therapeutic Targets in Urothelial Carcinoma.

    PubMed

    Lamy, Philippe; Nordentoft, Iver; Birkenkamp-Demtröder, Karin; Thomsen, Mathilde Borg Houlberg; Villesen, Palle; Vang, Søren; Hedegaard, Jakob; Borre, Michael; Jensen, Jørgen Bjerggaard; Høyer, Søren; Pedersen, Jakob Skou; Ørntoft, Torben F; Dyrskjøt, Lars

    2016-10-01

    Greater knowledge concerning tumor heterogeneity and clonality is needed to determine the impact of targeted treatment in the setting of bladder cancer. In this study, we performed whole-exome, transcriptome, and deep-focused sequencing of metachronous tumors from 29 patients initially diagnosed with early-stage bladder tumors (14 with nonprogressive disease and 15 with progressive disease). Tumors from patients with progressive disease showed a higher variance of the intrapatient mutational spectrum and a higher frequency of APOBEC-related mutations. Allele-specific expression was also higher in these patients, particularly in tumor suppressor genes. Phylogenetic analysis revealed a common origin of the metachronous tumors, with a higher proportion of clonal mutations in the ancestral branch; however, 19 potential therapeutic targets were identified as both ancestral and tumor-specific alterations. Few subclones were present based on PyClone analysis. Our results illuminate tumor evolution and identify candidate therapeutic targets in bladder cancer. Cancer Res; 76(19); 5894-906. ©2016 American Association for Cancer Research.

  11. In vivo Discovery of Immunotherapy Targets in the Tumor Microenvironment

    PubMed Central

    Zhou, Penghui; Shaffer, Donald R.; Arias, Diana A. Alvarez; Nakazaki, Yukoh; Pos, Wouter; Torres, Alexis J.; Cremasco, Viviana; Dougan, Stephanie K.; Cowley, Glenn S.; Elpek, Kutlu; Brogdon, Jennifer; Lamb, John; Turley, Shannon; Ploegh, Hidde L.; Root, David E.; Love, J. Christopher; Dranoff, Glenn; Hacohen, Nir; Cantor, Harvey; Wucherpfennig, Kai W.

    2014-01-01

    Recent clinical trials showed that targeting of inhibitory receptors on T cells induces durable responses in a subset of cancer patients, despite advanced disease. However, the regulatory switches controlling T cell function in immunosuppressive tumors are not well understood. Here we show that such inhibitory mechanisms can be systematically discovered in the tumor microenvironment. We devised an in vivo pooled shRNA screen in which shRNAs targeting negative regulators became highly enriched in tumors by releasing a block on T cell proliferation upon tumor antigen recognition. Such shRNAs were identified by deep sequencing of the shRNA cassette from T cells infiltrating tumor or control tissues. One of the target genes was Ppp2r2d, a regulatory subunit of the PP2A phosphatase family: In tumors, Ppp2r2d knockdown inhibited T cell apoptosis and enhanced T cell proliferation as well as cytokine production. Key regulators of immune function can thus be discovered in relevant tissue microenvironments. PMID:24476824

  12. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.

    PubMed

    Jones, David T; Kandathil, Shaun M

    2018-04-26

    In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation. Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions. DeepCov is freely available at https://github.com/psipred/DeepCov. Contact: d.t.jones@ucl.ac.uk.
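
    The "pairwise covariance data calculated from raw alignments" mentioned in this abstract can be illustrated with a minimal sketch. It assumes the plain definition cov(i, a; j, b) = f_ij(a, b) - f_i(a) f_j(b) with unweighted frequencies and no pseudocounts; the actual DeepCov preprocessing may differ, and the function name, alphabet and toy alignment below are hypothetical.

```python
from collections import Counter
from itertools import product


def pair_covariance(alignment, i, j, alphabet):
    """Covariance of residue types at columns i and j of an alignment.

    Returns a dict mapping (a, b) to f_ij(a, b) - f_i(a) * f_j(b), where the
    f values are plain (unweighted) frequencies over the aligned sequences.
    """
    n = len(alignment)
    fi = Counter(seq[i] for seq in alignment)
    fj = Counter(seq[j] for seq in alignment)
    fij = Counter((seq[i], seq[j]) for seq in alignment)
    return {
        (a, b): fij[(a, b)] / n - (fi[a] / n) * (fj[b] / n)
        for a, b in product(alphabet, repeat=2)
    }


# Toy alignment: columns 0 and 1 co-vary perfectly, column 2 is independent.
ALIGNMENT = ["AKL", "AKV", "GEL", "GEV"]
ALPHABET = "AGKELV"
cov_01 = pair_covariance(ALIGNMENT, 0, 1, ALPHABET)
cov_02 = pair_covariance(ALIGNMENT, 0, 2, ALPHABET)
```

    Here cov_01[("A", "K")] is 0.25 because A and K always occur together, while cov_02[("A", "L")] is 0 because columns 0 and 2 vary independently. Stacking such values for every column pair gives the kind of per-pair input a convolutional network could consume.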

  13. De novo germline and postzygotic mutations in AKT3, PIK3R2 and PIK3CA cause a spectrum of related megalencephaly syndromes

    PubMed Central

    Rivière, Jean-Baptiste; Mirzaa, Ghayda M.; O’Roak, Brian J.; Beddaoui, Margaret; Alcantara, Diana; Conway, Robert L.; St-Onge, Judith; Schwartzentruber, Jeremy A.; Gripp, Karen W.; Nikkel, Sarah M.; Worthylake, Thea; Sullivan, Christopher T.; Ward, Thomas R.; Butler, Hailly E.; Kramer, Nancy A.; Albrecht, Beate; Armour, Christine M.; Armstrong, Linlea; Caluseriu, Oana; Cytrynbaum, Cheryl; Drolet, Beth A.; Innes, A. Micheil; Lauzon, Julie L.; Lin, Angela E.; Mancini, Grazia M. S.; Meschino, Wendy S.; Reggin, James D.; Saggar, Anand K.; Lerman-Sagie, Tally; Uyanik, Gökhan; Weksberg, Rosanna; Zirn, Birgit; Beaulieu, Chandree L.; Majewski, Jacek; Bulman, Dennis E.; O’Driscoll, Mark; Shendure, Jay; Graham, John M.; Boycott, Kym M.; Dobyns, William B.

    2012-01-01

    Megalencephaly-capillary malformation (MCAP) and megalencephaly-polymicrogyria-polydactyly-hydrocephalus (MPPH) syndromes are sporadic overgrowth disorders associated with markedly enlarged brain size and other recognizable features. We performed exome sequencing in three families with MCAP or MPPH and confirmed our initial observations in exomes from 7 MCAP and 174 control individuals, as well as in 40 additional megalencephaly subjects using a combination of Sanger sequencing, restriction-enzyme assays, and targeted deep sequencing. We identified de novo germline or postzygotic mutations in three core components of the phosphatidylinositol-3-kinase (PI3K)/AKT pathway. These include two mutations of AKT3, one recurrent mutation of PIK3R2 in 11 unrelated MPPH families, and 15 mostly postzygotic mutations of PIK3CA in 23 MCAP patients and one MPPH patient. Our data highlight the central role of PI3K/AKT signaling in vascular, limb and brain development, and emphasize the power of massively parallel sequencing in a challenging context of phenotypic and genetic heterogeneity combined with postzygotic mosaicism. PMID:22729224

  14. De novo germline and postzygotic mutations in AKT3, PIK3R2 and PIK3CA cause a spectrum of related megalencephaly syndromes.

    PubMed

    Rivière, Jean-Baptiste; Mirzaa, Ghayda M; O'Roak, Brian J; Beddaoui, Margaret; Alcantara, Diana; Conway, Robert L; St-Onge, Judith; Schwartzentruber, Jeremy A; Gripp, Karen W; Nikkel, Sarah M; Worthylake, Thea; Sullivan, Christopher T; Ward, Thomas R; Butler, Hailly E; Kramer, Nancy A; Albrecht, Beate; Armour, Christine M; Armstrong, Linlea; Caluseriu, Oana; Cytrynbaum, Cheryl; Drolet, Beth A; Innes, A Micheil; Lauzon, Julie L; Lin, Angela E; Mancini, Grazia M S; Meschino, Wendy S; Reggin, James D; Saggar, Anand K; Lerman-Sagie, Tally; Uyanik, Gökhan; Weksberg, Rosanna; Zirn, Birgit; Beaulieu, Chandree L; Majewski, Jacek; Bulman, Dennis E; O'Driscoll, Mark; Shendure, Jay; Graham, John M; Boycott, Kym M; Dobyns, William B

    2012-06-24

    Megalencephaly-capillary malformation (MCAP) and megalencephaly-polymicrogyria-polydactyly-hydrocephalus (MPPH) syndromes are sporadic overgrowth disorders associated with markedly enlarged brain size and other recognizable features. We performed exome sequencing in 3 families with MCAP or MPPH, and our initial observations were confirmed in exomes from 7 individuals with MCAP and 174 control individuals, as well as in 40 additional subjects with megalencephaly, using a combination of Sanger sequencing, restriction enzyme assays and targeted deep sequencing. We identified de novo germline or postzygotic mutations in three core components of the phosphatidylinositol 3-kinase (PI3K)-AKT pathway. These include 2 mutations in AKT3, 1 recurrent mutation in PIK3R2 in 11 unrelated families with MPPH and 15 mostly postzygotic mutations in PIK3CA in 23 individuals with MCAP and 1 with MPPH. Our data highlight the central role of PI3K-AKT signaling in vascular, limb and brain development and emphasize the power of massively parallel sequencing in a challenging context of phenotypic and genetic heterogeneity combined with postzygotic mosaicism.

  15. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    PubMed Central

    Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier

    2008-01-01

    Background: "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: To perform a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152

  16. A deep learning framework for modeling structural features of RNA-binding protein targets

    PubMed Central

    Zhang, Sai; Zhou, Jingtian; Hu, Hailin; Gong, Haipeng; Chen, Ligong; Cheng, Chao; Zeng, Jianyang

    2016-01-01

    RNA-binding proteins (RBPs) play important roles in the post-transcriptional control of RNAs. Identifying RBP binding sites and characterizing RBP binding preferences are key steps toward understanding the basic mechanisms of post-transcriptional gene regulation. Though numerous computational methods have been developed for modeling RBP binding preferences, discovering a complete structural representation of the RBP targets by integrating their available structural features in all three dimensions is still a challenging task. In this paper, we develop a general and flexible deep learning framework for modeling structural binding preferences and predicting binding sites of RBPs, which takes (predicted) RNA tertiary structural information into account for the first time. Our framework constructs a unified representation that characterizes the structural specificities of RBP targets in all three dimensions, which can be further used to predict novel candidate binding sites and discover potential binding motifs. Through testing on real CLIP-seq datasets, we have demonstrated that our deep learning framework can automatically extract effective hidden structural features from the encoded raw sequence and structural profiles, and predict accurate RBP binding sites. In addition, we have conducted the first study to show that integrating the additional RNA tertiary structural features can improve the model performance in predicting RBP binding sites, especially for the polypyrimidine tract-binding protein (PTB), which also provides new evidence to support the view that RBPs may have specific tertiary structural binding preferences. In particular, the tests on the internal ribosome entry site (IRES) segments yield satisfactory results with experimental support from the literature and further demonstrate the necessity of incorporating RNA tertiary structural information into the prediction model. The source code of our approach can be found at https://github.com/thucombio/deepnet-rbp. PMID:26467480

  17. EZH2 mutations and promoter hypermethylation in childhood acute lymphoblastic leukemia.

    PubMed

    Schäfer, Vivien; Ernst, Jana; Rinke, Jenny; Winkelmann, Nils; Beck, James F; Hochhaus, Andreas; Gruhn, Bernd; Ernst, Thomas

    2016-07-01

    Acute lymphoblastic leukemia (ALL) is the most common malignancy in children and young adults. The polycomb repressive complex 2 (PRC2) has been identified as one of the most frequently mutated epigenetic protein complexes in hematologic cancers. PRC2 acts as an epigenetic repressor through histone H3 lysine 27 trimethylation (H3K27me3), catalyzed by the histone methyltransferase enhancer of zeste homolog 2 protein (EZH2). To study the prevalence and clinical impact of PRC2 aberrations in an unselected childhood ALL cohort (n = 152), we performed PRC2 mutational screenings by Sanger sequencing and promoter methylation analyses by quantitative pyrosequencing for the three PRC2 core component genes EZH2, suppressor of zeste 12 (SUZ12), and embryonic ectoderm development (EED). Targeted deep next-generation sequencing of 30 frequently mutated genes in leukemia was performed to search for cooperating mutations in patients harboring PRC2 aberrations. Finally, the functional consequence of EZH2 promoter hypermethylation on H3K27me3 was studied by Western blot analyses of primary cells. Loss-of-function EZH2 mutations were detected in 2/152 (1.3 %) patients with common-ALL and early T-cell precursor (ETP)-ALL, respectively. In one patient, targeted deep sequencing identified cooperating mutations in ASXL1 and TET2. EZH2 promoter hypermethylation was found in one patient with ETP-ALL, which led to reduced H3K27me3. In comparison with healthy children, the EZH2 promoter was significantly more highly methylated in T-ALL patients. No mutations or promoter methylation changes were identified for SUZ12 or EED genes, respectively. Although PRC2 aberrations seem to be rare in childhood ALL, our findings indicate that EZH2 aberrations might contribute to the disease in specific cases. In such cases, EZH2 promoter hypermethylation might have functionally similar consequences as loss-of-function mutations.

  18. Deep Sequencing of 71 Candidate Genes to Characterize Variation Associated with Alcohol Dependence.

    PubMed

    Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J

    2017-04-01

    Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value <0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10^-5; q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10^-5; q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on Alcoholism.

  19. Deep Learning Improves Antimicrobial Peptide Recognition.

    PubMed

    Veltri, Daniel; Kamath, Uday; Shehu, Amarda

    2018-03-24

    Bacterial resistance to antibiotics is a growing concern. Antimicrobial peptides (AMPs), natural components of innate immunity, are popular targets for developing new drugs. Machine learning methods are now commonly adopted by wet-laboratory researchers to screen for promising candidates. In this work we utilize deep learning to recognize antimicrobial activity. We propose a neural network model with convolutional and recurrent layers that leverage primary sequence composition. Results show that the proposed model outperforms state-of-the-art classification models on a comprehensive data set. By utilizing the embedding weights, we also present a reduced-alphabet representation and show that reasonable AMP recognition can be maintained using nine amino-acid types. Models and data sets are made freely available through the Antimicrobial Peptide Scanner vr.2 web server at: www.ampscanner.com. Contact: amarda@gmu.edu for general inquiries and dan.veltri@gmail.com for web server information. Supplementary data are available at Bioinformatics online.

  20. MutScan: fast detection and visualization of target mutations by scanning FASTQ data.

    PubMed

    Chen, Shifu; Huang, Tanxiao; Wen, Tiexiang; Li, Hong; Xu, Mingyan; Gu, Jia

    2018-01-22

    Some types of clinical genetic tests, such as cancer testing using circulating tumor DNA (ctDNA), require sensitive detection of known target mutations. However, conventional next-generation sequencing (NGS) data analysis pipelines typically involve several filtering steps, which may cause key low-frequency mutations to be missed. Variant validation is also indicated for key mutations detected by bioinformatics pipelines. Typically, this process can be executed using alignment visualization tools such as IGV or GenomeBrowse. However, these tools are too heavy and therefore unsuitable for validating mutations in ultra-deep sequencing data. We developed MutScan to address problems of sensitive detection and efficient validation for target mutations. MutScan involves highly optimized string-searching algorithms, which can scan input FASTQ files to grab all reads that support target mutations. The collected supporting reads for each target mutation will be piled up and visualized using web technologies such as HTML and JavaScript. Algorithms such as rolling hash and bloom filter are applied to accelerate scanning, so that MutScan can detect or visualize target mutations very quickly. MutScan is a tool for the detection and visualization of target mutations by scanning raw FASTQ data directly. Compared to conventional pipelines, this offers very high performance, executing about 20 times faster, and maximal sensitivity, since it can grab mutations with even a single supporting read. MutScan visualizes detected mutations by generating interactive pile-ups using web technologies. These can serve to validate target mutations, thus avoiding false positives. Furthermore, MutScan can visualize all mutation records in a VCF file as HTML pages for cloud-friendly VCF validation. MutScan is an open source tool available at GitHub: https://github.com/OpenGene/MutScan.
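
    The rolling-hash scanning idea described in this abstract can be illustrated with a minimal Python sketch. This is not MutScan's actual implementation: the function names, the hash parameters, the restriction to equal-length target sequences, and the example reads are all assumptions made for illustration, and no bloom filter is included.

```python
# Illustrative sketch only; names and parameters are hypothetical, not MutScan's code.
BASE = 5                      # four bases plus one code for anything else (e.g. N)
MOD = (1 << 61) - 1           # large prime modulus to keep hash collisions rare
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def code(ch):
    """Map a base to a small integer; unknown characters share code 4."""
    return CODE.get(ch, 4)


def seq_hash(seq):
    """Polynomial hash of a whole sequence."""
    h = 0
    for ch in seq:
        h = (h * BASE + code(ch)) % MOD
    return h


def fastq_sequences(lines):
    """Yield the sequence line (second of every four lines) of each FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def find_supporting_reads(reads, targets):
    """Return {target name: [indexes of reads containing that target sequence]}."""
    lengths = {len(seq) for seq in targets.values()}
    if len(lengths) != 1:
        raise ValueError("this sketch expects all target sequences to share one length")
    k = lengths.pop()
    by_hash = {}
    for name, seq in targets.items():
        by_hash.setdefault(seq_hash(seq), []).append((name, seq))
    top = pow(BASE, k - 1, MOD)   # weight of the base that leaves the window
    hits = {name: [] for name in targets}
    for idx, read in enumerate(reads):
        if len(read) < k:
            continue
        h = seq_hash(read[:k])
        for start in range(len(read) - k + 1):
            if start > 0:
                # Slide the window by one base: drop the left base, append the right one.
                h = ((h - code(read[start - 1]) * top) * BASE
                     + code(read[start + k - 1])) % MOD
            for name, seq in by_hash.get(h, ()):
                # Confirm by direct comparison so a hash collision cannot give a false hit.
                if read[start:start + k] == seq and (not hits[name] or hits[name][-1] != idx):
                    hits[name].append(idx)
    return hits
```

    Each read is scanned once, and the hash of every window is updated in constant time, so the cost per read grows with read length rather than with read length times target length. A read is reported even if it is the only one supporting a target, which mirrors the single-supporting-read sensitivity described above.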

  1. Deep Illumina-Based Shotgun Sequencing Reveals Dietary Effects on the Structure and Function of the Fecal Microbiome of Growing Kittens

    PubMed Central

    Deusch, Oliver; O’Flynn, Ciaran; Colyer, Alison; Morris, Penelope; Allaway, David; Jones, Paul G.; Swanson, Kelly S.

    2014-01-01

    Background: Previously, we demonstrated that dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome. Methodology & Principal Findings: Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high-protein, low-carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that differed (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome. Conclusions: These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary protein:carbohydrate ratio and highlight the impact of diet early in life. PMID:25010839

  2. Heterotrophic Proteobacteria in the vicinity of diffuse hydrothermal venting.

    PubMed

    Meier, Dimitri V; Bach, Wolfgang; Girguis, Peter R; Gruber-Vodicka, Harald R; Reeves, Eoghan P; Richter, Michael; Vidoudez, Charles; Amann, Rudolf; Meyerdierks, Anke

    2016-12-01

    Deep-sea hydrothermal vents are highly dynamic habitats characterized by steep temperature and chemical gradients. The oxidation of reduced compounds dissolved in the venting fluids fuels primary production providing the basis for extensive life. Until recently studies of microbial vent communities have focused primarily on chemolithoautotrophic organisms. In our study, we targeted the change of microbial community compositions along mixing gradients, focusing on distribution and capabilities of heterotrophic microorganisms. Samples were retrieved from different venting areas within the Menez Gwen hydrothermal field, taken along mixing gradients, including diffuse fluid discharge points, their immediate surroundings and the buoyant parts of hydrothermal plumes. High throughput 16S rRNA gene amplicon sequencing, fluorescence in situ hybridization, and targeted metagenome analysis were combined with geochemical analyses. Close to diffuse venting orifices dominated by chemolithoautotrophic Epsilonproteobacteria, in areas where environmental conditions still supported chemolithoautotrophic processes, we detected microbial communities enriched for versatile heterotrophic Alpha- and Gammaproteobacteria. The potential for alkane degradation could be shown for several genera and yet uncultured clades. We propose that hotspots of chemolithoautotrophic life support a 'belt' of heterotrophic bacteria significantly different from the dominating oligotrophic microbiota of the deep sea. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.

  3. Sequence, Structure, and Context Preferences of Human RNA Binding Proteins.

    PubMed

    Dominguez, Daniel; Freese, Peter; Alexis, Maria S; Su, Amanda; Hochman, Myles; Palden, Tsultrim; Bazile, Cassandra; Lambert, Nicole J; Van Nostrand, Eric L; Pratt, Gabriel A; Yeo, Gene W; Graveley, Brenton R; Burge, Christopher B

    2018-06-07

    RNA binding proteins (RBPs) orchestrate the production, processing, and function of mRNAs. Here, we present the affinity landscapes of 78 human RBPs using an unbiased assay that determines the sequence, structure, and context preferences of these proteins in vitro by deep sequencing of bound RNAs. These data enable construction of "RNA maps" of RBP activity without requiring crosslinking-based assays. We found an unexpectedly low diversity of RNA motifs, implying frequent convergence of binding specificity toward a relatively small set of RNA motifs, many with low compositional complexity. Offsetting this trend, however, we observed extensive preferences for contextual features distinct from short linear RNA motifs, including spaced "bipartite" motifs, biased flanking nucleotide composition, and bias away from or toward RNA structure. Our results emphasize the importance of contextual features in RNA recognition, which likely enable targeting of distinct subsets of transcripts by different RBPs that recognize the same linear motif. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  4. IAA-Ala Resistant3, an Evolutionarily Conserved Target of miR167, Mediates Arabidopsis Root Architecture Changes during High Osmotic Stress

    PubMed Central

    Kinoshita, Natsuko; Wang, Huan; Kasahara, Hiroyuki; Liu, Jun; MacPherson, Cameron; Machida, Yasunori; Kamiya, Yuji; Hannah, Matthew A.; Chua, Nam-Hai

    2012-01-01

    The functions of microRNAs and their target mRNAs in Arabidopsis thaliana development have been widely documented; however, roles of stress-responsive microRNAs and their targets are not as well understood. Using small RNA deep sequencing and ATH1 microarrays to profile mRNAs, we identified IAA-Ala Resistant3 (IAR3) as a new target of miR167a. As expected, IAR3 mRNA was cleaved at the miR167a complementary site and under high osmotic stress miR167a levels decreased, whereas IAR3 mRNA levels increased. IAR3 hydrolyzes an inactive form of auxin (indole-3-acetic acid [IAA]-alanine) and releases bioactive auxin (IAA), a central phytohormone for root development. In contrast with the wild type, iar3 mutants accumulated reduced IAA levels and did not display high osmotic stress–induced root architecture changes. Transgenic plants expressing a cleavage-resistant form of IAR3 mRNA accumulated high levels of IAR3 mRNAs and showed increased lateral root development compared with transgenic plants expressing wild-type IAR3. Expression of an inducible noncoding RNA to sequester miR167a by target mimicry led to an increase in IAR3 mRNA levels, further confirming the inverse relationship between the two partners. Sequence comparison revealed the miR167 target site on IAR3 mRNA is conserved in evolutionarily distant plant species. Finally, we showed that IAR3 is required for drought tolerance. PMID:22960911

  5. Universal target-enrichment baits for anthozoan (Cnidaria) phylogenomics: New approaches to long-standing problems.

    PubMed

    Quattrini, Andrea M; Faircloth, Brant C; Dueñas, Luisa F; Bridge, Tom C L; Brugler, Mercer R; Calixto-Botía, Iván F; DeLeo, Danielle M; Forêt, Sylvain; Herrera, Santiago; Lee, Simon M Y; Miller, David J; Prada, Carlos; Rádis-Baptista, Gandhi; Ramírez-Portilla, Catalina; Sánchez, Juan A; Rodríguez, Estefanía; McFadden, Catherine S

    2018-03-01

    Anthozoans (e.g., corals, anemones) are an ecologically important and diverse group of marine metazoans that occur from shallow to deep waters worldwide. However, our understanding of the evolutionary relationships among the ~7,500 species within this class is hindered by the lack of phylogenetically informative markers that can be reliably sequenced across a diversity of taxa. We designed and tested 16,306 RNA baits to capture 720 ultraconserved element loci and 1,071 exon loci. Library preparation and target enrichment were performed on 33 taxa from all orders within the class Anthozoa. Following Illumina sequencing and Trinity assembly, we recovered 1,774 of 1,791 targeted loci. The mean number of loci recovered from each species was 638 ± 222, with more loci recovered from octocorals (783 ± 138 loci) than hexacorals (475 ± 187 loci). Parsimony informative sites ranged from 26 to 49% for alignments at differing hierarchical taxonomic levels (e.g., Anthozoa, Octocorallia, Hexacorallia). The per cent of variable sites within each of three genera (Acropora, Alcyonium, and Sinularia) for which multiple species were sequenced ranged from 4.7% to 30%. Maximum-likelihood analyses recovered highly resolved trees with topologies matching those supported by other studies, including the monophyly of the order Scleractinia. Our results demonstrate the utility of this target-enrichment approach to resolve phylogenetic relationships from relatively old to recent divergences. Redesigning the baits with improved affinities to capture loci within each subclass will provide a valuable toolset to address systematic questions, further our understanding of the timing of diversifications and help resolve long-standing controversial relationships in the class Anthozoa. © 2017 John Wiley & Sons Ltd.

  6. Small RNA profiling and degradome analysis reveal regulation of microRNA in peanut embryogenesis and early pod development.

    PubMed

    Gao, Chao; Wang, Pengfei; Zhao, Shuzhen; Zhao, Chuanzhi; Xia, Han; Hou, Lei; Ju, Zheng; Zhang, Ye; Li, Changsheng; Wang, Xingjun

    2017-03-02

    In peanut, a typical geocarpic plant, embryogenesis and pod development are complex processes involving many gene regulatory pathways and controlled by appropriate hormone levels. MicroRNAs (miRNAs) are small non-coding RNAs that play indispensable roles in post-transcriptional gene regulation. Recently, identification and characterization of peanut miRNAs has been described. However, whether miRNAs participate in the regulation of peanut embryogenesis and pod development has yet to be explored. In this study, small RNA and degradome libraries from peanut early pods of different developmental stages were constructed and sequenced. A total of 70 known and 24 novel miRNA families were discovered. Among them, 16 miRNA families were legume-specific and 12 families were peanut-specific. Thirty known and 10 novel miRNA families were differentially expressed during pod development. In addition, 115 target genes were identified for 47 miRNA families by degradome sequencing. Several new targets that might be specific to peanut were found and further validated by RNA ligase-mediated rapid amplification of 5' cDNA ends (RLM 5'-RACE). Furthermore, we performed profiling analysis of intact and total transcripts of several target genes, demonstrating that SPL (miR156/157), NAC (miR164), PPRP (miR167 and miR1088), AP2 (miR172) and GRF (miR396) are actively modulated during early pod development. Large numbers of miRNAs and their related target genes were identified through deep sequencing. These findings provide new information on miRNA-mediated regulatory pathways in peanut pod, which will contribute to a comprehensive understanding of the molecular mechanisms that govern peanut embryo and early pod development.

  7. Microcinematographic and electron microscopic analysis of target cell lysis induced by cytotoxic T lymphocytes.

    PubMed Central

    Matter, A

    1979-01-01

    A study was carried out to determine the sequence of events of T-cell mediated target cell lysis in microcinematography and electron microscopy. Highly efficient cytotoxic T lymphocytes (CTL) were generated in vivo and in vitro using preimmunized spleen cells and purification procedures. Such CTL were highly specific. This specificity correlated well with the number of adhesions formed between CTL and targets and this criterion was used to study killer-target cell interaction. Microcinematography showed that target cell lysis at the single cell level, despite time variations, could be clearly separated into three phases: (a) a recognition phase, visible by random crawling of CTL over the target cell surface until firm contact was established; (b) a post-recognition phase, during which firm contact between CTL and target was maintained without gross modification of either cell; (c) a phase of target cell disintegration, mainly characterized by vigorous blebbing of the cell membrane resulting in a motionless carcass of the target cell but not in its total dissolution. Only later this carcass decayed and formed a necrotic ghost. Electron microscopic observations were put into sequence according to microcinematography. Post-recognition phase was characterized by a tight apposition of the membranes of CTL and target cell. No gap junctions could be observed. During target cell disintegration, profound cytoplasmic and nuclear changes occurred simultaneous with surface blebbing. Most noticeable were extensive internal vacuolization, mitochondrial swelling, nuclear pycnosis and dissolution of the nucleolus. These observations suggested that target cell lysis does not start with a surface phenomenon similar to complement lysis, but a process involving practically the whole cell simultaneously. It is conceivable, therefore, that the signal from the CTL is transmitted across the target cell, and that the switch to sudden cell death is manipulated deep inside the cell. 
    PMID:312256

  8. Identification of miRNA from Bouteloua gracilis, a drought tolerant grass, by deep sequencing and their in silico analysis.

    PubMed

    Ordóñez-Baquera, Perla Lucía; González-Rodríguez, Everardo; Aguado-Santacruz, Gerardo Armando; Rascón-Cruz, Quintín; Conesa, Ana; Moreno-Brito, Verónica; Echavarria, Raquel; Dominguez-Viveros, Joel

    2017-02-01

    MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate signal transduction, development, metabolism, and stress responses in plants through post-transcriptional degradation and/or translational repression of target mRNAs. Several studies have addressed the role of miRNAs in model plant species, but miRNA expression and function in economically important forage crops, such as Bouteloua gracilis (Poaceae), a high-quality and drought-resistant grass distributed in semiarid regions of the United States and northern Mexico remain unknown. We applied high-throughput sequencing technology and bioinformatics analysis and identified 31 conserved miRNA families and 53 novel putative miRNAs with different abundance of reads in chlorophyllic cell cultures derived from B. gracilis. Some conserved miRNA families were highly abundant and possessed predicted targets involved in metabolism, plant growth and development, and stress responses. We also predicted additional identified novel miRNAs with specific targets, including B. gracilis ESTs, which were detected under drought stress conditions. Here we report 31 conserved miRNA families and 53 putative novel miRNAs in B. gracilis. Our results suggested the presence of regulatory miRNAs involved in modulating physiological and stress responses in this grass species. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Deep Impact Sequence Planning Using Multi-Mission Adaptable Planning Tools With Integrated Spacecraft Models

    NASA Technical Reports Server (NTRS)

    Wissler, Steven S.; Maldague, Pierre; Rocca, Jennifer; Seybold, Calina

    2006-01-01

    The Deep Impact mission was ambitious and challenging. JPL's well proven, easily adaptable multi-mission sequence planning tools combined with integrated spacecraft subsystem models enabled a small operations team to develop, validate, and execute extremely complex sequence-based activities within very short development times. This paper focuses on the core planning tool used in the mission, APGEN. It shows how the multi-mission design and adaptability of APGEN made it possible to model spacecraft subsystems as well as ground assets throughout the lifecycle of the Deep Impact project, starting with models of initial, high-level mission objectives, and culminating in detailed predictions of spacecraft behavior during mission-critical activities.

  10. Transcriptome Sequences Resolve Deep Relationships of the Grape Family

    PubMed Central

    Wen, Jun; Xiong, Zhiqiang; Nie, Ze-Long; Mao, Likai; Zhu, Yabing; Kan, Xian-Zhao; Ickert-Bond, Stefanie M.; Gerrath, Jean; Zimmer, Elizabeth A.; Fang, Xiao-Dong

    2013-01-01

    Previous phylogenetic studies of the grape family (Vitaceae) yielded poorly resolved deep relationships, thus impeding our understanding of the evolution of the family. Next-generation sequencing now offers access to protein coding sequences very easily, quickly and cost-effectively. To improve upon earlier work, we extracted 417 orthologous single-copy nuclear genes from the transcriptomes of 15 species of the Vitaceae, covering its phylogenetic diversity. The resulting transcriptome phylogeny provides robust support for the deep relationships, showing the phylogenetic utility of transcriptome data for plants over a time scale at least since the mid-Cretaceous. The pros and cons of transcriptome data for phylogenetic inference in plants are also evaluated. PMID:24069307

  11. Deep Learning and Its Applications in Biomedicine.

    PubMed

    Cao, Chensi; Liu, Feng; Tan, Hai; Song, Deshou; Shu, Wenjie; Li, Weizhong; Zhou, Yiming; Bo, Xiaochen; Xie, Zhi

    2018-02-01

    Advances in biological and medical technologies have been providing us explosive volumes of biological and physiological data, such as medical images, electroencephalography, genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of the state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural network and deep learning. We then describe two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning applications, including medical image classification, genomic sequence analysis, as well as protein structure classification and prediction. Finally, we offer our perspectives for the future directions in the field of deep learning. Copyright © 2018. Production and hosting by Elsevier B.V.

  12. Emergent HIV-1 Drug Resistance Mutations Were Not Present at Low-Frequency at Baseline in Non-Nucleoside Reverse Transcriptase Inhibitor-Treated Subjects in the STaR Study

    PubMed Central

    Porter, Danielle P.; Daeumer, Martin; Thielen, Alexander; Chang, Silvia; Martin, Ross; Cohen, Cal; Miller, Michael D.; White, Kirsten L.

    2015-01-01

    At Week 96 of the Single-Tablet Regimen (STaR) study, more treatment-naïve subjects that received rilpivirine/emtricitabine/tenofovir DF (RPV/FTC/TDF) developed resistance mutations compared to those treated with efavirenz (EFV)/FTC/TDF by population sequencing. Furthermore, more RPV/FTC/TDF-treated subjects with baseline HIV-1 RNA >100,000 copies/mL developed resistance compared to subjects with baseline HIV-1 RNA ≤100,000 copies/mL. Here, deep sequencing was utilized to assess the presence of pre-existing low-frequency variants in subjects with and without resistance development in the STaR study. Deep sequencing (Illumina MiSeq) was performed on baseline and virologic failure samples for all subjects analyzed for resistance by population sequencing during the clinical study (n = 33), as well as baseline samples from control subjects with virologic response (n = 118). Primary NRTI or NNRTI drug resistance mutations present at low frequency (≥2% to 20%) were detected in 6.6% of baseline samples by deep sequencing, all of which occurred in control subjects. Deep sequencing results were generally consistent with population sequencing but detected additional primary NNRTI and NRTI resistance mutations at virologic failure in seven samples. HIV-1 drug resistance mutations emerging while on RPV/FTC/TDF or EFV/FTC/TDF treatment were not present at low frequency at baseline in the STaR study. PMID:26690199
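
    The low-frequency window quoted in this abstract (≥2% to 20%) can be made concrete with a small Python sketch. The function name, the read counts in the usage note, and the labels for variants outside the window are assumptions for illustration; the abstract does not describe the study's actual variant-calling pipeline.

```python
# Illustrative sketch; thresholds follow the abstract, everything else is hypothetical.
def classify_variant(alt_reads, depth, low_pct=2, high_pct=20):
    """Label a variant by the percentage of reads at a position that carry it."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    # Integer arithmetic avoids floating-point surprises exactly at the cut-offs.
    if alt_reads * 100 < low_pct * depth:
        return "below_cutoff"        # under 2%: not reported in this sketch
    if alt_reads * 100 <= high_pct * depth:
        return "low_frequency"       # the 2% to 20% window named in the abstract
    return "above_window"            # more than 20% of reads
```

    For example, under these assumptions a mutation seen in 30 of 1,000 reads (3%) would be labeled "low_frequency", whereas one seen in 10 of 1,000 reads (1%) would fall below the cut-off.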

  13. Emergent HIV-1 Drug Resistance Mutations Were Not Present at Low-Frequency at Baseline in Non-Nucleoside Reverse Transcriptase Inhibitor-Treated Subjects in the STaR Study.

    PubMed

    Porter, Danielle P; Daeumer, Martin; Thielen, Alexander; Chang, Silvia; Martin, Ross; Cohen, Cal; Miller, Michael D; White, Kirsten L

    2015-12-07

    At Week 96 of the Single-Tablet Regimen (STaR) study, more treatment-naïve subjects that received rilpivirine/emtricitabine/tenofovir DF (RPV/FTC/TDF) developed resistance mutations compared to those treated with efavirenz (EFV)/FTC/TDF by population sequencing. Furthermore, more RPV/FTC/TDF-treated subjects with baseline HIV-1 RNA >100,000 copies/mL developed resistance compared to subjects with baseline HIV-1 RNA ≤100,000 copies/mL. Here, deep sequencing was utilized to assess the presence of pre-existing low-frequency variants in subjects with and without resistance development in the STaR study. Deep sequencing (Illumina MiSeq) was performed on baseline and virologic failure samples for all subjects analyzed for resistance by population sequencing during the clinical study (n = 33), as well as baseline samples from control subjects with virologic response (n = 118). Primary NRTI or NNRTI drug resistance mutations present at low frequency (≥2% to 20%) were detected in 6.6% of baseline samples by deep sequencing, all of which occurred in control subjects. Deep sequencing results were generally consistent with population sequencing but detected additional primary NNRTI and NRTI resistance mutations at virologic failure in seven samples. HIV-1 drug resistance mutations emerging while on RPV/FTC/TDF or EFV/FTC/TDF treatment were not present at low frequency at baseline in the STaR study.

  14. Identification of microRNAs involved in lipid biosynthesis and seed size in developing sea buckthorn seeds using high-throughput sequencing.

    PubMed

    Ding, Jian; Ruan, Chengjiang; Guan, Ying; Krishna, Priti

    2018-03-05

    Sea buckthorn is a plant of medicinal and nutritional importance owing in part to the high levels of essential fatty acids, linoleic (up to 42%) and α-linolenic (up to 39%) acids in the seed oil. Sea buckthorn can produce seeds either via the sexual pathway or by apomixis. The seed development and maturation programs are critically dependent on miRNAs. To understand miRNA-mediated regulation of sea buckthorn seed development, eight small RNA libraries were constructed for deep sequencing from developing seeds of a low oil content line 'SJ1' and a high oil content line 'XE3'. High-throughput sequencing identified 137 known miRNA from 27 families and 264 novel miRNAs. The potential targets of the identified miRNAs were predicted based on sequence homology. Nineteen (four known and 15 novel) and 22 (six known and 16 novel) miRNAs were found to be involved in lipid biosynthesis and seed size, respectively. An integrated analysis of mRNA and miRNA transcriptome and qRT-PCR identified some key miRNAs and their targets (miR164d-ARF2, miR168b-Δ9D, novelmiRNA-108-ACC, novelmiRNA-23-GPD1, novelmiRNA-58-DGAT1, and novelmiRNA-191-DGAT2) potentially involved in seed size and lipid biosynthesis of sea buckthorn seed. These results indicate the potential importance of miRNAs in regulating lipid biosynthesis and seed size in sea buckthorn.

  15. Pooled Resequencing of 122 Ulcerative Colitis Genes in a Large Dutch Cohort Suggests Population-Specific Associations of Rare Variants in MUC2.

    PubMed

    Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K

    2016-01-01

    Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.

  16. VirusDetect: An automated pipeline for efficient virus discovery using deep sequencing of small RNAs

    USDA-ARS's Scientific Manuscript database

    Accurate detection of viruses in plants and animals is critical for agriculture production and human health. Deep sequencing and assembly of virus-derived siRNAs has proven to be a highly efficient approach for virus discovery. However, to date no computational tools specifically designed for both k...

  17. Natural Variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gordon, Sean

    2013-03-01

    Sean Gordon of the USDA on Natural Variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions at the 8th Annual Genomics of Energy and Environment Meeting on March 27, 2013 in Walnut Creek, CA.

  18. Microbial Diversity in Deep-sea Methane Seep Sediments Presented by SSU rRNA Gene Tag Sequencing

    PubMed Central

    Nunoura, Takuro; Takaki, Yoshihiro; Kazama, Hiromi; Hirai, Miho; Ashi, Juichiro; Imachi, Hiroyuki; Takai, Ken

    2012-01-01

    Microbial community structures in methane seep sediments in the Nankai Trough were analyzed by tag-sequencing analysis for the small subunit (SSU) rRNA gene using a newly developed primer set. The dominant members of Archaea were Deep-sea Hydrothermal Vent Euryarchaeotic Group 6 (DHVEG 6), Marine Group I (MGI) and Deep Sea Archaeal Group (DSAG), and those in Bacteria were Alpha-, Gamma-, Delta- and Epsilonproteobacteria, Chloroflexi, Bacteroidetes, Planctomycetes and Acidobacteria. Diversity and richness were examined by 8,709 and 7,690 tag-sequences from sediments at 5 and 25 cm below the seafloor (cmbsf), respectively. The estimated diversity and richness in the methane seep sediment are as high as those in soil and deep-sea hydrothermal environments, although the tag-sequences obtained in this study were not sufficient to show whole microbial diversity in this analysis. We also compared the diversity and richness of each taxon/division between the sediments from the two depths, and found that the diversity and richness of some taxa/divisions varied significantly along with the depth. PMID:22510646

  19. Deep Sequencing-guided Design of a High Affinity Dual Specificity Antibody to Target Two Angiogenic Factors in Neovascular Age-related Macular Degeneration.

    PubMed

    Koenig, Patrick; Lee, Chingwei V; Sanowar, Sarah; Wu, Ping; Stinson, Jeremy; Harris, Seth F; Fuh, Germaine

    2015-09-04

    The development of dual targeting antibodies promises therapies with improved efficacy over mono-specific antibodies. Here, we engineered a Two-in-One VEGF/angiopoietin 2 antibody with dual action Fab (DAF) as a potential therapeutic for neovascular age-related macular degeneration. Crystal structures of the VEGF/angiopoietin 2 DAF in complex with its two antigens showed highly overlapping binding sites. To achieve sufficient affinity of the DAF to block both angiogenic factors, we turned to deep mutational scanning in the complementarity determining regions (CDRs). By mutating all three CDRs of each antibody chain simultaneously, we were able not only to identify affinity-improving single mutations but also mutation pairs from different CDRs that synergistically improve both binding functions. Furthermore, insights into the cooperativity between mutations allowed us to identify fold-stabilizing mutations in the CDRs. The data obtained from deep mutational scanning reveal that the majority of the 52 CDR residues are utilized differently for the two antigen binding functions and permit, for the first time, the engineering of several DAF variants with sub-nanomolar affinity against two structurally unrelated antigens. The improved variants show similar blocking activity of receptor binding as the high affinity mono-specific antibodies against these two proteins, demonstrating the feasibility of generating a dual specificity binding surface with comparable properties to individual high affinity mono-specific antibodies. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

  20. Analysis of miRNA expression profiles in melatonin-exposed GC-1 spg cell line.

    PubMed

    Zhu, Xiaoling; Chen, Shuxiong; Jiang, Yanwen; Xu, Ying; Zhao, Yun; Chen, Lu; Li, Chunjin; Zhou, Xu

    2018-02-05

    Melatonin is an endocrine neurohormone secreted by pinealocytes in the pineal gland. It exerts diverse physiological effects, acting for example as a circadian rhythm regulator and an antioxidant. However, the functional importance of melatonin in the regulation of spermatogenesis remains unclear. The objectives of this study were to: (1) detect the effect of melatonin on miRNA expression profiles in GC-1 spg cells by miRNA deep sequencing (DeepSeq) and (2) define melatonin-affected miRNA-mRNA interactions and associated biological processes using bioinformatics analysis. GC-1 spg cells were cultured with melatonin (10⁻⁷ M) for 24 h. DeepSeq data were validated using quantitative real-time reverse transcription polymerase chain reaction analysis (qRT-PCR). A total of 176 miRNAs were found to be significantly differentially expressed between the two groups (fold change of >2 or <0.5 and FDR<0.05). Among these, 171 were up-regulated and 5 were down-regulated. Ontology analysis of the biological processes of their predicted targets indicated a variety of biological functions. Pathway analysis indicated that the predicted targets were involved in cancers, apoptosis and signaling pathways, such as VEGF, TNF, Ras and Notch. The results indicated that melatonin could regulate the expression of miRNAs to exert its physiological effects in GC-1 spg cells. These results should be useful for investigating the biological function of miRNAs regulated by melatonin in spermatogenesis and testicular germ cell tumors. Copyright © 2017 Elsevier B.V. All rights reserved.
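
The selection criterion quoted in this abstract (fold change of >2 or <0.5 and FDR<0.05) can be illustrated with a minimal sketch. The abstract does not say how the FDR was computed, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names, miRNA names and numbers are hypothetical placeholders rather than data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order.

    Assumption: the abstract does not name its FDR method; BH is used
    here only as a common default.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(records, fc_up=2.0, fc_down=0.5, fdr_cutoff=0.05):
    """Split (name, fold_change, p_value) records into up- and down-regulated names."""
    fdr = benjamini_hochberg([p for _, _, p in records])
    up, down = [], []
    for (name, fold_change, _), q in zip(records, fdr):
        if q >= fdr_cutoff:
            continue  # not significant after multiple-testing correction
        if fold_change > fc_up:
            up.append(name)
        elif fold_change < fc_down:
            down.append(name)
    return up, down


# Hypothetical records: (miRNA name, treated/control fold change, raw p-value).
example_records = [
    ("miR-a", 4.0, 0.001),
    ("miR-b", 0.25, 0.002),
    ("miR-c", 3.0, 0.5),
    ("miR-d", 1.2, 0.001),
]
up_regulated, down_regulated = differentially_expressed(example_records)
```

In this toy input only the first two entries pass both thresholds; the third fails the FDR cutoff and the fourth fails the fold-change cutoff.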

  1. Hypoxia-responsive miRNAs target argonaute 1 to promote angiogenesis

    PubMed Central

    Chen, Zhen; Lai, Tsung-Ching; Jan, Yi-Hua; Lin, Feng-Mao; Wang, Wei-Chi; Xiao, Han; Wang, Yun-Ting; Sun, Wei; Cui, Xiaopei; Li, Ying-Shiuan; Fang, Tzan; Zhao, Hongwei; Padmanabhan, Chellappan; Sun, Ruobai; Wang, Danny Ling; Jin, Hailing; Chau, Gar-Yang; Huang, Hsien-Da; Hsiao, Michael; Shyy, John Y-J.

    2013-01-01

    Despite a general repression of translation under hypoxia, cells selectively upregulate a set of hypoxia-inducible genes. Results from deep sequencing revealed that Let-7 and miR-103/107 are hypoxia-responsive microRNAs (HRMs) that are strongly induced in vascular endothelial cells. In silico bioinformatics and in vitro validation showed that these HRMs are induced by HIF1α and target argonaute 1 (AGO1), which anchors the microRNA-induced silencing complex (miRISC). HRM targeting of AGO1 resulted in the translational desuppression of VEGF mRNA. Inhibition of HRM or overexpression of AGO1 without the 3′ untranslated region decreased hypoxia-induced angiogenesis. Conversely, AGO1 knockdown increased angiogenesis under normoxia in vivo. In addition, data from tumor xenografts and human cancer specimens indicate that AGO1-mediated translational desuppression of VEGF may be associated with tumor angiogenesis and poor prognosis. These findings provide evidence for an angiogenic pathway involving HRMs that target AGO1 and suggest that this pathway may be a suitable target for anti- or proangiogenesis strategies. PMID:23426184

  2. Deep Recurrent Neural Networks for Human Activity Recognition

    PubMed Central

    Murad, Abdulmajid

    2017-01-01

    Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep belief networks (DBNs) and CNNs. PMID:29113103

  3. Deep Recurrent Neural Networks for Human Activity Recognition.

    PubMed

    Murad, Abdulmajid; Pyun, Jae-Young

    2017-11-06

    Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep belief networks (DBNs) and CNNs.

  4. Draft Genome Sequence of Pseudomonas oceani DSM 100277T, a Deep-Sea Bacterium.

    PubMed

    García-Valdés, Elena; Gomila, Margarita; Mulet, Magdalena; Lalucat, Jorge

    2018-04-12

    Pseudomonas oceani DSM 100277T was isolated from deep seawater in the Okinawa Trough at 1390 m. P. oceani belongs to the Pseudomonas pertucinogena group. Here, we report the draft genome sequence of P. oceani, which has an estimated size of 4.1 Mb and exhibits 3,790 coding sequences, with a G+C content of 59.94 mol%. Copyright © 2018 García-Valdés et al.
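
The G+C content reported above is the percentage of guanine and cytosine among the called bases of the assembly. A minimal sketch of that calculation follows; the function name and the toy sequence are hypothetical, and skipping ambiguous bases such as N is an assumption about how the statistic is tallied, not something the abstract states.

```python
def gc_content_percent(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(1 for base in called if base in "GC")
    return 100.0 * gc_count / len(called)


# Toy sequence for illustration only, not taken from the P. oceani genome.
toy_gc = gc_content_percent("GGCCAATT")
```

For a draft genome split across several contigs, the same function could be applied to the concatenation of all contig sequences.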

  5. Deep Ion Torrent sequencing identifies soil fungal community shifts after frequent prescribed fires in a southeastern US forest ecosystem.

    PubMed

    Brown, Shawn P; Callaham, Mac A; Oliver, Alena K; Jumpponen, Ari

    2013-12-01

    Prescribed burning is a common management tool to control fuel loads and ground vegetation and to facilitate desirable game species. We evaluated soil fungal community responses to long-term prescribed fire treatments in a loblolly pine forest on the Piedmont of Georgia and utilized deep Internal Transcribed Spacer Region 1 (ITS1) amplicon sequencing afforded by the recent Ion Torrent Personal Genome Machine (PGM). These deep sequence data (19,000+ reads per sample after subsampling) indicate that frequent fires (3-year fire interval) shift soil fungus communities, whereas infrequent fires (6-year fire interval) permit system resetting to a state similar to that without prescribed fire. Furthermore, in nonmetric multidimensional scaling analyses, primarily ectomycorrhizal taxa were correlated with axes associated with long fire intervals, whereas soil saprobes tended to be correlated with the frequent fire recurrence. We conclude that (1) multiplexed Ion Torrent PGM analyses allow deep, cost-effective sequencing of fungal communities but may suffer from short read lengths and inconsistent sequence quality adjacent to the sequencing adaptor; (2) frequent prescribed fires elicit a shift in soil fungal communities; and (3) such shifts do not occur when fire intervals are longer. Our results emphasize the general responsiveness of these forests to management, and the importance of fire return intervals in meeting management objectives. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  6. RNA-Seq analysis to capture the transcriptome landscape of a single cell

    PubMed Central

    Tang, Fuchou; Barbacioru, Catalin; Nordman, Ellen; Xu, Nanlan; Bashkirov, Vladimir I; Lao, Kaiqin; Surani, M. Azim

    2013-01-01

    We describe here a protocol for digital transcriptome analysis in a single mouse blastomere using a deep sequencing approach. An individual blastomere was first isolated and put into lysate buffer by mouth pipette. Reverse transcription was then performed directly on the whole cell lysate. After this, the free primers were removed by Exonuclease I and a poly(A) tail was added to the 3′ end of the first-strand cDNA by Terminal Deoxynucleotidyl Transferase. Then the single cell cDNAs were amplified by 20 plus 9 cycles of PCR. Then 100-200 ng of these amplified cDNAs were used to construct a sequencing library. The sequencing library can be used for deep sequencing using the SOLiD system. Compared with the cDNA microarray technique, our assay can capture up to 75% more genes expressed in early embryos. The protocol can generate deep sequencing libraries within 6 days for 16 single cell samples. PMID:20203668

  7. Deep sequencing reveals double mutations in cis of MPL exon 10 in myeloproliferative neoplasms.

    PubMed

    Pietra, Daniela; Brisci, Angela; Rumi, Elisa; Boggi, Sabrina; Elena, Chiara; Pietrelli, Alessandro; Bordoni, Roberta; Ferrari, Maurizio; Passamonti, Francesco; De Bellis, Gianluca; Cremonesi, Laura; Cazzola, Mario

    2011-04-01

    Somatic mutations of MPL exon 10, mainly involving a W515 substitution, have been described in JAK2 (V617F)-negative patients with essential thrombocythemia and primary myelofibrosis. We used direct sequencing and high-resolution melt analysis to identify mutations of MPL exon 10 in 570 patients with myeloproliferative neoplasms, and allele specific PCR and deep sequencing to further characterize a subset of mutated patients. Somatic mutations were detected in 33 of 221 patients (15%) with JAK2 (V617F)-negative essential thrombocythemia or primary myelofibrosis. Only one patient with essential thrombocythemia carried both JAK2 (V617F) and MPL (W515L). High-resolution melt analysis identified abnormal patterns in all the MPL mutated cases, while direct sequencing did not detect the mutant MPL in one fifth of them. In 3 cases carrying double MPL mutations, deep sequencing analysis showed identical load and location in cis of the paired lesions, indicating their simultaneous occurrence on the same chromosome.

  8. A-to-I RNA editing occurs at over a hundred million genomic sites, located in a majority of human genes.

    PubMed

    Bazak, Lily; Haviv, Ami; Barak, Michal; Jacob-Hirsch, Jasmine; Deng, Patricia; Zhang, Rui; Isaacs, Farren J; Rechavi, Gideon; Li, Jin Billy; Eisenberg, Eli; Levanon, Erez Y

    2014-03-01

    RNA molecules transmit the information encoded in the genome and generally reflect its content. Adenosine-to-inosine (A-to-I) RNA editing by ADAR proteins converts a genomically encoded adenosine into inosine. It is known that most RNA editing in human takes place in the primate-specific Alu sequences, but the extent of this phenomenon and its effect on transcriptome diversity are not yet clear. Here, we analyzed large-scale RNA-seq data and detected ∼1.6 million editing sites. As detection sensitivity increases with sequencing coverage, we performed ultradeep sequencing of selected Alu sequences and showed that the scope of editing is much larger than anticipated. We found that virtually all adenosines within Alu repeats that form double-stranded RNA undergo A-to-I editing, although most sites exhibit editing at only low levels (<1%). Moreover, using high coverage sequencing, we observed editing of transcripts resulting from residual antisense expression, doubling the number of edited sites in the human genome. Based on bioinformatic analyses and deep targeted sequencing, we estimate that there are over 100 million human Alu RNA editing sites, located in the majority of human genes. These findings set the stage for exploring how this primate-specific massive diversification of the transcriptome is utilized.
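
The statement that detection sensitivity increases with sequencing coverage can be made concrete with a simple binomial model. This is only a sketch under stated assumptions: reads are treated as independent, sequencing error is ignored, and a site counts as detected once a chosen minimum number of edited reads is seen. The function name and the default threshold are hypothetical and are not the authors' pipeline.

```python
from math import comb


def detection_probability(coverage, editing_level, min_edited_reads=1):
    """Probability of seeing at least `min_edited_reads` edited reads at a site.

    Assumes each of `coverage` independent reads carries the edit with
    probability `editing_level` (binomial model, no sequencing error).
    """
    p_too_few = sum(
        comb(coverage, k)
        * editing_level ** k
        * (1.0 - editing_level) ** (coverage - k)
        for k in range(min_edited_reads)
    )
    return 1.0 - p_too_few


# A site edited at 1% is usually missed at low depth and rarely at high depth.
p_at_100x = detection_probability(100, 0.01)
p_at_1000x = detection_probability(1000, 0.01)
```

Under these assumptions a site edited at 1% is seen at least once in roughly 63% of cases at 100x coverage, which is consistent with the abstract's point that low-level sites only become visible with ultradeep sequencing.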

  9. An Outbreak of Acute Hepatitis Caused by Genotype IB Hepatitis A Viruses Contaminating the Water Supply in Thailand.

    PubMed

    Ruchusatsawat, Kriangsak; Wongpiyabovorn, Jongkonnee; Kawidam, Chonthicha; Thiemsing, Laddawan; Sangkitporn, Somchai; Yoshizaki, Sayaka; Tatsumi, Masashi; Takeda, Naokazu; Ishii, Koji

    2016-01-01

    In 2000, an outbreak of acute hepatitis A was reported in a province adjacent to Bangkok, Thailand. The objective was to investigate the cause of the 2000 hepatitis A outbreak in Thailand using molecular epidemiological analysis. Serum and stool specimens were collected from patients who were clinically diagnosed with acute viral hepatitis. Water samples from drinking water and deep-drilled wells were also collected. These specimens were subjected to polymerase chain reaction (PCR) amplification and sequencing of the VP1/2A region of the hepatitis A virus (HAV) genome. The entire genome sequence of one of the fecal specimens was determined and phylogenetically analyzed together with known HAV sequences. Eleven of 24 fecal specimens collected from acute viral hepatitis patients were positive as determined by semi-nested reverse transcription PCR targeting the VP1/2A region of HAV. The nucleotide sequences of these samples had an identical genotype IB sequence, suggesting that the same causative agent was present. The complete nucleotide sequence derived from one of the samples indicated that the Thai genotype IB strain should be classified in a unique phylogenetic cluster. The analysis using an adjusted odds ratio showed that the consumption of groundwater was the most likely risk factor associated with the disease. © 2017 S. Karger AG, Basel.

  10. Co-infection and cross-species transmission of divergent Hepatocystis lineages in a wild African primate community.

    PubMed

    Thurber, Mary I; Ghai, Ria R; Hyeroba, David; Weny, Geoffrey; Tumukunde, Alex; Chapman, Colin A; Wiseman, Roger W; Dinis, Jorge; Steeil, James; Greiner, Ellis C; Friedrich, Thomas C; O'Connor, David H; Goldberg, Tony L

    2013-07-01

    Hemoparasites of the apicomplexan family Plasmodiidae include the etiological agents of malaria, as well as a suite of non-human primate parasites from which the human malaria agents evolved. Despite the significance of these parasites for global health, little information is available about their ecology in multi-host communities. Primates were investigated in Kibale National Park, Uganda, where ecological relationships among host species are well characterized. Blood samples were examined for parasites of the genera Plasmodium and Hepatocystis using microscopy and PCR targeting the parasite mitochondrial cytochrome b gene, followed by Sanger sequencing. To assess co-infection, "deep sequencing" of a variable region within cytochrome b was performed. Out of nine black-and-white colobus (Colobus guereza), one blue guenon (Cercopithecus mitis), five grey-cheeked mangabeys (Lophocebus albigena), 23 olive baboons (Papio anubis), 52 red colobus (Procolobus rufomitratus) and 12 red-tailed guenons (Cercopithecus ascanius), 79 infections (77.5%) were found, all of which were Hepatocystis spp. Sanger sequencing revealed 25 different parasite haplotypes that sorted phylogenetically into six species-specific but morphologically similar lineages. "Deep sequencing" revealed mixed-lineage co-infections in baboons and red colobus (41.7% and 64.7% of individuals, respectively) but not in other host species. One lineage infecting red colobus also infected baboons, but always as the minor variant, suggesting directional cross-species transmission. Hepatocystis parasites in this primate community are a diverse assemblage of cryptic lineages, some of which co-infect hosts and at least one of which can cross primate species barriers. Copyright © 2013 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.
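
The overall prevalence quoted in this abstract can be checked directly from the per-species sample sizes it lists. The short sketch below only repeats that arithmetic; the variable names are illustrative.

```python
# Number of animals sampled per host species, as listed in the abstract.
sampled_per_species = {
    "black-and-white colobus": 9,
    "blue guenon": 1,
    "grey-cheeked mangabey": 5,
    "olive baboon": 23,
    "red colobus": 52,
    "red-tailed guenon": 12,
}
infections_found = 79  # total infections reported in the abstract

total_sampled = sum(sampled_per_species.values())
prevalence_percent = 100.0 * infections_found / total_sampled
print(f"{infections_found}/{total_sampled} = {prevalence_percent:.1f}%")
```

The six sample sizes sum to 102 animals, and 79 of 102 rounds to the 77.5% given in the abstract.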

  11. RaptorX-Property: a web server for protein structure property prediction.

    PubMed

    Wang, Sheng; Li, Wei; Liu, Shiwang; Xu, Jinbo

    2016-07-08

    RaptorX Property (http://raptorx2.uchicago.edu/StructurePropertyPred/predict/) is a web server predicting structure property of a protein sequence without using any templates. It outperforms other servers, especially for proteins without close homologs in PDB or with very sparse sequence profile (i.e. carries little evolutionary information). This server employs a powerful in-house deep learning model DeepCNF (Deep Convolutional Neural Fields) to predict secondary structure (SS), solvent accessibility (ACC) and disorder regions (DISO). DeepCNF not only models complex sequence-structure relationship by a deep hierarchical architecture, but also interdependency between adjacent property labels. Our experimental results show that, tested on CASP10, CASP11 and the other benchmarks, this server can obtain ∼84% Q3 accuracy for 3-state SS, ∼72% Q8 accuracy for 8-state SS, ∼66% Q3 accuracy for 3-state solvent accessibility, and ∼0.89 area under the ROC curve (AUC) for disorder prediction. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
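
Q3 and Q8 accuracy are per-residue measures: the percentage of residues whose predicted state label matches the reference label. A minimal sketch of that calculation follows; the function name and the toy label strings are hypothetical and are not output from the RaptorX-Property server.

```python
def q_accuracy(predicted, observed):
    """Percentage of positions where the predicted state equals the observed state.

    Works for any label alphabet, e.g. 3-state (H, E, C) for Q3 or
    8-state secondary structure labels for Q8.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * matches / len(observed)


# Toy 3-state example: 7 of 8 residues agree.
toy_q3 = q_accuracy("HHHEECCC", "HHHEECCH")
```

Benchmark figures such as those quoted above are typically obtained by pooling residues over all proteins in a test set; whether the server averages per residue or per protein is not stated in the abstract.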

  12. Deep sequencing analysis of viral infection and evolution allows rapid and detailed characterization of viral mutant spectrum.

    PubMed

    Isakov, Ofer; Bordería, Antonio V; Golan, David; Hamenahem, Amir; Celniker, Gershon; Yoffe, Liron; Blanc, Hervé; Vignuzzi, Marco; Shomron, Noam

    2015-07-01

    The study of RNA virus populations is a challenging task. Each population of RNA virus is composed of a collection of different, yet related genomes often referred to as mutant spectra or quasispecies. Virologists using deep sequencing technologies face major obstacles when studying virus population dynamics, both experimentally and in natural settings due to the relatively high error rates of these technologies and the lack of high performance pipelines. In order to overcome these hurdles we developed a computational pipeline, termed ViVan (Viral Variance Analysis). ViVan is a complete pipeline facilitating the identification, characterization and comparison of sequence variance in deep sequenced virus populations. Applying ViVan on deep sequenced data obtained from samples that were previously characterized by more classical approaches, we uncovered novel and potentially crucial aspects of virus populations. With our experimental work, we illustrate how ViVan can be used for studies ranging from the more practical, detection of resistant mutations and effects of antiviral treatments, to the more theoretical temporal characterization of the population in evolutionary studies. Freely available on the web at http://www.vivanbioinfo.org. Contact: nshomron@post.tau.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  13. ChimerDB 3.0: an enhanced database for fusion genes from cancer transcriptome and literature data mining.

    PubMed

    Lee, Myunggyo; Lee, Kyubum; Yu, Namhee; Jang, Insu; Choi, Ikjung; Kim, Pora; Jang, Ye Eun; Kim, Byounggun; Kim, Sunkyu; Lee, Byungwook; Kang, Jaewoo; Lee, Sanghyuk

    2017-01-04

    Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules: The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB is a knowledgebase of 1066 manually curated fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive the fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Identification of ribonucleotide reductase mutation causing temperature-sensitivity of herpes simplex virus isolates from whitlow by deep sequencing.

    PubMed

    Daikoku, Tohru; Oyama, Yukari; Yajima, Misako; Sekizuka, Tsuyoshi; Kuroda, Makoto; Shimada, Yuka; Takehara, Kazuhiko; Miwa, Naoko; Okuda, Tomoko; Sata, Tetsutaro; Shiraki, Kimiyasu

    2015-06-01

    Herpes simplex virus 2 caused a genital ulcer, and a secondary herpetic whitlow appeared during acyclovir therapy. The secondary and recurrent whitlow isolates were acyclovir-resistant and temperature-sensitive in contrast to a genital isolate. We identified the ribonucleotide reductase mutation responsible for temperature-sensitivity by deep-sequencing analysis.

  15. Genome-Wide Identification of miRNAs Responsive to Drought in Peach (Prunus persica) by High-Throughput Deep Sequencing

    PubMed Central

    Eldem, Vahap; Çelikkol Akçay, Ufuk; Ozhuner, Esma; Bakır, Yakup; Uranbey, Serkan; Unver, Turgay

    2012-01-01

    Peach (Prunus persica L.) is one of the most important fresh fruits worldwide. Since fruit growth largely depends on adequate water supply, drought stress is considered the most important abiotic stress limiting fleshy fruit production and quality in peach. Plant responses to drought stress are regulated at both the transcriptional and post-transcriptional levels. As post-transcriptional gene regulators, microRNAs (miRNAs) are small (19–25 nucleotides in length), endogenous, non-coding RNAs. Recent studies indicate that miRNAs are involved in plant responses to drought. Therefore, Illumina deep sequencing technology was used for genome-wide identification of miRNAs and their expression profile in response to drought in peach. In this study, four sRNA libraries were constructed from leaf control (LC), leaf stress (LS), root control (RC) and root stress (RS) samples. We identified a total of 531, 471, 535 and 487 known mature miRNAs in LC, LS, RC and RS libraries, respectively. The expression level of 262 (104 up-regulated, 158 down-regulated) of the 453 miRNAs changed significantly in leaf tissue, whereas 368 (221 up-regulated, 147 down-regulated) of the 465 miRNAs had expression levels that changed significantly in root tissue upon drought stress. Additionally, a total of 197, 221, 238 and 265 novel miRNA precursor candidates were identified from LC, LS, RC and RS libraries, respectively. Target transcripts (137 for LC, 133 for LS, 148 for RC and 153 for RS) generated significant Gene Ontology (GO) terms related to DNA binding and catalytic activities. Genome-wide miRNA expression analysis of peach by a deep sequencing approach helped to expand our understanding of miRNA function in response to drought stress in peach and Rosaceae. A set of differentially expressed miRNAs could pave the way for developing new strategies to alleviate the adverse effects of drought stress on plant growth and development. PMID:23227166

  16. Diverse correlation patterns between microRNAs and their targets during tomato fruit development indicates different modes of microRNA actions.

    PubMed

    Lopez-Gomollon, Sara; Mohorianu, Irina; Szittya, Gyorgy; Moulton, Vincent; Dalmay, Tamas

    2012-12-01

    MicroRNAs negatively regulate the accumulation of mRNAs; therefore, when they are expressed in the same cells, their expression profiles show an inverse correlation. We previously described one positively correlated miRNA/target pair, but it is not known how widespread this phenomenon is. Here, we investigated the correlation between the expression profiles of differentially expressed miRNAs and their targets during tomato fruit development using deep sequencing, Northern blot and RT-qPCR. We found an equal number of positively and negatively correlated miRNA/target pairs, indicating that positive correlation is more frequent than previously thought. We also found that the correlation between microRNA and target expression profiles can vary between mRNAs belonging to the same gene family and even for the same target mRNA at different developmental stages. Since microRNAs always negatively regulate their targets, the high number of positively correlated microRNA/target pairs suggests that mutual exclusion could be as widespread as temporal regulation. The change of correlation during development suggests that the type of regulatory circuit directed by a microRNA can change over time and can be different for individual gene family members. Our results also highlight potential problems for expression profiling-based microRNA target identification/validation.
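
Classifying a miRNA/target pair as positively or negatively correlated amounts to computing a correlation coefficient between the two expression profiles across developmental stages. The sketch below uses the Pearson coefficient with a cutoff of 0.5; both choices are assumptions made for illustration, since the abstract does not state which statistic or threshold the authors used, and the function names and profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return covariance / (spread_x * spread_y)


def classify_pair(mirna_profile, target_profile, threshold=0.5):
    """Label a miRNA/target pair by the sign and strength of its correlation."""
    r = pearson(mirna_profile, target_profile)
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "uncorrelated"


# Hypothetical expression values at four fruit developmental stages.
inverse_pair = classify_pair([1, 2, 3, 4], [8, 6, 4, 2])
parallel_pair = classify_pair([1, 2, 3, 4], [1, 3, 2, 4])
```

Because the abstract notes that the correlation can change between developmental stages, a single coefficient over the whole time course may hide stage-specific behaviour; computing it over sub-ranges of stages would be one way to expose that.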

  17. Chiron: translating nanopore raw signal directly into nucleotide sequence using deep learning.

    PubMed

    Teng, Haotian; Cao, Minh Duc; Hall, Michael B; Duarte, Tania; Wang, Sheng; Coin, Lachlan J M

    2018-05-01

    Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology that offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling and directly translate the raw signal to DNA sequence without the error-prone segmentation step. Trained with only a small set of 4,000 reads, we show that our model provides state-of-the-art basecalling accuracy, even on previously unseen species. Chiron achieves basecalling speeds of more than 2,000 bases per second using desktop computer graphics processing units.

  18. Finding a needle in the virus metagenome haystack--micro-metagenome analysis captures a snapshot of the diversity of a bacteriophage armoire.

    PubMed

    Ray, Jessica; Dondrup, Michael; Modha, Sejal; Steen, Ida Helene; Sandaa, Ruth-Anne; Clokie, Martha

    2012-01-01

    Viruses are ubiquitous in the oceans and critical components of marine microbial communities, regulating nutrient transfer to higher trophic levels or to the dissolved organic pool through lysis of host cells. Hydrothermal vent systems are oases of biological activity in the deep oceans, for which knowledge of biodiversity and its impact on global ocean biogeochemical cycling is still in its infancy. In order to gain biological insight into viral communities present in hydrothermal vent systems, we developed a method based on deep-sequencing of pulsed field gel electrophoretic bands representing key viral fractions present in seawater within and surrounding a hydrothermal plume derived from Loki's Castle vent field at the Arctic Mid-Ocean Ridge. The reduction in virus community complexity afforded by this novel approach enabled the near-complete reconstruction of a lambda-like phage genome from the virus fraction of the plume. Phylogenetic examination of distinct gene regions in this lambdoid phage genome unveiled diversity at loci encoding superinfection exclusion- and integrase-like proteins. This suggests the importance of fine-tuning lysogenic conversion as a viral survival strategy, and provides insights into the nature of host-virus and virus-virus interactions within hydrothermal plumes. By reducing the complexity of the viral community through targeted sequencing of prominent dsDNA viral fractions, this method has selectively mimicked virus dominance approaching that hitherto achieved only through culturing, thus enabling bioinformatic analysis to locate a lambdoid viral "needle" within the greater viral community "haystack". Such targeted analyses have great potential for accelerating the extraction of biological knowledge from diverse and poorly understood environmental viral communities.

  19. SC1 Promotes MiR124-3p Expression to Maintain the Self-Renewal of Mouse Embryonic Stem Cells by Inhibiting the MEK/ERK Pathway.

    PubMed

    Wei, Qing; Liu, Hongliang; Ai, Zhiying; Wu, Yongyan; Liu, Yingxiang; Shi, Zhaopeng; Ren, Xuexue; Guo, Zekun

    2017-01-01

    Self-renewal is one of the most important features of embryonic stem (ES) cells. SC1 is a small molecule modulator that effectively maintains the self-renewal of mouse ES cells in the absence of leukemia inhibitory factor (LIF), serum and feeder cells. However, the mechanism by which SC1 maintains the undifferentiated state of mouse ES cells remains unclear. In this study, microarray and small RNA deep-sequencing experiments were performed on mouse ES cells treated with or without SC1 to identify the key genes and microRNAs that contributed to self-renewal. SC1 regulates the expression of pluripotency and differentiation factors, and antagonizes retinoic acid (RA)-induced differentiation in the presence or absence of LIF. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis and pathway reporter experiments showed that SC1 inhibits the MEK/ERK pathway. Small RNA deep-sequencing revealed that SC1 significantly modulates the expression of multiple microRNAs with crucial functions in ES cells. The expression of miR124-3p is upregulated in SC1-treated ES cells, which significantly inhibits the MEK/ERK pathway by targeting Grb2, Sos2 and Egr1. SC1 enhances the self-renewal capacity of mouse ES cells by modulating the expression of key regulatory genes and pluripotency-associated microRNAs. SC1 significantly upregulates miR124-3p expression to further inhibit the MEK/ERK pathway by targeting Grb2, Sos2 and Egr1. © 2017 The Author(s). Published by S. Karger AG, Basel.

  20. Fiber tractography of the axonal pathways linking the basal ganglia and cerebellum in Parkinson disease: implications for targeting in deep brain stimulation.

    PubMed

    Sweet, Jennifer A; Walter, Benjamin L; Gunalan, Kabilar; Chaturvedi, Ashutosh; McIntyre, Cameron C; Miller, Jonathan P

    2014-04-01

    Stimulation of white matter pathways near targeted structures may contribute to therapeutic effects of deep brain stimulation (DBS) for patients with Parkinson disease (PD). Two tracts linking the basal ganglia and cerebellum have been described in primates: the subthalamopontocerebellar tract (SPCT) and the dentatothalamic tract (DTT). The authors used fiber tractography to evaluate white matter tracts that connect the cerebellum to the region of the basal ganglia in patients with PD who were candidates for DBS. Fourteen patients with advanced PD underwent 3-T MRI, including 30-directional diffusion-weighted imaging sequences. Diffusion tensor tractography was performed using 2 regions of interest: ipsilateral subthalamic and red nuclei, and contralateral cerebellar hemisphere. Nine patients underwent subthalamic DBS, and the course of each tract was observed relative to the location of the most effective stimulation contact and the volume of tissue activated. In all patients 2 distinct tracts were identified that corresponded closely to the described anatomical features of the SPCT and DTT, respectively. The mean overall distance from the active contact to the DTT was 2.18 ± 0.35 mm, and the mean proportional distance relative to the volume of tissue activated was 1.35 ± 0.48. There was a nonsignificant trend toward better postoperative tremor control in patients with electrodes closer to the DTT. The SPCT and the DTT may be related to the expression of symptoms in PD, and this may have implications for DBS targeting. The use of tractography to identify the DTT might assist with DBS targeting in the future.

  1. MicroRNA-944 Affects Cell Growth by Targeting EPHA7 in Non-Small Cell Lung Cancer.

    PubMed

    Liu, Minxia; Zhou, Kecheng; Cao, Yi

    2016-09-26

    MicroRNAs (miRNAs) have critical roles in lung tumorigenesis and development. To determine aberrantly expressed miRNAs involved in non-small cell lung cancer (NSCLC) and investigate pathophysiological functions and mechanisms, we first carried out small RNA deep sequencing in NSCLC cell lines (EPLC-32M1, A549 and 801D) and a human immortalized cell line, 16HBE. We then studied miRNA function through cell proliferation and apoptosis assays. cDNA microarray, luciferase reporter assay and miRNA transfection were used to investigate the interaction between the miRNA and its target gene. miR-944 was significantly down-regulated in NSCLC and had many putative targets. Moreover, the forced expression of miR-944 significantly inhibited the proliferation of NSCLC cells in vitro. By integrating mRNA expression data and miR-944 target prediction, we found that EPHA7 was a potential target of miR-944, which was further verified by luciferase reporter assay and microRNA transfection. Our data indicated that miR-944 targets EPHA7 in NSCLC and regulates NSCLC cell proliferation, which may offer a new mechanism underlying the development and progression of NSCLC.

  2. Biasing genome-editing events toward precise length deletions with an RNA-guided TevCas9 dual nuclease.

    PubMed

    Wolfs, Jason M; Hamilton, Thomas A; Lant, Jeremy T; Laforet, Marcon; Zhang, Jenny; Salemi, Louisa M; Gloor, Gregory B; Schild-Poulter, Caroline; Edgell, David R

    2016-12-27

    The CRISPR/Cas9 nuclease is commonly used to make gene knockouts. The blunt DNA ends generated by cleavage can be efficiently ligated by the classical nonhomologous end-joining repair pathway (c-NHEJ), regenerating the target site. This repair creates a cycle of cleavage, ligation, and target site regeneration that persists until sufficient modification of the DNA break by alternative NHEJ prevents further Cas9 cutting, generating a heterogeneous population of insertions and deletions typical of gene knockouts. Here, we develop a strategy to escape this cycle and bias events toward defined length deletions by creating an RNA-guided dual active site nuclease that generates two noncompatible DNA breaks at a target site, effectively deleting the majority of the target site such that it cannot be regenerated. The TevCas9 nuclease, a fusion of the I-TevI nuclease domain to Cas9, functions robustly in HEK293 cells and generates 33- to 36-bp deletions at frequencies up to 40%. Deep sequencing revealed minimal processing of TevCas9 products, consistent with protection of the DNA ends from exonucleolytic degradation and repair by the c-NHEJ pathway. Directed evolution experiments identified I-TevI variants with broadened targeting range, making TevCas9 an easy-to-use reagent. Our results highlight how the sequence-tolerant cleavage properties of the I-TevI homing endonuclease can be harnessed to enhance Cas9 applications, circumventing the cleavage and ligation cycle and biasing genome-editing events toward defined length deletions.

  3. Genomic complexity and dynamics of clonal evolution in childhood acute myeloid leukemia studied with whole-exome sequencing.

    PubMed

    Masetti, Riccardo; Castelli, Ilaria; Astolfi, Annalisa; Bertuccio, Salvatore Nicola; Indio, Valentina; Togni, Marco; Belotti, Tamara; Serravalle, Salvatore; Tarantino, Giuseppe; Zecca, Marco; Pigazzi, Martina; Basso, Giuseppe; Pession, Andrea; Locatelli, Franco

    2016-08-30

    Despite significant improvement in treatment of childhood acute myeloid leukemia (AML), 30% of patients experience disease recurrence, which is still the major cause of treatment failure and death in these patients. To investigate molecular mechanisms underlying relapse, we performed whole-exome sequencing of diagnosis-relapse pairs and matched remission samples from 4 pediatric AML patients without recurrent cytogenetic alterations. Candidate driver mutations were selected for targeted deep sequencing at high coverage, suitable to detect small subclones (0.12%). BiCEBPα mutation was found to be stable and highly penetrant, representing a separate biological and clinical entity, unlike WT1 mutations, which were extremely unstable. Among the mutational patterns underlying relapse, we detected the acquisition of proliferative advantage by signaling activation (PTPN11 and FLT3-TKD mutations) and the increased resistance to apoptosis (hyperactivation of TYK2). We also found a previously undescribed feature of AML, consisting of a hypermutator phenotype caused by SETD2 inactivation. The consequent accumulation of new mutations promotes the adaptability of the leukemia, contributing to clonal selection. We report a novel ASXL3 mutation characterizing a very small subclone (<1%) present at diagnosis and undergoing expansion (60%) at relapse. Taken together, these findings provide molecular clues for designing optimal therapeutic strategies, in terms of target selection, adequate schedule design and reliable response-monitoring techniques.
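
    The link between "high coverage" and the ability to detect a 0.12% subclone can be made concrete with a simple binomial model. The sketch below is illustrative only and is not the authors' pipeline: it assumes each read independently carries the variant with probability equal to the variant allele fraction, ignores sequencing error, and uses a hypothetical calling rule (a minimum number of supporting reads) together with a hypothetical depth grid.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) when reads ~ Binomial(depth, vaf)."""
    # Sum the probabilities of seeing fewer than min_alt_reads variant reads.
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


def min_depth_for_detection(vaf, min_alt_reads, power=0.95,
                            step=500, max_depth=200_000):
    """Smallest depth on a coarse grid (multiples of step) reaching the power."""
    depth = step
    while depth <= max_depth:
        if detection_probability(depth, vaf, min_alt_reads) >= power:
            return depth
        depth += step
    return None  # not reachable within max_depth


if __name__ == "__main__":
    # Hypothetical rule: call a variant when at least 5 reads support it.
    depth = min_depth_for_detection(vaf=0.0012, min_alt_reads=5)
    print("approximate depth needed:", depth)
```

    Because sequencing error is ignored, the depth this returns is a lower bound; in practice the background error rate at each position limits how small a subclone can be called.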

  4. Massive Analysis of Rice Small RNAs: Mechanistic Implications of Regulated MicroRNAs and Variants for Differential Target RNA Cleavage

    PubMed Central

    Jeong, Dong-Hoon; Park, Sunhee; Zhai, Jixian; Gurazada, Sai Guna Ranjan; De Paoli, Emanuele; Meyers, Blake C.; Green, Pamela J.

    2011-01-01

    Small RNAs have a variety of important roles in plant development, stress responses, and other processes. They exert their influence by guiding mRNA cleavage, translational repression, and chromatin modification. To identify previously unknown rice (Oryza sativa) microRNAs (miRNAs) and those regulated by environmental stress, 62 small RNA libraries were constructed from rice plants and used for deep sequencing with Illumina technology. The libraries represent several tissues from control plants and plants subjected to different environmental stress treatments. More than 94 million genome-matched reads were obtained, resulting in more than 16 million distinct small RNA sequences. This allowed an evaluation of ~400 annotated miRNAs with current criteria and the finding that among these, ~150 had small interfering RNA–like characteristics. Seventy-six new miRNAs were found, and miRNAs regulated in response to water stress, nutrient stress, or temperature stress were identified. Among the new examples of miRNA regulation were members of the same miRNA family that were differentially regulated in different organs and had distinct sequences. Some of these distinct family members result in differential target cleavage and provide new insight about how an agriculturally important rice phenotype could be regulated in the panicle. This high-resolution analysis of rice miRNAs should be relevant to plant miRNAs in general, particularly in the Poaceae. PMID:22158467

  5. Identification of novel and conserved microRNAs related to drought stress in potato by deep sequencing.

    PubMed

    Zhang, Ning; Yang, Jiangwei; Wang, Zemin; Wen, Yikai; Wang, Jie; He, Wenhui; Liu, Bailin; Si, Huaijun; Wang, Di

    2014-01-01

    MicroRNAs (miRNAs) are a group of small, non-coding RNAs that play important roles in plant growth, development and stress response. There have been an increasing number of investigations aimed at discovering miRNAs and analyzing their functions in model plants (such as Arabidopsis thaliana and rice). In this research, we constructed small RNA libraries from both polyethylene glycol (PEG 6,000) treated and control potato samples, and a large number of known and novel miRNAs were identified. Differential expression analysis showed that 100 of the known miRNAs were down-regulated and 99 were up-regulated as a result of PEG stress, while 119 of the novel miRNAs were up-regulated and 151 were down-regulated. Based on target prediction, annotation and expression analysis of the miRNAs and their putative target genes, 4 miRNAs were identified as regulating drought-related genes (miR811, miR814, miR835, miR4398). Their target genes were MYB transcription factor (CV431094), hydroxyproline-rich glycoprotein (TC225721), aquaporin (TC223412) and WRKY transcription factor (TC199112), respectively. Relative expression trends of those miRNAs were the same as those predicted by Solexa sequencing and they showed a negative correlation with the expression of the target genes. The results provide molecular evidence for the possible involvement of miRNAs in the process of drought response and/or tolerance in the potato plant.

  6. Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks

    PubMed Central

    Avsec, Žiga; Cheng, Jun; Gagneur, Julien

    2018-01-01

    Motivation: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed. Results: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 120 out of 123 proteins. We also developed a deep neural network for human splice branchpoint based on spline transformations that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox. Availability and implementation: Spline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at https://github.com/gagneurlab/Manuscript_Avsec_Bioinformatics_2017. Contact: avsec@in.tum.de or gagneur@in.tum.de. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:29155928
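
    The idea of a spline transformation can be illustrated without any deep learning framework: a scalar distance is expanded into B-spline basis values, and the transformed value is a weighted sum of those basis values, with the weights being the trainable part. The sketch below is a pure-Python illustration under stated assumptions and is not the CONCISE implementation: the helper names (clamped_knots, bspline_basis, spline_transform), the choice of cubic splines, and the assumption that distances are pre-scaled to [0, 1] are all hypothetical.

```python
def clamped_knots(lo, hi, n_basis, degree):
    """Knot vector with repeated end knots so the basis spans [lo, hi]."""
    n_interior = n_basis - degree - 1
    step = (hi - lo) / (n_interior + 1)
    interior = [lo + step * (j + 1) for j in range(n_interior)]
    return [lo] * (degree + 1) + interior + [hi] * (degree + 1)


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    n_intervals = len(knots) - 1
    # Degree 0: indicator of the half-open knot interval containing x.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(n_intervals)]
    if x == knots[-1]:
        # Include the right endpoint in the last non-empty interval.
        for i in reversed(range(n_intervals)):
            if knots[i] < knots[i + 1]:
                basis[i] = 1.0
                break
    for d in range(1, degree + 1):
        raised = []
        for i in range(n_intervals - d):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            # Terms with a zero-width support are defined as zero.
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            raised.append(left + right)
        basis = raised
    return basis


def spline_transform(x, weights, knots, degree=3):
    """Weighted sum of basis values; in a network the weights would be learned."""
    return sum(w * b for w, b in zip(weights, bspline_basis(x, knots, degree)))


if __name__ == "__main__":
    knots = clamped_knots(0.0, 1.0, n_basis=7, degree=3)
    weights = [0.0, 0.5, 1.0, 0.2, -0.3, 0.0, 0.1]  # illustrative values only
    for distance in (0.0, 0.25, 0.6, 1.0):
        print(distance, spline_transform(distance, weights, knots))
```

    Because the basis functions are smooth and each is non-zero only over a few knot intervals, a small number of weights can describe a flexible positional effect, which is consistent with the abstract's comparison against piecewise linear transformations.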

  7. MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

    PubMed

    Fang, Chao; Shang, Yi; Xu, Dong

    2018-05-01

    Protein secondary structure prediction can provide important information for protein 3D structure prediction and protein functions. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acid, as well as the context of the protein sequence. Specifically, the feature matrix is a composition of physio-chemical properties of amino acids, PSI-BLAST profile, and HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structures. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids in making accurate prediction. In extensive experiments on multiple datasets, MUFOLD-SS outperformed the best existing methods and other deep neural networks significantly. MUFold-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html. © 2018 Wiley Periodicals, Inc.

  8. Deep Sequencing of Plant and Animal DNA Contained within Traditional Chinese Medicines Reveals Legality Issues and Health Safety Concerns

    PubMed Central

    Coghlan, Megan L.; Haile, James; Houston, Jayne; Murray, Dáithí C.; White, Nicole E.; Moolhuijzen, Paula; Bellgard, Matthew I.; Bunce, Michael

    2012-01-01

    Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. 
    This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety especially when plant reference databases become better established. PMID:22511890

  9. Deep sequencing and genome-wide analysis reveals the expansion of MicroRNA genes in the gall midge Mayetiola destructor

    PubMed Central

    2013-01-01

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that play critical roles in regulating post-transcriptional gene expression. Gall midges encompass a large group of insects that are of economic importance and also possess fascinating biological traits. The gall midge Mayetiola destructor, commonly known as the Hessian fly, is a destructive pest of wheat and a model organism for studying gall midge biology and insect – host plant interactions. Results: In this study, we systematically analyzed miRNAs from the Hessian fly. Deep-sequencing a Hessian fly larval transcriptome led to the identification of 89 miRNA species that are either identical or very similar to known miRNAs from other insects, and 184 novel miRNAs that have not been reported from other species. A genome-wide search through a draft Hessian fly genome sequence identified a total of 611 putative miRNA-encoding genes based on sequence similarity and the existence of a stem-loop structure for miRNA precursors. Analysis of the 611 putative genes revealed a striking feature: the dramatic expansion of several miRNA gene families. The largest family contained 91 genes that encoded 20 different miRNAs. Microarray analyses revealed that the expression of miRNA genes was strictly regulated during Hessian fly larval development and that the abundance of many miRNAs was affected by host genotypes. Conclusion: The identification of a large number of miRNAs for the first time from a gall midge provides a foundation for further studies of miRNA functions in gall midge biology and behavior. The dramatic expansion of identical or similar miRNAs provides a unique system to study functional relations among miRNA iso-genes as well as changes in sequence specificity due to small changes in miRNAs and in their mRNA targets. These results may also facilitate the identification of miRNA genes for potential pest control through transgenic approaches. PMID:23496979

  10. The Causality of Evolution on Different Fitness Landscapes

    NASA Astrophysics Data System (ADS)

    Vyawahare, Saurabh; Austin, Robert; Zhang, Qiucen; Kim, Hyunsung; Bestoso, John

    2013-03-01

    Evolution of antibiotic resistance is a growing problem. One major reason why most antibiotics fail is mutations in drug targets (e.g. essential enzymes). Sequencing of clinically resistant isolates has shown that multiple mutational hotspots exist in coding regions, which could potentially prohibit the binding of drugs. However, it is not clear whether the appearance of each mutation is random or influenced by other factors. In this paper, we compare evolution of resistance to ciprofloxacin from two distinct but well characterized genetic backgrounds. By combining our recently developed evolution reactor and deep whole-genome sequencing, we show that different alleles of the σs factor lead to fixation of different mutations in the gyrA gene that confer ciprofloxacin resistance on the bacterium Escherichia coli. Such causality of evolution in different genes provides an opportunity to control the evolution of antibiotic resistance. Sponsored by the NCI/NIH Physical Sciences Oncology Centers

  11. Fungal diversity in deep-sea sediments of a hydrothermal vent system in the Southwest Indian Ridge

    NASA Astrophysics Data System (ADS)

    Xu, Wei; Gong, Lin-feng; Pang, Ka-Lai; Luo, Zhu-Hua

    2018-01-01

    Deep-sea hydrothermal sediment is known to support remarkably diverse microbial consortia. In deep-sea environments, fungal communities remain less studied despite their known taxonomic and functional diversity. High-throughput sequencing methods have augmented our capacity to assess eukaryotic diversity and their functions in microbial ecology. Here we provide the first description of the fungal community diversity found in deep-sea sediments collected at the Southwest Indian Ridge (SWIR) using culture-dependent and high-throughput sequencing approaches. A total of 138 fungal isolates were cultured from seven different sediment samples using various nutrient media, and these isolates were identified to 14 fungal taxa, including 11 Ascomycota taxa (7 genera) and 3 Basidiomycota taxa (2 genera) based on internal transcribed spacers (ITS1, ITS2 and 5.8S) of rDNA. Using Illumina HiSeq sequencing, a total of 757,467 fungal ITS2 tags were recovered from the samples and clustered into 723 operational taxonomic units (OTUs) belonging to 79 taxa (Ascomycota and Basidiomycota contributed to 99% of all samples) based on 97% sequence similarity. Results from both approaches suggest that there is a high fungal diversity in the deep-sea sediments collected in the SWIR, and fungal communities were shown to be slightly different by location, although all were collected from adjacent sites at the SWIR. This study provides baseline data on fungal diversity and biogeography, and a glimpse into the microbial ecology associated with the deep-sea sediments of the hydrothermal vent system of the Southwest Indian Ridge.
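
    Clustering tags into OTUs at 97% sequence similarity can be pictured as a greedy centroid procedure: each read joins the first existing cluster whose representative it matches at or above the threshold, and otherwise founds a new cluster. The sketch below is a toy illustration and not the pipeline used in the study: it assumes reads are already aligned and of equal length, measures identity as the fraction of matching positions, and uses hypothetical function names.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("this toy metric expects equal-length, pre-aligned reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Assign each read to the first centroid it matches at >= threshold identity."""
    centroids = []  # one representative sequence per OTU
    members = []    # reads assigned to each OTU, in the same order as centroids
    for read in reads:
        for idx, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                members[idx].append(read)
                break
        else:
            # No centroid was close enough, so this read founds a new OTU.
            centroids.append(read)
            members.append([read])
    return centroids, members


if __name__ == "__main__":
    base = "ACGT" * 25               # 100-nt toy sequence
    near = "TT" + base[2:]           # 2 mismatches, 98% identity to base
    far = "G" * 12 + base[12:]       # 9 mismatches, 91% identity to base
    centroids, members = greedy_otus([base, near, far])
    print(len(centroids), [len(m) for m in members])
```

    The result of a greedy scheme depends on input order, which is why practical tools typically sort reads (for example by abundance) before clustering; that step is omitted here.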

  12. Pyrosequencing analysis of microbial communities reveals dominant cosmopolitan phylotypes in deep-sea sediments of the eastern Mediterranean Sea.

    PubMed

    Polymenakou, Paraskevi N; Christakis, Christos A; Mandalakis, Manolis; Oulas, Anastasis

    2015-06-01

    The deep eastern basin of the Mediterranean Sea is considered to be one of the most oligotrophic areas in the world. Here we performed pyrosequencing analysis of bacterial and archaeal communities in oxic nutrient-poor sediments collected from the eastern Mediterranean at 1025-4393 m depth. Microbial communities were surveyed by targeting the hypervariable V5-V6 regions of the 16S ribosomal RNA gene using bar-coded pyrosequencing. With a total of 13,194 operational taxonomic units (OTUs) or phylotypes at 97% sequence similarity, the phylogenetic affiliation of microbes was assigned to 23 bacterial and 2 archaeal known phyla, 23 candidate divisions at the phylum level and distributed into 186 families. It was further revealed that the microbial consortia inhabiting all sampling sites were highly diverse, but dominated by phylotypes closely related to members of the genus Pseudomonas and Marine Group I archaea. Such pronounced and widespread enrichment probably manifests the cosmopolitan character of these species and raises questions about their metabolic adaptation to the physical stressors and low nutrient availability of the deep eastern Mediterranean Sea. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  13. Expanding anchored hybrid enrichment to resolve both deep and shallow relationships within the spider tree of life.

    PubMed

    Hamilton, Chris A; Lemmon, Alan R; Lemmon, Emily Moriarty; Bond, Jason E

    2016-10-13

    Despite considerable effort, progress in spider molecular systematics has lagged behind many other comparable arthropod groups, thereby hindering family-level resolution, classification, and testing of important macroevolutionary hypotheses. Recently, alternative targeted sequence capture techniques have provided molecular systematics a powerful tool for resolving relationships across the Tree of Life. One of these approaches, Anchored Hybrid Enrichment (AHE), is designed to recover hundreds of unique orthologous loci from across the genome, for resolving both shallow and deep-scale evolutionary relationships within non-model systems. Herein we present a modification of the AHE approach that expands its use for application in spiders, with a particular emphasis on the infraorder Mygalomorphae. Our aim was to design a set of probes that effectively capture loci informative at a diversity of phylogenetic timescales. Following identification of putative arthropod-wide loci, we utilized homologous transcriptome sequences from 17 species across all spiders to identify exon boundaries. Conserved regions with variable flanking regions were then sought across the tick genome, three published araneomorph spider genomes, and raw genomic reads of two mygalomorph taxa. Following development of the 585 target loci in the Spider Probe Kit, we applied AHE across three taxonomic depths to evaluate performance: deep-level spider family relationships (33 taxa, 327 loci); family and generic relationships within the mygalomorph family Euctenizidae (25 taxa, 403 loci); and species relationships in the North American tarantula genus Aphonopelma (83 taxa, 581 loci). At the deepest level, all three major spider lineages (the Mesothelae, Mygalomorphae, and Araneomorphae) were supported with high bootstrap support. Strong support was also found throughout the Euctenizidae, including generic relationships within the family and species relationships within the genus Aptostichus. 
    As in the Euctenizidae, virtually identical topologies were inferred with high support throughout Aphonopelma. The Spider Probe Kit, the first implementation of AHE methodology in Class Arachnida, holds great promise for gathering the types and quantities of molecular data needed to accelerate an understanding of the spider Tree of Life by providing a mechanism whereby different researchers can confidently and effectively use the same loci for independent projects, yet allowing synthesis of data across independent research groups.

  14. Design and assessment of engineered CRISPR-Cpf1 and its use for genome editing.

    PubMed

    Li, Bin; Zeng, Chunxi; Dong, Yizhou

    2018-05-01

    Cpf1, a CRISPR endonuclease discovered in Prevotella and Francisella 1 bacteria, offers an alternative platform for CRISPR-based genome editing beyond the commonly used CRISPR-Cas9 system originally discovered in Streptococcus pyogenes. This protocol enables the design of engineered CRISPR-Cpf1 components, both CRISPR RNAs (crRNAs) to guide the endonuclease and Cpf1 mRNAs to express the endonuclease protein, and provides experimental procedures for effective genome editing using this system. We also describe quantification of genome-editing activity and off-target effects of the engineered CRISPR-Cpf1 in human cell lines using both T7 endonuclease I (T7E1) assay and targeted deep sequencing. This protocol enables rapid construction and identification of engineered crRNAs and Cpf1 mRNAs to enhance genome-editing efficiency using the CRISPR-Cpf1 system, as well as assessment of target specificity within 2 months. This protocol may also be appropriate for fine-tuning other types of CRISPR systems.

  15. BUSCA: an integrative web server to predict subcellular localization of proteins.

    PubMed

    Savojardo, Castrense; Martelli, Pier Luigi; Fariselli, Piero; Profiti, Giuseppe; Casadio, Rita

    2018-04-30

    Here, we present BUSCA (http://busca.biocomp.unibo.it), a novel web server that integrates different computational tools for predicting protein subcellular localization. BUSCA combines methods for identifying signal and transit peptides (DeepSig and TPpred3), GPI-anchors (PredGPI) and transmembrane domains (ENSEMBLE3.0 and BetAware) with tools for discriminating subcellular localization of both globular and membrane proteins (BaCelLo, MemLoci and SChloro). Outcomes from the different tools are processed and integrated for annotating subcellular localization of both eukaryotic and bacterial protein sequences. We benchmark BUSCA against protein targets derived from recent CAFA experiments and other specific data sets, reporting performance at the state-of-the-art. BUSCA scores better than all other evaluated methods on 2732 targets from CAFA2, with a F1 value equal to 0.49 and among the best methods when predicting targets from CAFA3. We propose BUSCA as an integrated and accurate resource for the annotation of protein subcellular localization.
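
    For readers unfamiliar with the reported metric, F1 is the harmonic mean of precision and recall. The sketch below shows the calculation from raw counts; the counts in the usage example are hypothetical and chosen only so that the result equals 0.49, and the abstract does not state how the score was aggregated over targets or thresholds.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall computed from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 49 correct annotations, 51 spurious, 51 missed.
    print(round(f1_score(tp=49, fp=51, fn=51), 2))
```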

  16. DNA-based species level detection of Glomeromycota: one PCR primer set for all arbuscular mycorrhizal fungi.

    PubMed

    Krüger, Manuela; Stockinger, Herbert; Krüger, Claudia; Schüssler, Arthur

    2009-01-01

    * At present, molecular ecological studies of arbuscular mycorrhizal fungi (AMF) are only possible above species level when targeting entire communities. To improve molecular species characterization and to allow species level community analyses in the field, a set of newly designed AMF specific PCR primers was successfully tested. * Nuclear rDNA fragments from diverse phylogenetic AMF lineages were sequenced and analysed to design four primer mixtures, each targeting one binding site in the small subunit (SSU) or large subunit (LSU) rDNA. To allow species resolution, they span a fragment covering the partial SSU, whole internal transcribed spacer (ITS) rDNA region and partial LSU. * The new primers are suitable for specifically amplifying AMF rDNA from material that may be contaminated by other organisms (e.g., samples from pot cultures or the field), characterizing the diversity of AMF species from field samples, and amplifying a SSU-ITS-LSU fragment that allows phylogenetic analyses with species level resolution. * The PCR primers can be used to monitor entire AMF field communities, based on a single rDNA marker region. Their application will improve the base for deep sequencing approaches; moreover, they can be efficiently used as DNA barcoding primers.

  17. Genome-wide computational identification of microRNAs and their targets in the deep-branching eukaryote Giardia lamblia.

    PubMed

    Zhang, Yan-Qiong; Chen, Dong-Liang; Tian, Hai-Feng; Zhang, Bao-Hong; Wen, Jian-Fan

    2009-10-01

    Using a combined computational program, we identified 50 potential microRNAs (miRNAs) in Giardia lamblia, one of the most primitive unicellular eukaryotes. These miRNAs are unique to G. lamblia and no homologues have been found in other organisms; miRNAs, currently known in other species, were not found in G. lamblia. This suggests that miRNA biogenesis and miRNA-mediated gene regulation pathway may evolve independently, especially in evolutionarily distant lineages. A majority (43) of the predicted miRNAs are located at one single locus; however, some miRNAs have two or more copies in the genome. Among the 58 miRNA genes, 28 are located in the intergenic regions whereas 30 are present in the anti-sense strands of the protein-coding sequences. Five predicted miRNAs are expressed in G. lamblia trophozoite cells evidenced by expressed sequence tags or RT-PCR. Thirty-seven identified miRNAs may target 50 protein-coding genes, including seven variant-specific surface proteins (VSPs). Our findings provide a clue that miRNA-mediated gene regulation may exist in the early stage of eukaryotic evolution, suggesting that it is an important regulation system ubiquitous in eukaryotes.

  18. Deep Sequencing of Random Mutant Libraries Reveals the Active Site of the Narrow Specificity CphA Metallo-β-Lactamase is Fragile to Mutations.

    PubMed

    Sun, Zhizeng; Mehta, Shrenik C; Adamski, Carolyn J; Gibbs, Richard A; Palzkill, Timothy

    2016-09-12

    CphA is a Zn(2+)-dependent metallo-β-lactamase that efficiently hydrolyzes only carbapenem antibiotics. To understand the sequence requirements for CphA function, single codon random mutant libraries were constructed for residues in and near the active site and mutants were selected for E. coli growth on increasing concentrations of imipenem, a carbapenem antibiotic. At high concentrations of imipenem that select for phenotypically wild-type mutants, the active-site residues exhibit stringent sequence requirements in that nearly all residues in positions that contact zinc, the substrate, or the catalytic water do not tolerate amino acid substitutions. In addition, at high imipenem concentrations a number of residues that do not directly contact zinc or substrate are also essential and do not tolerate substitutions. Biochemical analysis confirmed that amino acid substitutions at essential positions decreased the stability or catalytic activity of the CphA enzyme. Therefore, the CphA active site is fragile to substitutions, suggesting active-site residues are optimized for imipenem hydrolysis. These results also suggest that resistance to inhibitors targeted to the CphA active site would be slow to develop because of the strong sequence constraints on function.

  19. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    PubMed Central

    2010-01-01

    Background: Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of freshly collected B. azoricus transcriptome using gill tissues as the primary source of immune transcripts given their strategic role in filtering the surrounding waterborne potentially infectious microorganisms. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results: A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino-sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned with Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR) and their RNA transcription level by quantitative PCR (qPCR) experiments. 
Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences which can be applied in gene expression profiling experiments from a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular, genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptations processes during post-capture long term acclimatization experiments, at "sea-level" conditions, using B. azoricus as a model organism. PMID:20937131

  20. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    PubMed

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net, Source code: https://github.com/bio-ontology-research-group/deepgo. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
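
    The abstract above says the model's output follows the structure of the GO and uses the dependencies between GO classes. One simple way to illustrate that idea is to post-process per-class scores so that a parent class never scores lower than any of its descendants. The sketch below is an illustration of that consistency rule only, not the authors' implementation; the class names, the `parents` mapping, and the scores are made-up toy values.

```python
from typing import Dict, List


def propagate_scores(scores: Dict[str, float],
                     parents: Dict[str, List[str]]) -> Dict[str, float]:
    """Raise every ancestor's score to at least the score of its descendants.

    `parents` maps a class to its direct parent classes (a toy DAG).
    """
    result = dict(scores)
    changed = True
    # Repeat until stable; scores only ever increase, so this terminates.
    while changed:
        changed = False
        for child, direct_parents in parents.items():
            for parent in direct_parents:
                if result.get(child, 0.0) > result.get(parent, 0.0):
                    result[parent] = result[child]
                    changed = True
    return result


# Hypothetical three-level hierarchy and raw classifier scores.
toy_parents = {
    "protein_binding": ["binding"],
    "binding": ["molecular_function"],
}
toy_scores = {"molecular_function": 0.5, "binding": 0.2, "protein_binding": 0.9}

consistent = propagate_scores(toy_scores, toy_parents)
print(consistent)
```

    With the toy inputs, the high score of the most specific class is carried up to both of its ancestors, so a prediction of a specific function always implies its more general parents.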

  1. A Universal Method for Species Identification of Mammals Utilizing Next Generation Sequencing for the Analysis of DNA Mixtures

    PubMed Central

    Tillmar, Andreas O.; Dell'Amico, Barbara; Welander, Jenny; Holmlund, Gunilla

    2013-01-01

    Species identification can be interesting in a wide range of areas, for example, in forensic applications, food monitoring and in archeology. The vast majority of existing DNA typing methods developed for species determination mainly focus on a single species source. There are, however, many instances where all species from mixed sources need to be determined, even when the species in minority constitutes less than 1% of the sample. The introduction of next generation sequencing opens new possibilities for such challenging samples. In this study we present a universal deep sequencing method using 454 GS Junior sequencing of a target on the mitochondrial gene 16S rRNA. The method was designed through phylogenetic analyses of DNA reference sequences from more than 300 mammal species. Experiments were performed on artificial species-species mixture samples in order to verify the method’s robustness and its ability to detect all species within a mixture. The method was also tested on samples from authentic forensic casework. The results were promising: the method discriminated over 99.9% of mammal species and was able to detect multiple donors within a mixture, as well as minor components as low as 1% of a mixed sample. PMID:24358309

  2. TargetSpy: a supervised machine learning approach for microRNA target prediction.

    PubMed

    Sturm, Martin; Hackenberg, Michael; Langenberger, David; Frishman, Dmitrij

    2010-05-28

    Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites. We developed TargetSpy, a novel computational approach for predicting target sites regardless of the presence of a seed match. It is based on machine learning and automatic feature selection using a wide spectrum of compositional, structural, and base pairing features covering current biological knowledge. Our model does not rely on evolutionary conservation, which allows the detection of species-specific interactions and makes TargetSpy suitable for analyzing unconserved genomic sequences. In order to allow for an unbiased comparison of TargetSpy to other methods, we classified all algorithms into three groups: I) no seed match requirement, II) seed match requirement, and III) conserved seed match requirement. TargetSpy predictions for classes II and III are generated by appropriate postfiltering. On a human dataset revealing fold-change in protein production for five selected microRNAs our method shows superior performance in all classes. In Drosophila melanogaster not only our class II and III predictions are on par with other algorithms, but notably the class I (no-seed) predictions are just marginally less accurate. We estimate that TargetSpy predicts between 26 and 112 functional target sites without a seed match per microRNA that are missed by all other currently available algorithms. Only a few algorithms can predict target sites without demanding a seed match and TargetSpy demonstrates a substantial improvement in prediction accuracy in that class. Furthermore, when conservation and the presence of a seed match are required, the performance is comparable with state-of-the-art algorithms.
TargetSpy was trained on mouse and performs well in human and drosophila, suggesting that it may be applicable to a broad range of species. Moreover, we have demonstrated that the application of machine learning techniques in combination with upcoming deep sequencing data results in a powerful microRNA target site prediction tool http://www.targetspy.org.
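
    To make the "seed match" requirement and the class II postfiltering mentioned above more concrete, here is a minimal sketch. It assumes the common convention that the seed is microRNA nucleotides 2 to 8 and that a seed match is a perfect reverse-complement occurrence in the target sequence; the abstract does not state which definition TargetSpy uses, and the sequences and function names here are toy examples, not part of the TargetSpy software.

```python
# RNA complement table used to build the reverse complement of the seed.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the target-side motif pairing with the miRNA seed.

    `start` and `end` are 1-based, inclusive positions (assumed 2..8).
    """
    seed = mirna[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def has_seed_match(mirna: str, target: str) -> bool:
    """True if the target contains a perfect match to the miRNA seed."""
    return seed_site(mirna) in target


def postfilter_seed(mirna: str, predicted_sites: list) -> list:
    """Keep only predicted sites with a seed match (a class II style filter)."""
    return [site for site in predicted_sites if has_seed_match(mirna, site)]


# Toy miRNA and two toy predicted target sites.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_sites = ["AAACUACCUCAAA", "AAAGGGGGGAAA"]

print(seed_site(toy_mirna))
print(postfilter_seed(toy_mirna, toy_sites))
```

    A class III style filter would additionally require the matched site to be conserved across species, which needs alignment data and is left out of this sketch.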

  3. TargetSpy: a supervised machine learning approach for microRNA target prediction

    PubMed Central

    2010-01-01

    Background Virtually all currently available microRNA target site prediction algorithms require the presence of a (conserved) seed match to the 5' end of the microRNA. Recently however, it has been shown that this requirement might be too stringent, leading to a substantial number of missed target sites. Results We developed TargetSpy, a novel computational approach for predicting target sites regardless of the presence of a seed match. It is based on machine learning and automatic feature selection using a wide spectrum of compositional, structural, and base pairing features covering current biological knowledge. Our model does not rely on evolutionary conservation, which allows the detection of species-specific interactions and makes TargetSpy suitable for analyzing unconserved genomic sequences. In order to allow for an unbiased comparison of TargetSpy to other methods, we classified all algorithms into three groups: I) no seed match requirement, II) seed match requirement, and III) conserved seed match requirement. TargetSpy predictions for classes II and III are generated by appropriate postfiltering. On a human dataset revealing fold-change in protein production for five selected microRNAs our method shows superior performance in all classes. In Drosophila melanogaster not only our class II and III predictions are on par with other algorithms, but notably the class I (no-seed) predictions are just marginally less accurate. We estimate that TargetSpy predicts between 26 and 112 functional target sites without a seed match per microRNA that are missed by all other currently available algorithms. Conclusion Only a few algorithms can predict target sites without demanding a seed match and TargetSpy demonstrates a substantial improvement in prediction accuracy in that class. Furthermore, when conservation and the presence of a seed match are required, the performance is comparable with state-of-the-art algorithms. 
TargetSpy was trained on mouse and performs well in human and drosophila, suggesting that it may be applicable to a broad range of species. Moreover, we have demonstrated that the application of machine learning techniques in combination with upcoming deep sequencing data results in a powerful microRNA target site prediction tool http://www.targetspy.org. PMID:20509939

  4. Dendrites, deep learning, and sequences in the hippocampus.

    PubMed

    Bhalla, Upinder S

    2017-10-12

    The hippocampus places us both in time and space. It does so over remarkably large spans: milliseconds to years, and centimeters to kilometers. This works for sensory representations, for memory, and for behavioral context. How does it fit in such wide ranges of time and space scales, and keep order among the many dimensions of stimulus context? A key organizing principle for a wide sweep of scales and stimulus dimensions is that of order in time, or sequences. Sequences of neuronal activity are ubiquitous in sensory processing, in motor control, in planning actions, and in memory. Against this strong evidence for the phenomenon, there are currently more models than definite experiments about how the brain generates ordered activity. The flip side of sequence generation is discrimination. Discrimination of sequences has been extensively studied at the behavioral, systems, and modeling level, but again physiological mechanisms are fewer. It is against this backdrop that I discuss two recent developments in neural sequence computation, that at face value share little beyond the label "neural." These are dendritic sequence discrimination, and deep learning. One derives from channel physiology and molecular signaling, the other from applied neural network theory - apparently extreme ends of the spectrum of neural circuit detail. I suggest that each of these topics has deep lessons about the possible mechanisms, scales, and capabilities of hippocampal sequence computation. © 2017 Wiley Periodicals, Inc.

  5. De novo transcriptome assembly and positive selection analysis of an individual deep-sea fish.

    PubMed

    Lan, Yi; Sun, Jin; Xu, Ting; Chen, Chong; Tian, Renmao; Qiu, Jian-Wen; Qian, Pei-Yuan

    2018-05-24

    High hydrostatic pressure and low temperatures make the deep sea a harsh environment for life forms. Actin organization and microtubules assembly, which are essential for intracellular transport and cell motility, can be disrupted by high hydrostatic pressure. High hydrostatic pressure can also damage DNA. Nucleic acids exposed to low temperatures can form secondary structures that hinder genetic information processing. To study how deep-sea creatures adapt to such a hostile environment, one of the most straightforward ways is to sequence and compare their genes with those of their shallow-water relatives. We captured an individual of the fish species Aldrovandia affinis, which is a typical deep-sea inhabitant, from the Okinawa Trough at a depth of 1550 m using a remotely operated vehicle (ROV). We sequenced its transcriptome and analyzed its molecular adaptation. We obtained 27,633 protein coding sequences using an Illumina platform and compared them with those of several shallow-water fish species. Analysis of 4918 single-copy orthologs identified 138 positively selected genes in A. affinis, including genes involved in microtubule regulation. Particularly, functional domains related to cold shock as well as DNA repair are exposed to positive selection pressure in both deep-sea fish and hadal amphipod. Overall, we have identified a set of positively selected genes related to cytoskeleton structures, DNA repair and genetic information processing, which shed light on molecular adaptation to the deep sea. These results suggest that amino acid substitutions of these positively selected genes may contribute crucially to the adaptation of deep-sea animals. Additionally, we provide a high-quality transcriptome of a deep-sea fish for future deep-sea studies.

  6. Identification and characterization of lipid metabolism-related microRNAs in the liver of genetically improved farmed tilapia (GIFT, Oreochromis niloticus) by deep sequencing.

    PubMed

    Tao, Yi-Fan; Qiang, Jun; Yin, Guo-Jun; Xu, Pao; Shi, Qiong; Bao, Jing-Wen

    2017-10-01

    MicroRNAs (miRNAs) play vital roles in modulating diverse metabolic processes in the liver, including lipid metabolism. Genetically improved farmed tilapia (GIFT, Oreochromis niloticus), an important aquaculture species in China, is susceptible to hepatic steatosis when reared in intensive culture systems. To investigate the miRNAs involved in GIFT lipid metabolism, two hepatic small RNA libraries from high-fat diet-fed and normal-fat diet-fed GIFT were constructed and sequenced using high-throughput sequencing technology. A total of 204 known and 56 novel miRNAs were identified by aligning the sequencing data with known Danio rerio miRNAs listed in miRBase 21.0. Six known miRNAs (miR-30a-5p, miR-34a, miR-145-5p, miR-29a, miR-205-5p, and miR-23a-3p) that were differentially expressed between the high-fat diet and normal-fat diet groups were validated by quantitative real-time PCR. Bioinformatics tools were used to predict the potential target genes of these differentially expressed miRNAs, and Gene Ontology enrichment analysis indicated that these miRNAs may play important roles in diet-induced hepatic steatosis in GIFT. Our results provide a foundation for further studies of the role of miRNAs in tilapia lipid homeostasis regulation, and may help to identify novel targets for therapeutic interventions to reduce the occurrence of fatty liver disease in farmed tilapia. Copyright © 2017. Published by Elsevier Ltd.

  7. A Diverse Repertoire of Human Immunoglobulin Variable Genes in a Chicken B Cell Line is Generated by Both Gene Conversion and Somatic Hypermutation.

    PubMed

    Leighton, Philip A; Schusser, Benjamin; Yi, Henry; Glanville, Jacob; Harriman, William

    2015-01-01

    Chicken immune responses to human proteins are often more robust than rodent responses because of the phylogenetic relationship between the different species. For discovery of a diverse panel of unique therapeutic antibody candidates, chickens therefore represent an attractive host for human-derived targets. Recent advances in monoclonal antibody technology, specifically new methods for the molecular cloning of antibody genes directly from primary B cells, have ushered in a new era of generating monoclonal antibodies from non-traditional host animals that were previously inaccessible through hybridoma technology. However, such monoclonals still require post-discovery humanization in order to be developed as therapeutics. To obviate the need for humanization, a modified strain of chickens could be engineered to express a human-sequence immunoglobulin variable region repertoire. Here, human variable genes introduced into the chicken immunoglobulin loci through gene targeting were evaluated for their ability to be recognized and diversified by the native chicken recombination machinery that is present in the B-lineage cell line DT40. After expansion in culture the DT40 population accumulated genetic mutants that were detected via deep sequencing. Bioinformatic analysis revealed that the human targeted constructs are performing as expected in the cell culture system, and provide a measure of confidence that they will be functional in transgenic animals.

  8. Kamenetsk—A new impact structure in the Ukrainian Shield

    NASA Astrophysics Data System (ADS)

    Gurov, Eugene; Nikolaenko, Nikolay; Shevchuk, Helena; Yamnichenko, Anatoly

    2017-12-01

    The Kamenetsk impact structure is a deeply eroded simple crater that formed in crystalline rocks of the Ukrainian Shield. This study presents structural, lithologic, and shock metamorphic evidence for an impact origin of the Kamenetsk structure, which was previously described as a paleovolcano. The Kamenetsk structure is an oval depression that is 1.0-1.2 km in diameter and 130 m deep. The structure is deeply eroded, and only the lower part of the sequence of lithic breccia has been preserved in the deepest part of the crater to recent time, while the predominant part of impact rocks and postimpact sediments was eroded. Manifestations of shock metamorphism of minerals, especially planar deformation features in quartz and feldspars, were determined by petrographic investigations of lithic breccia that allowed us to determine the impact origin of the Kamenetsk structure. The erosion of the crater and surrounding target to a minimal depth of 220 m preceded the deposition of the postimpact sediments. The time of the formation of the Kamenetsk structure is bracketed within a wide interval from 2.0 to 2.1 Ga, the age of the crystalline target rocks, to the Late Miocene age of the sediments overlaying the crater. The deep erosion of the structure suggests it is probably Paleozoic in age.

  9. Revealing the unexplored fungal communities in deep groundwater of crystalline bedrock fracture zones in Olkiluoto, Finland.

    PubMed

    Sohlberg, Elina; Bomberg, Malin; Miettinen, Hanna; Nyyssönen, Mari; Salavirta, Heikki; Vikman, Minna; Itävaara, Merja

    2015-01-01

    The diversity and functional role of fungi, one of the ecologically most important groups of eukaryotic microorganisms, remains largely unknown in deep biosphere environments. In this study we investigated fungal communities in packer-isolated bedrock fractures in Olkiluoto, Finland at depths ranging from 296 to 798 m below surface level. DNA- and cDNA-based high-throughput amplicon sequencing analysis of the fungal internal transcribed spacer (ITS) gene markers was used to examine the total fungal diversity and to identify the active members in deep fracture zones at different depths. Results showed that fungi were present in fracture zones at all depths and fungal diversity was higher than expected. Most of the observed fungal sequences belonged to the phylum Ascomycota. Phyla Basidiomycota and Chytridiomycota were only represented as a minor part of the fungal community. Dominating fungal classes in the deep bedrock aquifers were Sordariomycetes, Eurotiomycetes, and Dothideomycetes from the Ascomycota phylum and classes Microbotryomycetes and Tremellomycetes from the Basidiomycota phylum, which are the most frequently detected fungal taxa reported also from deep sea environments. In addition some fungal sequences represented potentially novel fungal species. Active fungi were detected in most of the fracture zones, which proves that fungi are able to maintain cellular activity in these oligotrophic conditions. Possible roles of fungi and their origin in deep bedrock groundwater can only be speculated in the light of current knowledge but some species may be specifically adapted to deep subsurface environment and may play important roles in the utilization and recycling of nutrients and thus sustaining the deep subsurface microbial community.

  10. The Cosmic Skidmark: witnessing galaxy transformation at z = 0.19

    NASA Astrophysics Data System (ADS)

    Murphy, David N. A.

    2015-02-01

    We present an early-look analysis of the "Cosmic Skidmark". Discovered following visual inspection of the Geach, Murphy & Bower (2011) SDSS Stripe 82 cluster catalogue generated by ORCA (an automated cluster algorithm searching for red-sequences; Murphy, Geach & Bower 2012), this z = 0.19, 1.4L* galaxy appears to have been caught in the rare act of transformation while accreting onto an estimated 10^13-10^14 h^-1 M⊙ mass galaxy group. SDSS spectroscopy reveals clear signatures of star formation whilst deep optical imaging reveals a pronounced 50 kpc cometary tail. Pending completion of our ALMA Cycle 2 and IFU observations, we show here preliminary analysis of this target.

  11. Ultra-deep mutant spectrum profiling: improving sequencing accuracy using overlapping read pairs.

    PubMed

    Chen-Harris, Haiyin; Borucki, Monica K; Torres, Clinton; Slezak, Tom R; Allen, Jonathan E

    2013-02-12

    High throughput sequencing is beginning to make a transformative impact in the area of viral evolution. Deep sequencing has the potential to reveal the mutant spectrum within a viral sample at high resolution, thus enabling the close examination of viral mutational dynamics both within and between hosts. The challenge, however, is to accurately model the errors in the sequencing data and differentiate real viral mutations, particularly those that exist at low frequencies, from sequencing errors. We demonstrate that overlapping read pairs (ORP) -- generated by combining short fragment sequencing libraries and longer sequencing reads -- significantly reduce sequencing error rates and improve rare variant detection accuracy. Using this sequencing protocol and an error model optimized for variant detection, we are able to capture a large number of genetic mutations present within a viral population at ultra-low frequency levels (<0.05%). Our rare variant detection strategies have important implications beyond viral evolution and can be applied to any basic and clinical research area that requires the identification of rare mutations.
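
    The intuition behind overlapping read pairs can be sketched in a few lines: where the two reads of a pair cover the same bases, a base is trusted only if both reads agree. The sketch below is a simplified illustration under stated assumptions, not the authors' pipeline or error model: the second read is assumed to be already reverse-complemented onto the same strand, the overlap length is assumed known, and disagreements are simply masked with "N". If errors in the two reads were independent with per-base rate e and uniform over the three wrong bases, both reads would agree on the same wrong base with probability of roughly e^2/3.

```python
def merge_overlapping_pair(read1: str, read2_rc: str, overlap: int) -> str:
    """Merge a read pair whose ends overlap by `overlap` bases.

    The last `overlap` bases of `read1` and the first `overlap` bases of
    `read2_rc` are assumed to cover the same positions. Positions where
    the two reads disagree are masked with "N".
    """
    if overlap <= 0 or overlap > min(len(read1), len(read2_rc)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    tail = read1[-overlap:]
    head = read2_rc[:overlap]
    consensus = "".join(a if a == b else "N" for a, b in zip(tail, head))
    return read1[:-overlap] + consensus + read2_rc[overlap:]


# Toy pair with a 4-base overlap and one disagreement in the overlap.
merged = merge_overlapping_pair("ACGTACGT", "ACGAGGCC", overlap=4)
print(merged)
```

    Real tools typically also use base quality scores when resolving disagreements; masking is used here only to keep the example short.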

  12. Deep RNNs for video denoising

    NASA Astrophysics Data System (ADS)

    Chen, Xinyuan; Song, Li; Yang, Xiaokang

    2016-09-01

    Video denoising can be described as the problem of mapping a fixed-length sequence of noisy frames to a clean one. We propose a deep architecture based on Recurrent Neural Networks (RNNs) for video denoising. The model learns a patch-based end-to-end mapping between the noisy and clean video sequences: it takes the corrupted video sequence as input and outputs the clean one. Our deep network, which we refer to as deep Recurrent Neural Networks (deep RNNs or DRNNs), stacks RNN layers where each layer receives the hidden state of the previous layer as input. Experiments show that (i) the recurrent architecture extracts motion information through the temporal domain and benefits video denoising, (ii) the deep architecture has large enough capacity to express the mapping between corrupted input videos and clean output videos, and (iii) the model generalizes, learning different mappings from videos corrupted by different types of noise (e.g., Poisson-Gaussian noise). By training on large video databases, we are able to compete with some existing video denoising methods.
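
    The stacking described above, where each RNN layer receives the hidden state of the layer below as its input, can be sketched as a forward pass in plain Python. This is an illustration of the stacking pattern only: the layer sizes, the tanh activation, the random initialisation, and the function names are assumptions, and the sketch omits the output layer, patch extraction, and training used in the paper.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_layer(n_in, n_hidden, rng):
    """Create (W_x, W_h, b) for one simple RNN layer with small random weights."""
    w_x = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_h = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_hidden)]
    return w_x, w_h, [0.0] * n_hidden


def rnn_step(x, h, layer):
    """One time step: h_new = tanh(W_x x + W_h h + b)."""
    w_x, w_h, b = layer
    return [math.tanh(a + c + d) for a, c, d in zip(matvec(w_x, x), matvec(w_h, h), b)]


def deep_rnn_forward(frames, layers):
    """Run a stack of RNN layers over a sequence of flattened patches."""
    hidden = [[0.0] * len(layer[2]) for layer in layers]
    outputs = []
    for x in frames:
        layer_input = x
        for i, layer in enumerate(layers):
            hidden[i] = rnn_step(layer_input, hidden[i], layer)
            layer_input = hidden[i]  # hidden state feeds the next layer up
        outputs.append(layer_input)
    return outputs


rng = random.Random(0)
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng)]  # two stacked layers
frames = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]        # three toy "patches"
top_states = deep_rnn_forward(frames, layers)
print(len(top_states), len(top_states[0]))
```

    In a denoising setting, a final layer would map each top-level hidden state back to a clean patch; that mapping is not shown here.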

  13. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  14. Small RNA Deep Sequencing and the Effects of microRNA408 on Root Gravitropic Bending in Arabidopsis

    NASA Astrophysics Data System (ADS)

    Li, Huasheng; Lu, Jinying; Sun, Qiao; Chen, Yu; He, Dacheng; Liu, Min

    2015-11-01

    MicroRNA (miRNA) is a non-coding small RNA composed of 20 to 24 nucleotides that influences plant root development. This study analyzed the miRNA expression in Arabidopsis root tip cells using Illumina sequencing and real-time PCR before (sample 0) and 15 min after (sample 15) a 3-D clinostat rotational treatment was administered. After stimulation was performed, the expression levels of seven miRNA genes, including Arabidopsis miR160, miR161, miR394, miR402, miR403, miR408, and miR823, were significantly upregulated. Illumina sequencing results also revealed two novel miRNAs that have not been previously reported. The target genes of these miRNAs included pentatricopeptide repeat-containing protein and diadenosine tetraphosphate hydrolase. An overexpression vector of Arabidopsis miR408 was constructed and transferred to Arabidopsis plants. The roots of plants overexpressing miR408 exhibited a slower reorientation upon gravistimulation in comparison with those of the wild type. This result indicates that miR408 could play a role in the root gravitropic response.

  15. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition

    PubMed Central

    Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Bazire, Pascal; Beluche, Odette; Bertrand, Laurie; Besnard-Gonnet, Marielle; Bordelais, Isabelle; Boutard, Magali; Dubois, Maria; Dumont, Corinne; Ettedgui, Evelyne; Fernandez, Patricia; Garcia, Espérance; Aiach, Nathalie Giordanenco; Guerin, Thomas; Hamon, Chadia; Brun, Elodie; Lebled, Sandrine; Lenoble, Patricia; Louesse, Claudine; Mahieu, Eric; Mairey, Barbara; Martins, Nathalie; Megret, Catherine; Milani, Claire; Muanga, Jacqueline; Orvain, Céline; Payen, Emilie; Perroud, Peggy; Petit, Emmanuelle; Robert, Dominique; Ronsin, Murielle; Vacherie, Benoit; Acinas, Silvia G.; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M.; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E.; Stepanauskas, Ramunas; Sullivan, Matthew B.; Brum, Jennifer R.; Duhaime, Melissa B.; Poulos, Bonnie T.; Hurwitz, Bonnie L.; Acinas, Silvia G.; Bork, Peer; Boss, Emmanuel; Bowler, Chris; De Vargas, Colomban; Follows, Michael; Gorsky, Gabriel; Grimsley, Nigel; Hingamp, Pascal; Iudicone, Daniele; Jaillon, Olivier; Kandels-Lewis, Stefanie; Karp-Boss, Lee; Karsenti, Eric; Not, Fabrice; Ogata, Hiroyuki; Pesant, Stéphane; Raes, Jeroen; Sardet, Christian; Sieracki, Michael E.; Speich, Sabrina; Stemmann, Lars; Sullivan, Matthew B.; Sunagawa, Shinichi; Wincker, Patrick; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

    2017-01-01

    A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009–2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world’s planktonic ecosystems. PMID:28763055

  16. Current and future molecular diagnostics for ocular infectious diseases.

    PubMed

    Doan, Thuy; Pinsky, Benjamin A

    2016-11-01

    Confirmation of ocular infections can pose great challenges to the clinician. A fundamental limitation is the small amounts of specimen that can be obtained from the eye. Molecular diagnostics can circumvent this limitation and have been shown to be more sensitive than conventional culture. The purpose of this review is to describe new molecular methods and to discuss the applications of next-generation sequencing-based approaches in the diagnosis of ocular infections. Efforts have focused on improving the sensitivity of pathogen detection using molecular methods. This review describes a new molecular target for Toxoplasma gondii-directed polymerase chain reaction assays. Molecular diagnostics for Chlamydia trachomatis and Acanthamoeba species are also discussed. Finally, we describe a hypothesis-free approach, metagenomic deep sequencing, which can detect DNA and RNA pathogens from a single specimen in one test. In some cases, this method can provide the geographic location and timing of the infection. Pathogen-directed PCRs have been powerful tools in the diagnosis of ocular infections for over 20 years. The use of next-generation sequencing-based approaches, when available, will further improve sensitivity of detection with the potential to improve patient care.

  17. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition.

    PubMed

    Alberti, Adriana; Poulain, Julie; Engelen, Stefan; Labadie, Karine; Romac, Sarah; Ferrera, Isabel; Albini, Guillaume; Aury, Jean-Marc; Belser, Caroline; Bertrand, Alexis; Cruaud, Corinne; Da Silva, Corinne; Dossat, Carole; Gavory, Frédérick; Gas, Shahinaz; Guy, Julie; Haquelle, Maud; Jacoby, E'krame; Jaillon, Olivier; Lemainque, Arnaud; Pelletier, Eric; Samson, Gaëlle; Wessner, Mark; Acinas, Silvia G; Royo-Llonch, Marta; Cornejo-Castillo, Francisco M; Logares, Ramiro; Fernández-Gómez, Beatriz; Bowler, Chris; Cochrane, Guy; Amid, Clara; Hoopen, Petra Ten; De Vargas, Colomban; Grimsley, Nigel; Desgranges, Elodie; Kandels-Lewis, Stefanie; Ogata, Hiroyuki; Poulton, Nicole; Sieracki, Michael E; Stepanauskas, Ramunas; Sullivan, Matthew B; Brum, Jennifer R; Duhaime, Melissa B; Poulos, Bonnie T; Hurwitz, Bonnie L; Pesant, Stéphane; Karsenti, Eric; Wincker, Patrick

    2017-08-01

    A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

  18. deepTools2: a next generation web server for deep-sequencing data analysis.

    PubMed

    Ramírez, Fidel; Ryan, Devon P; Grüning, Björn; Bhardwaj, Vivek; Kilpert, Fabian; Richter, Andreas S; Heyne, Steffen; Dündar, Friederike; Manke, Thomas

    2016-07-08

    We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continues to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Deep ART Neural Model for Biologically Inspired Episodic Memory and Its Application to Task Performance of Robots.

    PubMed

    Park, Gyeong-Moon; Yoo, Yong-Ho; Kim, Deok-Hwa; Kim, Jong-Hwan

    2018-06-01

    Robots are expected to perform smart services and to undertake various troublesome or difficult tasks in the place of humans. Since these human-scale tasks consist of a temporal sequence of events, robots need episodic memory to store and retrieve the sequences to perform the tasks autonomously in similar situations. As episodic memory, in this paper we propose a novel Deep adaptive resonance theory (ART) neural model and apply it to the task performance of the humanoid robot, Mybot, developed in the Robot Intelligence Technology Laboratory at KAIST. Deep ART has a deep structure to learn events, episodes, and even larger units such as daily episodes. Moreover, it can robustly retrieve the correct episode from partial input cues. To demonstrate the effectiveness and applicability of the proposed Deep ART, experiments are conducted with the humanoid robot, Mybot, for performing the three tasks of arranging toys, making cereal, and disposing of garbage.

  20. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.

    PubMed

    Gao, Yujuan; Wang, Sheng; Deng, Minghua; Xu, Jinbo

    2018-05-08

    Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the two latest Critical Assessment of protein Structure Prediction (CASP) experiments, our method outperforms the existing state-of-the-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our results also show an approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.
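
    The two evaluation metrics named in this abstract, PCC and MAE, can be illustrated with a minimal Python sketch. The abstract does not say how the authors handle the periodicity of angles, so the wrap-around in `angular_mae` below is an assumption, and both function names are hypothetical rather than part of RaptorX-Angle.

```python
import math


def angular_mae(pred_deg, true_deg):
    """Mean absolute error in degrees, measured the short way around the circle.

    Assumption: angles are periodic with period 360, so an error of 340
    degrees is treated as an error of 20 degrees.
    """
    total = 0.0
    for p, t in zip(pred_deg, true_deg):
        d = abs(p - t) % 360.0
        total += min(d, 360.0 - d)
    return total / len(pred_deg)


def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    predicted = [170.0, -175.0]
    observed = [-170.0, 175.0]
    print(angular_mae(predicted, observed))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

    Without the wrap-around, a prediction of 170 degrees against a true value of -170 degrees would be scored as a 340-degree error, which is why a periodic distance is a common choice for dihedral angles.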

  1. Optimization of whole-transcriptome amplification from low cell density deep-sea microbial samples for metatranscriptomic analysis.

    PubMed

    Wu, Jieying; Gao, Weimin; Zhang, Weiwen; Meldrum, Deirdre R

    2011-01-01

    Limitation in sample quality and quantity is one of the big obstacles for applying metatranscriptomic technologies to explore gene expression and functionality of microbial communities in natural environments. In this study, several amplification methods were evaluated for whole-transcriptome amplification of deep-sea microbial samples, which are of low cell density and high impurity. The best amplification method was identified and incorporated into a complete protocol to isolate and amplify deep-sea microbial samples. In the protocol, total RNA was first isolated by a modified method combining Trizol (Invitrogen, CA) and RNeasy (QIAGEN, CA) method, amplified with a WT-Ovation™ Pico RNA Amplification System (NuGEN, CA), and then converted to double-strand DNA from single-strand cDNA with a WT-Ovation™ Exon Module (NuGEN, CA). The products from the whole-transcriptome amplification of deep-sea microbial samples were assessed first through random clone library sequencing. The BLAST search results showed that marine-based sequences are dominant in the libraries, consistent with the ecological source of the samples. The products were then used for next-generation Roche GS FLX Titanium sequencing to obtain metatranscriptome data. Preliminary analysis of the metatranscriptomic data showed good sequencing quality. Although the protocol was designed and demonstrated to be effective for deep-sea microbial samples, it should be applicable to similar samples from other extreme environments in exploring community structure and functionality of microbial communities. Copyright © 2010 Elsevier B.V. All rights reserved.

  2. The complete mitochondrial genome of the deep-sea sponge Poecillastra laminaris (Astrophorida, Vulcanellidae).

    PubMed

    Zeng, Cong; Thomas, Leighton J; Kelly, Michelle; Gardner, Jonathan P A

    2016-05-01

    The complete mitochondrial genome of a New Zealand specimen of the deep-sea sponge Poecillastra laminaris (Sollas, 1886) (Astrophorida, Vulcanellidae), from the Colville Ridge, New Zealand, was sequenced using the 454 Life Science pyrosequencing system. To identify homologous mitochondrial sequences, the 454 reads were mapped to the complete mitochondrial genome sequence of Geodia neptuni (GenBank No. NC_006990). The P. laminaris genome is 18,413 bp in length and includes 14 protein-coding genes, 24 transfer RNA genes and 2 ribosomal RNA genes. Gene order resembled that of other demosponges. The base composition of the genome is A (29.1%), T (35.2%), C (14.0%) and G (21.7%). This is the second published mitogenome for a sponge of the order Astrophorida and will be useful in future phylogenetic analysis of deep-sea sponges.
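
    A base composition like the one reported here (the four percentages sum to 100%) is a simple per-base count divided by the sequence length. The following Python sketch shows the calculation on a toy string; the function name `base_composition` is hypothetical, and the toy input is not the P. laminaris sequence.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, T, C and G in a DNA sequence.

    Assumption: only the four unambiguous bases are reported; any other
    symbol (for example N) still counts towards the total length.
    """
    seq = sequence.upper()
    counts = Counter(seq)
    length = len(seq)
    return {base: 100.0 * counts[base] / length for base in "ATCG"}


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(base_composition("AATG"))
```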

  3. A Plane Target Detection Algorithm in Remote Sensing Images based on Deep Learning Network Technology

    NASA Astrophysics Data System (ADS)

    Shuxin, Li; Zhilong, Zhang; Biao, Li

    2018-01-01

    Planes are an important target category in remote sensing, and detecting them automatically is of great value. As remote imaging technology continues to develop, the resolution of remote sensing images has become very high, providing more detailed information for automatic target detection. Deep learning network technology is the most advanced technology in image target detection and recognition and has provided great performance improvements for target detection and recognition in everyday scenes. We applied this technology to remote sensing target detection and propose an algorithm based on an end-to-end deep network, which learns from remote sensing images to detect targets in new images automatically and robustly. Our experiments show that the algorithm can capture the feature information of the plane target and performs better in target detection than the older methods.

  4. MusiteDeep: a deep-learning framework for general and kinase-specific phosphorylation site prediction.

    PubMed

    Wang, Duolin; Zeng, Shuai; Xu, Chunhui; Qiu, Wangren; Liang, Yanchun; Joshi, Trupti; Xu, Dong

    2017-12-15

    Computational methods for phosphorylation site prediction play important roles in protein function studies and experimental design. Most existing methods are based on feature extraction, which may result in incomplete or biased features. Deep learning as the cutting-edge machine learning method has the ability to automatically discover complex representations of phosphorylation patterns from the raw sequences, and hence it provides a powerful tool for improvement of phosphorylation site prediction. We present MusiteDeep, the first deep-learning framework for predicting general and kinase-specific phosphorylation sites. MusiteDeep takes raw sequence data as input and uses convolutional neural networks with a novel two-dimensional attention mechanism. It achieves over a 50% relative improvement in the area under the precision-recall curve in general phosphorylation site prediction and obtains competitive results in kinase-specific prediction compared to other well-known tools on the benchmark data. MusiteDeep is provided as an open-source tool available at https://github.com/duolinwang/MusiteDeep. Contact: xudong@missouri.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.

  5. Deep transfer learning for automatic target classification: MWIR to LWIR

    NASA Astrophysics Data System (ADS)

    Ding, Zhengming; Nasrabadi, Nasser; Fu, Yun

    2016-05-01

    Publisher's Note: This paper, originally published on 5/12/2016, was replaced with a corrected/revised version on 5/18/2016. If you downloaded the original PDF but are unable to access the revision, please contact SPIE Digital Library Customer Service for assistance. When dealing with sparse or no labeled data in the target domain, transfer learning shows its appealing performance by borrowing the supervised knowledge from external domains. Recently deep structure learning has been exploited in transfer learning due to its attractive power in extracting effective knowledge through multi-layer strategy, so that deep transfer learning is promising to address the cross-domain mismatch. In general, cross-domain disparity can result from the difference between source and target distributions or from different modalities, e.g., Midwave IR (MWIR) and Longwave IR (LWIR). In this paper, we propose a Weighted Deep Transfer Learning framework for automatic target classification in a task-driven fashion. Specifically, deep features and classifier parameters are obtained simultaneously for optimal classification performance. In this way, the proposed deep structures can extract more effective features with the guidance of the classifier performance; on the other hand, the classifier performance is further improved since it is optimized on more discriminative features. Furthermore, we build a weighted scheme to couple source and target output by assigning pseudo labels to target data, so that we can transfer knowledge from source (i.e., MWIR) to target (i.e., LWIR). Experimental results on real databases demonstrate the superiority of the proposed algorithm in comparison with others.

  6. The organization and evolution of the Responder satellite in species of the Drosophila melanogaster group: dynamic evolution of a target of meiotic drive.

    PubMed

    Larracuente, Amanda M

    2014-11-25

    Satellite DNA can make up a substantial fraction of eukaryotic genomes and has roles in genome structure and chromosome segregation. The rapid evolution of satellite DNA can contribute to genomic instability and genetic incompatibilities between species. Despite its ubiquity and its contribution to genome evolution, we currently know little about the dynamics of satellite DNA evolution. The Responder (Rsp) satellite DNA family is found in the pericentric heterochromatin of chromosome 2 of Drosophila melanogaster. Rsp is well-known for being the target of Segregation Distorter (SD), an autosomal meiotic drive system in D. melanogaster. I present an evolutionary genetic analysis of the Rsp family of repeats in D. melanogaster and its closely-related species in the melanogaster group (D. simulans, D. sechellia, D. mauritiana, D. erecta, and D. yakuba) using a combination of available BAC sequences, whole genome shotgun Sanger reads, Illumina short read deep sequencing, and fluorescence in situ hybridization. I show that Rsp repeats have euchromatic locations throughout the D. melanogaster genome, that Rsp arrays show evidence for concerted evolution, and that Rsp repeats exist outside of D. melanogaster, in the melanogaster group. The repeats in these species are considerably diverged at the sequence level compared to D. melanogaster, and have a strikingly different genomic distribution, even between closely-related sister taxa. The genomic organization of the Rsp repeat in the D. melanogaster genome is complex: it consists of large blocks of tandem repeats in the heterochromatin and small blocks of tandem repeats in the euchromatin. My discovery of heterochromatic Rsp-like sequences outside of D. melanogaster suggests that SD evolved after its target satellite and that the evolution of the Rsp satellite family is highly dynamic over a short evolutionary time scale (<240,000 years).

  7. Deep Extragalactic VIsible Legacy Survey (DEVILS): Motivation, Design and Target Catalogue

    NASA Astrophysics Data System (ADS)

    Davies, L. J. M.; Robotham, A. S. G.; Driver, S. P.; Lagos, C. P.; Cortese, L.; Mannering, E.; Foster, C.; Lidman, C.; Hashemizadeh, A.; Koushan, S.; O'Toole, S.; Baldry, I. K.; Bilicki, M.; Bland-Hawthorn, J.; Bremer, M. N.; Brown, M. J. I.; Bryant, J. J.; Catinella, B.; Croom, S. M.; Grootes, M. W.; Holwerda, B. W.; Jarvis, M. J.; Maddox, N.; Meyer, M.; Moffett, A. J.; Phillipps, S.; Taylor, E. N.; Windhorst, R. A.; Wolf, C.

    2018-06-01

    The Deep Extragalactic VIsible Legacy Survey (DEVILS) is a large spectroscopic campaign at the Anglo-Australian Telescope (AAT) aimed at bridging the near and distant Universe by producing the highest completeness survey of galaxies and groups at intermediate redshifts (0.3 < z < 1.0). Our sample consists of ~60,000 galaxies to Y < 21.2 mag, over ~6 deg² in three well-studied deep extragalactic fields (Cosmic Origins Survey field, COSMOS, Extended Chandra Deep Field South, ECDFS and the X-ray Multi-Mirror Mission Large-Scale Structure region, XMM-LSS - all Large Synoptic Survey Telescope deep-drill fields). This paper presents the broad experimental design of DEVILS. Our target sample has been selected from deep Visible and Infrared Survey Telescope for Astronomy (VISTA) Y-band imaging (VISTA Deep Extragalactic Observations, VIDEO and UltraVISTA), with photometry measured by PROFOUND. Photometric star/galaxy separation is done on the basis of NIR colours, and has been validated by visual inspection. To maximise our observing efficiency for faint targets we employ a redshift feedback strategy, which continually updates our target lists, feeding back the results from the previous night's observations. We also present an overview of the initial spectroscopic observations undertaken in late 2017 and early 2018.

  8. LookSeq: a browser-based viewer for deep sequencing data.

    PubMed

    Manske, Heinrich Magnus; Kwiatkowski, Dominic P

    2009-11-01

    Sequencing a genome to great depth can be highly informative about heterogeneity within an individual or a population. Here we address the problem of how to visualize the multiple layers of information contained in deep sequencing data. We propose an interactive AJAX-based web viewer for browsing large data sets of aligned sequence reads. By enabling seamless browsing and fast zooming, the LookSeq program assists the user to assimilate information at different levels of resolution, from an overview of a genomic region to fine details such as heterogeneity within the sample. A specific problem, particularly if the sample is heterogeneous, is how to depict information about structural variation. LookSeq provides a simple graphical representation of paired sequence reads that is more revealing about potential insertions and deletions than are conventional methods.

  9. Discovery and functional characterization of microRNAs and their potential roles for gonadal development in spotted knifejaw, Oplegnathus punctatus.

    PubMed

    Du, Xinxin; Liu, Xiaobing; Zhang, Kai; Liu, Yuxiang; Cheng, Jie; Zhang, Quanqi

    2018-05-16

    The spotted knifejaw (Oplegnathus punctatus) is a newly emerging, economically important fishery species in China. Studies focused on the regulation of gonadal development and gametogenesis of spotted knifejaw are still insufficient. As key post-transcriptional regulators, miRNAs have been shown to play important roles in development and reproduction systems. In this study, small RNA deep sequencing in ovary and testis of spotted knifejaw was performed to screen miRNA expression patterns. After sequencing and bioinformatics analysis, a total of 247 conserved known miRNAs and 41 novel miRNAs were identified in spotted knifejaw gonads for the first time. In addition, 36 miRNAs were differentially expressed between testis and ovary. The putative target genes of differentially expressed (DE) miRNAs were significantly enriched in several pathways related to sexual differentiation and gonadal development, such as steroid hormone biosynthesis. Sequencing data was validated through qRT-PCR analysis of selected DE miRNAs. Dual-luciferase reporter analyses of filtered miRNA-target gene pairs confirmed that opu-miR-27b-3p targeted the piwi2 and mov10l1 3' UTRs and down-regulated their expression in spotted knifejaw. The notion that mov10l1 and piwi2 enhance germ cell proliferation and regulate gonadal development and gametogenesis suggests that opu-miR-27b-3p may attenuate this process in the gonads of spotted knifejaw. These findings provided insights into regulatory roles of gonadal miRNAs and supplied fundamental resources for further studies on miRNA-mediated post-transcriptional regulation in reproductive system of spotted knifejaw. Copyright © 2018. Published by Elsevier Inc.

  10. Gene Editing Vectors for Studying Nicotinic Acetylcholine Receptors in Cholinergic Transmission.

    PubMed

    Peng, Can; Yan, Yijin; Kim, Veronica J; Engle, Staci E; Berry, Jennifer N; McIntosh, J Michael; Neve, Rachael L; Drenan, Ryan M

    2018-05-19

    Nicotinic acetylcholine receptors (nAChRs), prototype members of the cys-loop ligand gated ion channel family, are key mediators of cholinergic transmission in the central nervous system. Despite their importance, technical gaps exist in our ability to dissect the function of individual subunits in the brain. To overcome these barriers, we designed CRISPR/Cas9 small guide RNA sequences (sgRNAs) for production of loss-of-function alleles in mouse nAChR genes. These sgRNAs were validated in vitro via deep sequencing. We subsequently targeted candidate nAChR genes in vivo by creating herpes simplex virus (HSV) vectors delivering sgRNAs and Cas9 expression to mouse brain. Production of loss-of-function insertions or deletions (indels) by these "all-in-one" HSV vectors was confirmed using brain slice patch clamp electrophysiology coupled with pharmacological analysis. Next, we developed a scheme for cell type-specific gene editing in mouse brain. Knockin mice expressing Cas9 in a Cre-dependent manner were validated using viral microinjections and genetic crosses to common Cre-driver mouse lines. We subsequently confirmed functional Cas9 activity by targeting the ubiquitous neuronal protein, NeuN, using adeno associated virus (AAV) delivery of sgRNAs. Finally, the mouse β2 nAChR gene was successfully targeted in dopamine transporter (DAT) positive neurons via CRISPR/Cas9. The sgRNA sequences and viral vectors, including our scheme for Cre-dependent gene editing, should be generally useful to the scientific research community. These tools could lead to new discoveries related to the function of nAChRs in neurotransmission and behavioral processes. This article is protected by copyright. All rights reserved.

  11. Identification of the two new, functional actinoporins, CJTOX I and CJTOX II, from the deep-sea anemone Cribrinopsis japonica.

    PubMed

    Tsutsui, Kenta; Sato, Tomomi

    2018-06-15

    Actinoporins are pore-forming proteins found in sea anemones. Although we now have a large collection of data on actinoporins, our knowledge is based heavily on those identified in shallow-water anemones. Because the deep sea differs considerably from shallow waters in hydrostatic pressures, temperatures, and the prey composition, deep-sea actinoporins may have evolved in unique ways. This study, therefore, aimed to obtain new actinoporins in the deep-sea anemone Cribrinopsis japonica (Actiniaria, Actiniidae). An actinoporin-like sequence was identified from the previously established C. japonica RNA-Seq database, and the complete length (663 bp) of the deep-sea actinoporin gene, Cjtox I, was obtained. In addition, a similar gene, Cjtox II (666 bp), was also identified from RNA of actinopharynx. CJTOX I and CJTOX II were similar in their primary structures, but CJTOX I lacked one residue in the middle of the protein. There was also a difference in the gene expression in live animals, where only Cjtox I was expressed in tentacles of C. japonica. In the heterologous expression where BL21 (DE3) strain was retransformed with the plasmid containing either Cjtox I or Cjtox II gene, the supernatants of both cell lysates showed hemolytic activity on the equine erythrocytes. Preincubation of the supernatants with sphingomyelin caused reduced activity, implying that CJTOX I and II would target sphingomyelin as with other actinoporins. Because of the structural similarity to the known actinoporins and the sphingomyelin-inhibitable hemolytic activity, both CJTOX I and II were concluded to be new actinoporins, which were identified for the first time from a deep-sea anemone. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Unveiling the Biodiversity of Deep-Sea Nematodes through Metabarcoding: Are We Ready to Bypass the Classical Taxonomy?

    PubMed Central

    2015-01-01

    Nematodes inhabiting benthic deep-sea ecosystems account for >90% of the total metazoan abundances and they have been hypothesised to be hyper-diverse, but their biodiversity is still largely unknown. Metabarcoding could facilitate the census of biodiversity, especially for those tiny metazoans for which morphological identification is difficult. We compared, for the first time, different DNA extraction procedures based on the use of two commercial kits and a previously published laboratory protocol and tested their suitability for sequencing analyses of 18S rDNA of marine nematodes. We also investigated the reliability of Roche 454 sequencing analyses for assessing the biodiversity of deep-sea nematode assemblages previously morphologically identified. Finally, intra-genomic variation in 18S rRNA gene repeats was investigated by Illumina MiSeq in different deep-sea nematode morphospecies to assess the influence of polymorphisms on nematode biodiversity estimates. Our results indicate that the two commercial kits should be preferred for the molecular analysis of biodiversity of deep-sea nematodes since they consistently provide amplifiable DNA suitable for sequencing. We report that the morphological identification of deep-sea nematodes matches the results obtained by metabarcoding analysis only at the order-family level and that a large portion of Operational Clustered Taxonomic Units (OCTUs) was not assigned. We also show that independently from the cut-off criteria and bioinformatic pipelines used, the number of OCTUs largely exceeds the number of individuals and that 18S rRNA gene of different morpho-species of nematodes displayed intra-genomic polymorphisms. Our results indicate that metabarcoding is an important tool to explore the diversity of deep-sea nematodes, but still fails in identifying most of the species due to limited number of sequences deposited in the public databases, and in providing quantitative data on the species encountered. 
    These aspects should be carefully taken into account before using metabarcoding in quantitative ecological research and monitoring programmes of marine biodiversity. PMID:26701112

  13. Unveiling the Biodiversity of Deep-Sea Nematodes through Metabarcoding: Are We Ready to Bypass the Classical Taxonomy?

    PubMed

    Dell'Anno, Antonio; Carugati, Laura; Corinaldesi, Cinzia; Riccioni, Giulia; Danovaro, Roberto

    2015-01-01

    Nematodes inhabiting benthic deep-sea ecosystems account for >90% of the total metazoan abundances and they have been hypothesised to be hyper-diverse, but their biodiversity is still largely unknown. Metabarcoding could facilitate the census of biodiversity, especially for those tiny metazoans for which morphological identification is difficult. We compared, for the first time, different DNA extraction procedures based on the use of two commercial kits and a previously published laboratory protocol and tested their suitability for sequencing analyses of 18S rDNA of marine nematodes. We also investigated the reliability of Roche 454 sequencing analyses for assessing the biodiversity of deep-sea nematode assemblages previously morphologically identified. Finally, intra-genomic variation in 18S rRNA gene repeats was investigated by Illumina MiSeq in different deep-sea nematode morphospecies to assess the influence of polymorphisms on nematode biodiversity estimates. Our results indicate that the two commercial kits should be preferred for the molecular analysis of biodiversity of deep-sea nematodes since they consistently provide amplifiable DNA suitable for sequencing. We report that the morphological identification of deep-sea nematodes matches the results obtained by metabarcoding analysis only at the order-family level and that a large portion of Operational Clustered Taxonomic Units (OCTUs) was not assigned. We also show that independently from the cut-off criteria and bioinformatic pipelines used, the number of OCTUs largely exceeds the number of individuals and that 18S rRNA gene of different morpho-species of nematodes displayed intra-genomic polymorphisms. Our results indicate that metabarcoding is an important tool to explore the diversity of deep-sea nematodes, but still fails in identifying most of the species due to limited number of sequences deposited in the public databases, and in providing quantitative data on the species encountered. 
    These aspects should be carefully taken into account before using metabarcoding in quantitative ecological research and monitoring programmes of marine biodiversity.

  14. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome

    PubMed Central

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-01-01

    Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is well suited to a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach with those of Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield. PMID:25333064

  15. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

    PubMed

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-09-01

    Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is well suited to a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach with those of Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.
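
    The abstract states that exon-level copy number changes are detected from the same sequence data but does not describe the method. One common idea, sketched below in Python purely as an illustration, is to compare each exon's share of a sample's read depth with the median share in control samples. The function name, the exon labels and the ratio thresholds (0.3 and 0.7) are all hypothetical and are not taken from the paper.

```python
from statistics import median


def flag_exon_deletions(sample_depth, control_depths, low=0.3, high=0.7):
    """Flag exons whose normalized depth is well below the control median.

    sample_depth: dict mapping exon name to mean read depth in the test sample.
    control_depths: list of dicts with the same keys, one per control sample.
    low, high: hypothetical ratio cut-offs; a ratio near 0.5 is what a
    heterozygous deletion would be expected to produce, near 0 a homozygous one.
    """
    def normalize(depths):
        # Express each exon as a fraction of the sample's total depth so that
        # samples sequenced to different depths become comparable.
        total = sum(depths.values())
        return {exon: value / total for exon, value in depths.items()}

    sample = normalize(sample_depth)
    controls = [normalize(c) for c in control_depths]

    calls = {}
    for exon, fraction in sample.items():
        reference = median(c[exon] for c in controls)
        if reference <= 0:
            continue  # no usable control coverage for this exon
        ratio = fraction / reference
        if ratio < low:
            calls[exon] = "homozygous deletion candidate"
        elif ratio < high:
            calls[exon] = "heterozygous deletion candidate"
    return calls


if __name__ == "__main__":
    exons = ["ex1", "ex2", "ex3", "ex4", "ex5"]
    controls = [{e: 100.0 for e in exons}, {e: 120.0 for e in exons}]
    patient = {"ex1": 100.0, "ex2": 50.0, "ex3": 100.0, "ex4": 100.0, "ex5": 100.0}
    print(flag_exon_deletions(patient, controls))
```

    Normalizing by total depth is a simplification: with few target exons, a large deletion shifts the fractions of all other exons, so a real pipeline would likely need a more robust normalization.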

  16. Shifts in the bacterial community composition along deep soil profiles in monospecific and mixed stands of Eucalyptus grandis and Acacia mangium.

    PubMed

    Pereira, Arthur Prudêncio de Araujo; Andrade, Pedro Avelino Maia de; Bini, Daniel; Durrer, Ademir; Robin, Agnès; Bouillet, Jean Pierre; Andreote, Fernando Dini; Cardoso, Elke Jurandy Bran Nogueira

    2017-01-01

    Our knowledge of the rhizosphere bacterial communities in deep soils and the role of Eucalyptus and Acacia on the structure of these communities remains very limited. In this study, we targeted the bacterial community along a depth profile (0 to 800 cm) and compared community structure in monospecific or mixed plantations of Acacia mangium and Eucalyptus grandis. We applied quantitative PCR (qPCR) and sequenced the V6 region of the 16S rRNA gene to characterize the composition of bacterial communities. We identified a decrease in bacterial abundance with soil depth, and differences in community patterns between monospecific and mixed cultivations. Sequence analysis indicated a prevalent effect of soil depth on bacterial communities in the mixed plant cultivation system, and a remarkable differentiation of bacterial communities in areas solely cultivated with Eucalyptus. The groups most influenced by soil depth were Proteobacteria and Acidobacteria (more frequent in samples between 0 and 300 cm). The predominant bacterial groups differentially displayed in the monospecific stands of Eucalyptus were Firmicutes and Proteobacteria. Our results suggest that the addition of an N2-fixing tree in a monospecific cultivation system modulates bacterial community composition even at a great depth. We conclude that co-cultivation systems may represent a key strategy to improve soil resources and to establish more sustainable cultivation of Eucalyptus in Brazil.

  17. ASR5 is involved in the regulation of miRNA expression in rice.

    PubMed

    Neto, Lauro Bücker; Arenhart, Rafael Augusto; de Oliveira, Luiz Felipe Valter; de Lima, Júlio Cesar; Bodanese-Zanettini, Maria Helena; Margis, Rogerio; Margis-Pinheiro, Márcia

    2015-11-01

    The work describes an ASR knockdown transcriptomic analysis by deep sequencing of rice root seedlings and the transactivation of ASR cis-acting elements in the upstream region of a MIR gene. MicroRNAs are key regulators of gene expression that guide post-transcriptional control of plant development and responses to environmental stresses. ASR (ABA, Stress and Ripening) proteins are plant-specific transcription factors with key roles in different biological processes. In rice, ASR proteins have been suggested to participate in the regulation of stress response genes. This work describes the transcriptomic analysis by deep sequencing two libraries, comparing miRNA abundance from the roots of transgenic ASR5 knockdown rice seedlings with that of the roots of wild-type non-transformed rice seedlings. Members of 59 miRNA families were detected, and 276 mature miRNAs were identified. Our analysis detected 112 miRNAs that were differentially expressed between the two libraries. A predicted inverse correlation between miR167abc and its target gene (LOC_Os07g29820) was confirmed using RT-qPCR. Protoplast transactivation assays showed that ASR5 is able to recognize binding sites upstream of the MIR167a gene and drive its expression in vivo. Together, our data establish a comparative study of miRNAome profiles and is the first study to suggest the involvement of ASR proteins in miRNA gene regulation.

  18. Deep targeted sequencing in pediatric acute lymphoblastic leukemia unveils distinct mutational patterns between genetic subtypes and novel relapse-associated genes.

    PubMed

    Lindqvist, C Mårten; Lundmark, Anders; Nordlund, Jessica; Freyhult, Eva; Ekman, Diana; Carlsson Almlöf, Jonas; Raine, Amanda; Övernäs, Elin; Abrahamsson, Jonas; Frost, Britt-Marie; Grandér, Dan; Heyman, Mats; Palle, Josefine; Forestier, Erik; Lönnerholm, Gudmar; Berglund, Eva C; Syvänen, Ann-Christine

    2016-09-27

    To characterize the mutational patterns of acute lymphoblastic leukemia (ALL) we performed deep next generation sequencing of 872 cancer genes in 172 diagnostic and 24 relapse samples from 172 pediatric ALL patients. We found an overall greater mutational burden and more driver mutations in T-cell ALL (T-ALL) patients compared to B-cell precursor ALL (BCP-ALL) patients. In addition, the majority of the mutations in T-ALL had occurred in the original leukemic clone, while most of the mutations in BCP-ALL were subclonal. BCP-ALL patients carrying any of the recurrent translocations ETV6-RUNX1, BCR-ABL or TCF3-PBX1 harbored few mutations in driver genes compared to other BCP-ALL patients. Specifically in BCP-ALL, we identified ATRX as a novel putative driver gene and uncovered an association between somatic mutations in the Notch signaling pathway at ALL diagnosis and increased risk of relapse. Furthermore, we identified EP300, ARID1A and SH2B3 as relapse-associated genes. The genes highlighted in our study were frequently involved in epigenetic regulation, associated with germline susceptibility to ALL, and present in minor subclones at diagnosis that became dominant at relapse. We observed a high degree of clonal heterogeneity and evolution between diagnosis and relapse in both BCP-ALL and T-ALL, which could have implications for the treatment efficiency.

  19. DNA barcoding reveals seasonal shifts in diet and consumption of deep-sea fishes in wedge-tailed shearwaters

    PubMed Central

    Ando, Haruko; Horikoshi, Kazuo; Suzuki, Hajime; Isagi, Yuji

    2018-01-01

    The foraging ecology of pelagic seabirds is difficult to characterize because of their large foraging areas. In the face of this difficulty, DNA metabarcoding may be a useful approach to analyze diet compositions and foraging behaviors. Using this approach, we investigated the diet composition and its seasonal variation of a common seabird species on the Ogasawara Islands, Japan: the wedge-tailed shearwater Ardenna pacifica. We collected fecal samples during the prebreeding (N = 73) and rearing (N = 96) periods. The diet composition of wedge-tailed shearwater was analyzed by Ion Torrent sequencing using two universal polymerase chain reaction primers for the 12S and 16S mitochondrial DNA regions that targeted vertebrates and mollusks, respectively. The results of a BLAST search of obtained sequences detected 31 and 1 vertebrate and mollusk taxa, respectively. The results of the diet composition analysis showed that wedge-tailed shearwaters frequently consumed deep-sea fishes throughout the sampling season, indicating the importance of these fishes as a stable food resource. However, there was a marked seasonal shift in diet, which may reflect seasonal changes in food resource availability and wedge-tailed shearwater foraging behavior. The collected data regarding the shearwater diet may be useful for in situ conservation efforts. Future research that combines DNA metabarcoding with other tools, such as data logging, may provide further insight into the foraging ecology of pelagic seabirds. PMID:29630670

  20. Shifts in the bacterial community composition along deep soil profiles in monospecific and mixed stands of Eucalyptus grandis and Acacia mangium

    PubMed Central

    de Andrade, Pedro Avelino Maia; Bini, Daniel; Durrer, Ademir; Robin, Agnès; Bouillet, Jean Pierre; Andreote, Fernando Dini; Cardoso, Elke Jurandy Bran Nogueira

    2017-01-01

    Our knowledge of the rhizosphere bacterial communities in deep soils and the role of Eucalyptus and Acacia on the structure of these communities remains very limited. In this study, we targeted the bacterial community along a depth profile (0 to 800 cm) and compared community structure in monospecific or mixed plantations of Acacia mangium and Eucalyptus grandis. We applied quantitative PCR (qPCR) and sequenced the V6 region of the 16S rRNA gene to characterize composition of bacterial communities. We identified a decrease in bacterial abundance with soil depth, and differences in community patterns between monospecific and mixed cultivations. Sequence analysis indicated a prevalent effect of soil depth on bacterial communities in the mixed plant cultivation system, and a remarkable differentiation of bacterial communities in areas solely cultivated with Eucalyptus. The groups most influenced by soil depth were Proteobacteria and Acidobacteria (more frequent in samples between 0 and 300 cm). The predominant bacterial groups differentially displayed in the monospecific stands of Eucalyptus were Firmicutes and Proteobacteria. Our results suggest that the addition of an N2-fixing tree in a monospecific cultivation system modulates bacterial community composition even at a great depth. We conclude that co-cultivation systems may represent a key strategy to improve soil resources and to establish more sustainable cultivation of Eucalyptus in Brazil. PMID:28686690

  1. Molecular Phylogenetic Analysis of Archaeal Intron-Containing Genes Coding for rRNA Obtained from a Deep-Subsurface Geothermal Water Pool

    PubMed Central

    Takai, Ken; Horikoshi, Koki

    1999-01-01

    Molecular phylogenetic analysis of a naturally occurring microbial community in a deep-subsurface geothermal environment indicated that the phylogenetic diversity of the microbial population in the environment was extremely limited and that only hyperthermophilic archaeal members closely related to Pyrobaculum were present. All archaeal ribosomal DNA sequences contained intron-like sequences, some of which had open reading frames with repeated homing-endonuclease motifs. The sequence similarity analysis and the phylogenetic analysis of these homing endonucleases suggested the possible phylogenetic relationship among archaeal rRNA-encoded homing endonucleases. PMID:10584021

  2. Genomic variation in macrophage-cultured European porcine reproductive and respiratory syndrome virus Olot/91 revealed using ultra-deep next generation sequencing.

    PubMed

    Lu, Zen H; Brown, Alexander; Wilson, Alison D; Calvert, Jay G; Balasch, Monica; Fuentes-Utrilla, Pablo; Loecherbach, Julia; Turner, Frances; Talbot, Richard; Archibald, Alan L; Ait-Ali, Tahar

    2014-03-04

    Porcine Reproductive and Respiratory Syndrome (PRRS) is a disease of major economic impact worldwide. The etiologic agent of this disease is the PRRS virus (PRRSV). Increasing evidence suggests that microevolution within a coexisting quasispecies population can give rise to high sequence heterogeneity in PRRSV. We developed a pipeline based on the ultra-deep next generation sequencing approach to first construct the complete genome of a European PRRSV, strain Olot/91, cultured on macrophages and then capture the rare variants representative of the mixed quasispecies population. Olot/91 differs from the reference Lelystad strain by about 5% and a total of 88 variants, with frequencies as low as 1%, were detected in the mixed population. These variants included 16 non-synonymous variants concentrated in the genes encoding structural and nonstructural proteins, including Glycoprotein 2a and 5. Using an ultra-deep sequencing methodology, the complete genome of Olot/91 was constructed without any prior knowledge of the sequence. Rare variants that constitute minor fractions of the heterogeneous PRRSV population could successfully be detected to allow further exploration of microevolutionary events.

  3. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes

    PubMed Central

    Wang, Ruijia; Nambiar, Ram; Zheng, Dinghai

    2018-01-01

    PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which were limited in number and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data. PMID:29069441

  4. Deep whole-genome sequencing of 90 Han Chinese genomes.

    PubMed

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
    Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects. © The Authors 2017. Published by Oxford University Press.

  5. Optimization of conditions to sequence long cDNAs from viruses

    USDA-ARS's Scientific Manuscript database

    Fourth generation sequencing with the MinION nanopore sequencer provides an opportunity to obtain deep coverage and long reads for single molecules. This will benefit studies on RNA viruses. In the past, Sanger, Illumina, and Ion Torrent sequencing have been utilized to study RNA viruses. Both technique...

  6. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    USDA-ARS's Scientific Manuscript database

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  7. Pm-miR-133 hosting in one potential lncRNA regulates RhoA expression in pearl oyster Pinctada martensii.

    PubMed

    Zheng, Zhe; Huang, RongLian; Tian, RongRong; Jiao, Yu; Du, Xiaodong

    2016-10-15

    Long non-coding RNAs (LncRNAs) are abundant in the genome of higher forms of eukaryotes and implicated in regulating the diversity of biological processes partly because they host microRNAs (miRNAs), which are repressors of target gene expression. In vertebrates, miR-133 regulates the differentiation and proliferation of cardiac and skeletal muscles. Pinctada martensii miR-133 (pm-miR-133) was identified in our previous research through Solexa deep sequencing. In the present study, the precise sequence of mature pm-miR-133 was validated through miR-RACE. Stem loop qRT-PCR analysis demonstrated that mature pm-miR-133 was constitutively expressed in the adductor muscle, gonad, hepatopancreas, mantle, foot, and gill of P. martensii. Among these tissues, the adductor muscle exhibited the highest pm-miR-133 expression. Target analysis indicated that pm-RhoA was the potential regulatory target of pm-miR-133. Bioinformatics analyses revealed that a potential LncRNA (designated as Lnc133) with a mature pm-miR-133 could generate a hairpin structure that was highly homologous to that of Lottia gigantea. Lnc133 was also highly expressed in the adductor muscle, gill, hepatopancreas, and gonad. Phylogenetic analysis further showed that the miR-133s derived from chordate and achordate were separated into two classes. Therefore, Lnc133 hosting pm-miR-133 could be involved in regulating the cell proliferation of adductor muscles by targeting pm-RhoA. Copyright © 2016 Elsevier B.V. All rights reserved.

  8. Identification of miRNAs during mouse postnatal ovarian development and superovulation.

    PubMed

    Khan, Hamid Ali; Zhao, Yi; Wang, Li; Li, Qian; Du, Yu-Ai; Dan, Yi; Huo, Li-Jun

    2015-07-08

    MicroRNAs are small noncoding RNAs that play critical roles in the regulation of gene expression in a wide array of tissues, including the ovary, through sequence complementarity at the post-transcriptional level. A multitude of genes involved in ovarian development and folliculogenesis could be tightly regulated by these miRNAs. Therefore, identification of tissue-specific miRNAs is considered a key step towards understanding the role of miRNAs in biological processes. To investigate the role of microRNAs during ovarian development and folliculogenesis we sequenced eight different libraries using Illumina deep sequencing technology. Different developmental stages were selected to explore miRNA expression patterns at different stages of gonadal maturation with/without treatment of PMSG/hCG for superovulation. From massive sequencing reads, clean reads of 16-26 bp were selected for differential expression analysis and novel microRNA annotation. Expression analysis of all miRNAs at different developmental stages showed that some miRNAs were present ubiquitously while others were differentially expressed at different stages. Among differentially expressed miRNAs we reported 61 miRNAs with a fold change of more than 2 at different developmental stages among all libraries. Among the up-regulated miRNAs, mmu-mir-1298 had the highest fold change with 4.025 while mmu-mir-150 was down-regulated more than 3 fold. Furthermore, we found 2659 target genes for 20 differentially expressed microRNAs using seven different target prediction programs (DIANA-mT, miRanda, miRDB, miRWalk, RNAhybrid, PICTAR5, TargetScan). Analysis of the predicted targets showed certain ovary-specific genes targeted by single or multiple microRNAs. Furthermore, pathway annotation and Gene Ontology analysis showed involvement of these microRNAs in basic cellular processes.
    These results suggest the presence of different miRNAs at different stages of ovarian development and superovulation. The potential role of these microRNAs in the regulation of different pathways, biological functions and cellular components underlying ovarian development and superovulation was elucidated using bioinformatics tools. These results provide a framework for extended analysis of miRNAs and their roles during ovarian development and superovulation. Furthermore, this study provides a base for characterization of individual miRNAs to discover their role in ovarian development and female fertility.
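
    Two steps named in the abstract above, selecting clean reads of 16-26 bp and flagging miRNAs with a fold change of more than 2, can be sketched as follows. This is a minimal illustration under stated assumptions: the reads-per-million normalization, the pseudocount, the function names, and the miRNA labels are hypothetical and are not taken from the study.

```python
def filter_by_length(reads, min_len=16, max_len=26):
    """Keep reads whose length falls in the small-RNA window (inclusive)."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def reads_per_million(counts):
    """Scale raw counts so libraries of different sizes are comparable."""
    total = sum(counts.values())
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def differential(counts_a, counts_b, min_fold=2.0, pseudocount=1.0):
    """Return {miRNA: fold change of library B over A} where the change
    exceeds min_fold in either direction. The pseudocount (an assumption
    of this sketch) avoids division by zero for undetected miRNAs."""
    rpm_a = reads_per_million(counts_a)
    rpm_b = reads_per_million(counts_b)
    hits = {}
    for name in sorted(set(rpm_a) | set(rpm_b)):
        a = rpm_a.get(name, 0.0) + pseudocount
        b = rpm_b.get(name, 0.0) + pseudocount
        fold = b / a
        if fold > min_fold or fold < 1 / min_fold:
            hits[name] = fold
    return hits


# Toy libraries with hypothetical miRNA labels.
library_a = {"miR-x": 100, "miR-y": 800, "miR-z": 100}
library_b = {"miR-x": 600, "miR-y": 300, "miR-z": 100}
hits = differential(library_a, library_b)
for name, fold in hits.items():
    print(f"{name}: fold change {fold:.2f}")
```

    A fold-change cutoff alone says nothing about statistical significance; dedicated differential expression tools model count dispersion, which this sketch omits.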

  9. MRI markers of small vessel disease in lobar and deep hemispheric intracerebral hemorrhage.

    PubMed

    Smith, Eric E; Nandigam, Kaveer R N; Chen, Yu-Wei; Jeng, Jed; Salat, David; Halpin, Amy; Frosch, Matthew; Wendell, Lauren; Fazen, Louis; Rosand, Jonathan; Viswanathan, Anand; Greenberg, Steven M

    2010-09-01

    MRI evidence of small vessel disease is common in intracerebral hemorrhage (ICH). We hypothesized that ICH caused by cerebral amyloid angiopathy (CAA) or hypertensive vasculopathy would have different distributions of MRI T2 white matter hyperintensity (WMH) and microbleeds. Data were analyzed from 133 consecutive patients with primary supratentorial ICH and adequate MRI sequences. CAA was diagnosed using the Boston criteria. WMH segmentation was performed using a validated semiautomated method. WMH and microbleeds were compared according to site of symptomatic hematoma origin (lobar versus deep) or by pattern of hemorrhages, including both hematomas and microbleeds, on MRI gradient recalled echo sequence (grouped as lobar only-probable CAA, lobar only-possible CAA, deep hemispheric only, or mixed lobar and deep hemorrhages). Patients with lobar and deep hemispheric hematoma had similar median normalized WMH volumes (19.5 cm³ versus 19.9 cm³, P=0.74) and prevalence of ≥1 microbleed (54% versus 52%, P=0.99). The supratentorial WMH distribution was similar according to hemorrhage location category; however, the prevalence of brain stem T2 hyperintensity was lower in lobar hematoma versus deep hematoma (54% versus 70%, P=0.004). Mixed ICH was common (23%). Patients with mixed ICH had large normalized WMH volumes and a posterior distribution of cortical hemorrhages similar to that seen in CAA. WMH distribution is largely similar between CAA-related and non-CAA-related ICH. Mixed lobar and deep hemorrhages are seen on MRI gradient recalled echo sequence in up to one fourth of patients; in these patients, both hypertension and CAA may be contributing to the burden of WMH.

  10. Echolocation behaviour adapted to prey in foraging Blainville's beaked whale (Mesoplodon densirostris)

    PubMed Central

    Johnson, M; Hickmott, L.S; Aguilar Soto, N; Madsen, P.T

    2007-01-01

    Toothed whales echolocating in the wild generate clicks with low repetition rates to locate prey but then produce rapid sequences of clicks, called buzzes, when attempting to capture prey. However, little is known about the factors that determine clicking rates or how prey type and behaviour influence echolocation-based foraging. Here we study Blainville's beaked whales foraging in deep water using a multi-sensor DTAG that records both outgoing echolocation clicks and echoes returning from mesopelagic prey. We demonstrate that the clicking rate at the beginning of buzzes is related to the distance between whale and prey, supporting the presumption that whales focus on a specific prey target during the buzz. One whale showed a bimodal relationship between target range and clicking rate producing abnormally slow buzz clicks while attempting to capture large echoic targets, probably schooling prey, with echo duration indicating a school diameter of up to 4.3 m. These targets were only found when the whale performed tight circling manoeuvres spending up to five times longer in water volumes with large targets than with small targets. The result indicates that toothed whales in the wild can adjust their echolocation behaviour and movement for capture of different prey on the basis of structural echo information. PMID:17986434

  11. In search of actionable targets for agrigenomics and microalgal biofuel production: sequence-structural diversity studies on algal and higher plants with a focus on GPAT protein.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar

    2013-04-01

    The triacylglycerol (TAG) pathway provides several targets for genetic engineering to optimize microalgal lipid productivity. GPAT (glycerol-3-phosphate acyltransferase) is a crucial enzyme that catalyzes the initial step of TAG biosynthesis. Despite many recent biochemical studies, a comprehensive sequence-structure analysis of GPAT across diverse lipid-yielding organisms is lacking. Hence, we performed a comparative genomic analysis of plastid-located GPAT proteins from 7 microalgae and 3 higher plant species. The close evolutionary relationships observed between red algae/diatoms and green algae/plant lineages in the phylogenetic tree were further corroborated by motif and gene structure analysis. The predicted molecular weight, amino acid composition, Instability Index, and hydropathicity profile gave an overall representation of the biochemical features of GPAT protein across the species under study. Furthermore, homology models of GPAT from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Glycine max provided deep insights into the protein architecture and substrate binding sites. Despite low sequence identity found between algal and plant GPATs, the developed models exhibited strikingly conserved topology consisting of 14 α helices and 9 β sheets arranged in two domains. However, subtle variations in amino acids of the fatty acyl binding site were identified that might influence the substrate selectivity of GPAT. Together, the results will provide useful resources to understand the functional and evolutionary relationship of GPAT and may benefit the development of engineered enzymes for augmenting algal biofuel production.

  12. MicroRNA-like RNAs from the same miRNA precursors play a role in cassava chilling responses.

    PubMed

    Zeng, Changying; Xia, Jing; Chen, Xin; Zhou, Yufei; Peng, Ming; Zhang, Weixiong

    2017-12-07

    MicroRNAs (miRNAs) are known to play important roles in various cellular processes and stress responses. MiRNAs can be identified by analyzing reads from high-throughput deep sequencing. Reads that aligned to miRNA precursors but did not correspond to canonical miRNAs were initially considered sequencing noise and excluded from further analysis. Here we report a small-RNA species of phased and half-phased miRNA-like RNAs, different from canonical miRNAs, from cassava miRNA precursors detected under four distinct chilling conditions. They can form abundant multiple small RNAs arranged along precursors in a tandem and phased or half-phased fashion. Some of these miRNA-like RNAs were experimentally confirmed by re-amplification and re-sequencing, and have a similar qRT-PCR detection ratio as their cognate canonical miRNAs. The target genes of those phased and half-phased miRNA-like RNAs function in the processes of cell growth and metabolism and play roles in protein kinase activity. Half-phased miR171d.3 was confirmed to have cleavage activity on its target gene P-glycoprotein 11, a broad substrate efflux pump across cellular membranes, which is thought to provide protection for tropical cassava during a sharp temperature decrease. Our results showed that the RNAs from miRNA precursors are miRNA-like small RNAs that are viable negative gene regulators and may have potential functions in cassava chilling responses.

  13. Identifying EGFR-Expressed Cells and Detecting EGFR Multi-Mutations at Single-Cell Level by Microfluidic Chip

    NASA Astrophysics Data System (ADS)

    Li, Ren; Zhou, Mingxing; Li, Jine; Wang, Zihua; Zhang, Weikai; Yue, Chunyan; Ma, Yan; Peng, Hailin; Wei, Zewen; Hu, Zhiyuan

    2018-03-01

    Companion diagnostics for EGFR mutations have proved crucial for the efficacy of tyrosine kinase inhibitor targeted cancer therapies. Uncovering multiple mutations that occur in a minority of EGFR-mutated cells, which may be masked by noise from the majority of unmutated cells, is becoming an urgent clinical requirement. Here we present the validation of a microfluidic-chip-based method for detecting EGFR multi-mutations at the single-cell level. By trapping and immunofluorescently imaging single cells in specifically designed silicon microwells, the EGFR-expressing cells were easily identified. By in situ lysing of single cells, the cell lysates of EGFR-expressing cells were retrieved without cross-contamination. Because the noise from cells without EGFR expression was excluded, simple and cost-effective Sanger sequencing, rather than expensive deep sequencing of the whole cell population, could be used to discover multi-mutations. We verified the new method by precisely detecting the three most important EGFR drug-related mutations in a sample in which EGFR-mutated cells account for only a small percentage of the whole cell population. The microfluidic chip is capable of discovering not only the existence of specific EGFR multi-mutations, but also other valuable single-cell-level information: in which specific cells the mutations occurred, or whether different mutations coexist in the same cells. This microfluidic chip constitutes a promising method for making simple and cost-effective Sanger sequencing a routine test before performing targeted cancer therapy.

  14. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    PubMed Central

    Cho, In-Sook; Igori, Davaajargal; Lim, Seungmo; Choi, Gug-Seoun; Hammond, John; Lim, Hyoun-Sub; Moon, Jae Sun

    2016-01-01

    Deep sequencing generated 52 contigs derived from five viruses: Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV), which were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented by the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV, all of which were found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. The deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown viruses in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time. PMID:27721694

  15. Deep sequencing approaches for the analysis of prokaryotic transcriptional boundaries and dynamics.

    PubMed

    James, Katherine; Cockell, Simon J; Zenkin, Nikolay

    2017-05-01

    The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively simple genomes of prokaryotes. Traditional methods for the study of these areas, such as tiling arrays, are noisy, labour-intensive and lack the resolution required for densely-packed bacterial genomes. Recently, deep sequencing has become increasingly popular for the study of the transcriptome due to its lower costs, higher accuracy and single nucleotide resolution. These methods have revolutionised our understanding of prokaryotic transcriptional dynamics. Here, we review the deep sequencing and data analysis techniques that are available for the study of transcription in prokaryotes, and discuss the bioinformatic considerations of these analyses. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Insertion sequences enrichment in extreme Red sea brine pool vent.

    PubMed

    Elbehery, Ali H A; Aziz, Ramy K; Siam, Rania

    2017-03-01

    Mobile genetic elements are major agents of genome diversification and evolution. Limited studies addressed their characteristics, including abundance, and role in extreme habitats. One of the rare natural habitats exposed to multiple-extreme conditions, including high temperature, salinity and concentration of heavy metals, are the Red Sea brine pools. We assessed the abundance and distribution of different mobile genetic elements in four Red Sea brine pools including the world's largest known multiple-extreme deep-sea environment, the Red Sea Atlantis II Deep. We report a gradient in the abundance of mobile genetic elements, dramatically increasing in the harshest environment of the pool. Additionally, we identified a strong association between the abundance of insertion sequences and extreme conditions, being highest in the harshest and deepest layer of the Red Sea Atlantis II Deep. Our comparative analyses of mobile genetic elements in secluded, extreme and relatively non-extreme environments, suggest that insertion sequences predominantly contribute to polyextremophiles genome plasticity.

  17. DeepLoc: prediction of protein subcellular localization using deep learning.

    PubMed

    Almagro Armenteros, José Juan; Sønderby, Casper Kaae; Sønderby, Søren Kaae; Nielsen, Henrik; Winther, Ole

    2017-11-01

    The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. Here, we present a prediction algorithm using deep neural networks to predict protein subcellular localization relying only on sequence information. At its core, the prediction model uses a recurrent neural network that processes the entire protein sequence and an attention mechanism identifying protein regions important for the subcellular localization. The model was trained and tested on a protein dataset extracted from one of the latest UniProt releases, in which experimentally annotated proteins follow more stringent criteria than previously. We demonstrate that our model achieves a good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information. Availability: The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc. Example code is available at https://github.com/JJAlmagro/subcellular_localization. The dataset is available at http://www.cbs.dtu.dk/services/DeepLoc/data.php. Contact: jjalma@dtu.dk. © The Author 2017. Published by Oxford University Press.

  18. Musculoskeletal MRI findings of juvenile localized scleroderma.

    PubMed

    Eutsler, Eric P; Horton, Daniel B; Epelman, Monica; Finkel, Terri; Averill, Lauren W

    2017-04-01

    Juvenile localized scleroderma comprises a group of autoimmune conditions often characterized clinically by an area of skin hardening. In addition to superficial changes in the skin and subcutaneous tissues, juvenile localized scleroderma may involve the deep soft tissues, bones and joints, possibly resulting in functional impairment and pain in addition to cosmetic changes. There is literature documenting the spectrum of findings for deep involvement of localized scleroderma (fascia, muscles, tendons, bones and joints) in adults, but there is limited literature for the condition in children. We aimed to document the spectrum of musculoskeletal magnetic resonance imaging (MRI) findings of both superficial and deep juvenile localized scleroderma involvement in children and to evaluate the utility of various MRI sequences for detecting those findings. Two radiologists retrospectively evaluated 20 MRI studies of the extremities in 14 children with juvenile localized scleroderma. Each imaging sequence was also given a subjective score of 0 (not useful), 1 (somewhat useful) or 2 (most useful for detecting the findings). Deep tissue involvement was detected in 65% of the imaged extremities. Fascial thickening and enhancement were seen in 50% of imaged extremities. Axial T1, axial T1 fat-suppressed (FS) contrast-enhanced and axial fluid-sensitive sequences were rated most useful. Fascial thickening and enhancement were the most commonly encountered deep tissue findings in extremity MRIs of children with juvenile localized scleroderma. Because abnormalities of the skin, subcutaneous tissues and fascia tend to run longitudinally in an affected limb, axial T1, axial fluid-sensitive and axial T1-FS contrast-enhanced sequences should be included in the imaging protocol.

  19. Dissecting enzyme function with microfluidic-based deep mutational scanning.

    PubMed

    Romero, Philip A; Tran, Tuan M; Abate, Adam R

    2015-06-09

    Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.

  20. The sponge microbiome project.

    PubMed

    Moitinho-Silva, Lucas; Nielsen, Shaun; Amir, Amnon; Gonzalez, Antonio; Ackermann, Gail L; Cerrano, Carlo; Astudillo-Garcia, Carmen; Easson, Cole; Sipkema, Detmer; Liu, Fang; Steinert, Georg; Kotoulas, Giorgos; McCormack, Grace P; Feng, Guofang; Bell, James J; Vicente, Jan; Björk, Johannes R; Montoya, Jose M; Olson, Julie B; Reveillaud, Julie; Steindler, Laura; Pineda, Mari-Carmen; Marra, Maria V; Ilan, Micha; Taylor, Michael W; Polymenakou, Paraskevi; Erwin, Patrick M; Schupp, Peter J; Simister, Rachel L; Knight, Rob; Thacker, Robert W; Costa, Rodrigo; Hill, Russell T; Lopez-Legentil, Susanna; Dailianis, Thanos; Ravasi, Timothy; Hentschel, Ute; Li, Zhiyong; Webster, Nicole S; Thomas, Torsten

    2017-10-01

    Marine sponges (phylum Porifera) are a diverse, phylogenetically deep-branching clade known for forming intimate partnerships with complex communities of microorganisms. To date, 16S rRNA gene sequencing studies have largely utilised different extraction and amplification methodologies to target the microbial communities of a limited number of sponge species, severely limiting comparative analyses of sponge microbial diversity and structure. Here, we provide an extensive and standardised dataset that will facilitate sponge microbiome comparisons across large spatial, temporal, and environmental scales. Samples from marine sponges (n = 3569 specimens), seawater (n = 370), marine sediments (n = 65) and other environments (n = 29) were collected from different locations across the globe. This dataset incorporates at least 268 different sponge species, including several yet unidentified taxa. The V4 region of the 16S rRNA gene was amplified and sequenced from extracted DNA using standardised procedures. Raw sequences (total of 1.1 billion sequences) were processed and clustered with (i) a standard protocol using QIIME closed-reference picking resulting in 39 543 operational taxonomic units (OTU) at 97% sequence identity, (ii) a de novo clustering using Mothur resulting in 518 246 OTUs, and (iii) a new high-resolution Deblur protocol resulting in 83 908 unique bacterial sequences. Abundance tables, representative sequences, taxonomic classifications, and metadata are provided. This dataset represents a comprehensive resource of sponge-associated microbial communities based on 16S rRNA gene sequences that can be used to address overarching hypotheses regarding host-associated prokaryotes, including host specificity, convergent evolution, environmental drivers of microbiome structure, and the sponge-associated rare biosphere. © The Authors 2017. Published by Oxford University Press.

  1. Complete genome sequence of Southern tomato virus naturally infecting tomatoes in Bangladesh using small RNA deep sequencing

    USDA-ARS's Scientific Manuscript database

    The complete genome sequence of a Southern tomato virus (STV) isolate on tomato plants in a seed production field in Bangladesh was obtained for the first time using next generation sequencing. The identified isolate STV_BD-13 shares a high degree of sequence identity (99%) with several known STV isol...

  2. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    USDA-ARS's Scientific Manuscript database

    The complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China was elucidated using small RNA deep sequencing. The identified STV_CN12 shares 99% sequence identity with other isolates from Mexico, France, Spain, and the U.S. This is the first report ...

  3. Deep whole-genome sequencing of 100 southeast Asian Malays.

    PubMed

    Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2013-01-10

    Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  4. Deep Whole-Genome Sequencing of 100 Southeast Asian Malays

    PubMed Central

    Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2013-01-01

    Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies. PMID:23290073

  5. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2016-09-01

    Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To achieve this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and performs similarly to the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
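
    The "pairwise ranking" view of AUC mentioned in the abstract above can be illustrated with a minimal Python sketch. It only computes the exact empirical AUC for a binary labeling; it does not reproduce the polynomial approximation or the gradient-based training described by the authors, and the function name `empirical_auc` is a hypothetical choice made for this illustration.

```python
def empirical_auc(scores, labels):
    """Empirical AUC as the fraction of (positive, negative) pairs in which
    the positive example receives the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # pair ranked correctly
            elif p == n:
                correct += 0.5   # tie: half credit
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores: one of the four positive/negative pairs is mis-ranked.
    print(empirical_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

    Because the value depends only on how positives are ranked relative to negatives, it does not change when one class greatly outnumbers the other, which is the property the abstract relies on for imbalanced labels.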

  6. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling

    PubMed Central

    Wang, Sheng; Sun, Siqi

    2017-01-01

    Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To achieve this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and performs similarly to the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC. PMID:28884168

  7. Molecular Definition of Vaginal Microbiota in East African Commercial Sex Workers

    PubMed Central

    Schellenberg, John J.; Links, Matthew G.; Hill, Janet E.; Dumonceaux, Tim J.; Kimani, Joshua; Jaoko, Walter; Wachihi, Charles; Mungai, Jane Njeri; Peters, Geoffrey A.; Tyler, Shaun; Graham, Morag; Severini, Alberto; Fowke, Keith R.; Ball, T. Blake; Plummer, Francis A.

    2011-01-01

    Resistance to HIV infection in a cohort of commercial sex workers living in Nairobi, Kenya, is linked to mucosal and antiinflammatory factors that may be influenced by the vaginal microbiota. Since bacterial vaginosis (BV), a polymicrobial dysbiosis characterized by low levels of protective Lactobacillus organisms, is an established risk factor for HIV infection, we investigated whether vaginal microbiology was associated with HIV-exposed seronegative (HESN) or HIV-seropositive (HIV+) status in this cohort. A subset of 44 individuals was selected for deep-sequencing analysis based on the chaperonin 60 (cpn60) universal target (UT), including HESN individuals (n = 16), other HIV-seronegative controls (HIV-N, n = 16), and HIV+ individuals (n = 12). Our findings indicate exceptionally high phylogenetic resolution of the cpn60 UT using reads as short as 200 bp, with 54 species in 29 genera detected in this group. Contrary to our initial hypothesis, few differences between HESN and HIV-N women were observed. Several HIV+ women had distinct profiles dominated by Escherichia coli. The deep-sequencing phylogenetic profile of the vaginal microbiota corresponds closely to BV+ and BV− diagnoses by microscopy, elucidating BV at the molecular level. A cluster of samples with intermediate abundance of Lactobacillus and dominant Gardnerella was identified, defining a distinct BV phenotype that may represent a transitional stage between BV+ and BV−. Several alpha- and betaproteobacteria, including the recently described species Variovorax paradoxus, were found to correlate positively with increased Lactobacillus levels that define the BV− (“normal”) phenotype. We conclude that cpn60 UT is ideally suited to next-generation sequencing technologies for further investigation of microbial community dynamics and mucosal immunity underlying HIV resistance in this cohort. PMID:21531840

  8. Deep sequencing-based analysis of the anaerobic stimulon in Neisseria gonorrhoeae

    PubMed Central

    2011-01-01

    Background: Maintenance of an anaerobic denitrification system in the obligate human pathogen, Neisseria gonorrhoeae, suggests that an anaerobic lifestyle may be important during the course of infection. Furthermore, mounting evidence suggests that reduction of host-produced nitric oxide has several immunomodulatory effects on the host. However, at this point there have been no studies analyzing the complete gonococcal transcriptome response to anaerobiosis. Here we performed deep sequencing to compare the gonococcal transcriptomes of aerobically and anaerobically grown cells. Using the information derived from this sequencing, we discuss the implications of the robust transcriptional response to anaerobic growth. Results: We determined that 198 chromosomal genes were differentially expressed (~10% of the genome) in response to anaerobic conditions. We also observed a large induction of genes encoded within the cryptic plasmid, pJD1. Validation of RNA-seq data using translational-lacZ fusions or RT-PCR demonstrated the RNA-seq results to be very reproducible. Surprisingly, many genes of prophage origin were induced anaerobically, as well as several transcriptional regulators previously unknown to be involved in anaerobic growth. We also confirmed expression and regulation of a small RNA, likely a functional equivalent of fnrS in the Enterobacteriaceae family. We also determined that many genes found to be responsive to anaerobiosis have also been shown to be responsive to iron and/or oxidative stress. Conclusions: Gonococci will be subject to many forms of environmental stress, including oxygen-limitation, during the course of infection. Here we determined that the anaerobic stimulon in gonococci was larger than previous studies would suggest. Many new targets for future research have been uncovered, and the results derived from this study may have helped to elucidate factors or mechanisms of virulence that may have otherwise been overlooked. PMID:21251255

  9. Modeling genome coverage in single-cell sequencing

    PubMed Central

    Daley, Timothy; Smith, Andrew D.

    2014-01-01

    Motivation: Single-cell DNA sequencing is necessary for examining genetic variation at the cellular level, which remains hidden in bulk sequencing experiments. But because such experiments begin with very small amounts of starting material, the amount of information obtained from a single-cell sequencing experiment is highly sensitive to the choice of protocol employed and variability in library preparation. In particular, the fraction of the genome represented in single-cell sequencing libraries exhibits extreme variability due to quantitative biases in amplification and loss of genetic material. Results: We propose a method to predict the genome coverage of a deep sequencing experiment using information from an initial shallow sequencing experiment mapped to a reference genome. The observed coverage statistics are used in a non-parametric empirical Bayes Poisson model to estimate the gain in coverage from deeper sequencing. This approach allows researchers to know statistical features of deep sequencing experiments without actually sequencing deeply, providing a basis for optimizing and comparing single-cell sequencing protocols or screening libraries. Availability and implementation: The method is available as part of the preseq software package. Source code is available at http://smithlabresearch.org/preseq. Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:25107873
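
    For orientation only, the following Python sketch shows the idealised baseline that such coverage predictions are usually contrasted with: under uniform random sampling of reads (the classical Lander-Waterman/Poisson assumption), the expected fraction of the genome covered at mean depth c is 1 - exp(-c). This is not the non-parametric empirical Bayes model of the preseq package; single-cell libraries are noted in the abstract to violate the uniformity assumption. The function name and the numbers in the usage example are hypothetical.

```python
from math import exp


def expected_covered_fraction(n_reads, read_length, genome_size):
    """Expected fraction of genome positions covered at least once,
    assuming reads are placed uniformly at random (Poisson approximation)."""
    mean_depth = n_reads * read_length / genome_size
    return 1.0 - exp(-mean_depth)


if __name__ == "__main__":
    # Hypothetical comparison of a shallow and a ten-fold deeper run.
    genome = 3_000_000_000
    for reads in (3_000_000, 30_000_000):
        frac = expected_covered_fraction(reads, 100, genome)
        print(f"{reads} reads of 100 bp: expected covered fraction {frac:.4f}")
```

    Amplification bias and loss of material typically make the observed coverage of a single-cell library fall short of this idealised curve, which is the motivation the abstract gives for a data-driven estimate.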

  10. Targeted Enrichment of Large Gene Families for Phylogenetic Inference: Phylogeny and Molecular Evolution of Photosynthesis Genes in the Portullugo Clade (Caryophyllales).

    PubMed

    Moore, Abigail J; Vos, Jurriaan M De; Hancock, Lillian P; Goolsby, Eric; Edwards, Erika J

    2018-05-01

    Hybrid enrichment is an increasingly popular approach for obtaining hundreds of loci for phylogenetic analysis across many taxa quickly and cheaply. The genes targeted for sequencing are typically single-copy loci, which facilitate a more straightforward sequence assembly and homology assignment process. However, this approach limits the inclusion of most genes of functional interest, which often belong to multi-gene families. Here, we demonstrate the feasibility of including large gene families in hybrid enrichment protocols for phylogeny reconstruction and subsequent analyses of molecular evolution, using a new set of bait sequences designed for the "portullugo" (Caryophyllales), a moderately sized lineage of flowering plants (~2200 species) that includes the cacti and harbors many evolutionary transitions to C4 and CAM photosynthesis. Including multi-gene families allowed us to simultaneously infer a robust phylogeny and construct a dense sampling of sequences for a major enzyme of C4 and CAM photosynthesis, which revealed the accumulation of adaptive amino acid substitutions associated with C4 and CAM origins in particular paralogs. Our final set of matrices for phylogenetic analyses included 75-218 loci across 74 taxa, with ~50% matrix completeness across data sets. Phylogenetic resolution was greatly improved across the tree, at both shallow and deep levels. Concatenation and coalescent-based approaches both resolve the sister lineage of the cacti with strong support: Anacampserotaceae + Portulacaceae, two lineages of mostly diminutive succulent herbs of warm, arid regions. In spite of this congruence, BUCKy concordance analyses demonstrated strong and conflicting signals across gene trees. Our results add to the growing number of examples illustrating the complexity of phylogenetic signals in genomic-scale data.

  11. Discovery of Pod Shatter-Resistant Associated SNPs by Deep Sequencing of a Representative Library Followed by Bulk Segregant Analysis in Rapeseed

    PubMed Central

    Huang, Shunmou; Yang, Hongli; Zhan, Gaomiao; Wang, Xinfa; Liu, Guihua; Wang, Hanzhong

    2012-01-01

    Background: Single nucleotide polymorphisms (SNPs) are an important class of genetic marker for target gene mapping. As of yet, there is no rapid and effective method to identify SNPs linked with agronomic traits in rapeseed and other crop species. Methodology/Principal Findings: We demonstrate a novel method for identifying SNP markers in rapeseed by deep sequencing a representative library and performing bulk segregant analysis. With this method, SNPs associated with rapeseed pod shatter-resistance were discovered. Firstly, a reduced representation of the rapeseed genome was used. Genomic fragments ranging from 450–550 bp were prepared from the susceptible bulk (ten F2 plants with the silique shattering resistance index, SSRI <0.10) and the resistant bulk (ten F2 plants with SSRI >0.90), and Solexa sequencing produced 90 bp reads. Approximately 50 million of these sequence reads were assembled into contigs to a depth of 20-fold coverage. Secondly, 60,396 ‘simple SNPs’ were identified, and the statistical significance was evaluated using Fisher's exact test. The 70 associated SNPs whose –log10 p value exceeded 16 were selected for further analysis. These SNPs included a tight cluster of 14 associated SNPs within a 396 kb region on chromosome A09. Our evidence indicates that this region contains a major quantitative trait locus (QTL). Finally, two associated SNPs from this region were mapped on a major QTL region. Conclusions/Significance: 70 associated SNPs were discovered and a major QTL for rapeseed pod shatter-resistance was found on chromosome A09 using our novel method. The associated SNP markers were used for mapping of the QTL, and may be useful for improving pod shatter-resistance in rapeseed through marker-assisted selection and map-based cloning. This approach will accelerate the discovery of major QTLs and the cloning of functional genes for important agronomic traits in rapeseed and other crop species. PMID:22529909
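
    The per-SNP significance step described in the abstract above (Fisher's exact test on allele counts from the two bulks, followed by a –log10 p cutoff) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' pipeline; the 2x2 layout (rows are bulks, columns are alleles), the function names, and the read counts in the usage example are assumptions made for the example.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(a + b + c + d, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so equally likely tables are not lost to rounding.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def neg_log10_p(a, b, c, d):
    """Return -log10 of the two-sided p value, the scale used for the cutoff."""
    return -log10(fisher_exact_two_sided(a, b, c, d))


if __name__ == "__main__":
    # Hypothetical read counts: reference/alternate allele in each bulk.
    susceptible = (38, 2)
    resistant = (3, 37)
    print(neg_log10_p(*susceptible, *resistant))
```

    A SNP whose alleles are sharply divided between the two bulks gives a small p value and therefore a large –log10 p; the abstract reports selecting SNPs with values over 16.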

  12. Looking beyond the exome: a phenotype-first approach to molecular diagnostic resolution in rare and undiagnosed diseases.

    PubMed

    Pena, Loren D M; Jiang, Yong-Hui; Schoch, Kelly; Spillmann, Rebecca C; Walley, Nicole; Stong, Nicholas; Rapisardo Horn, Sarah; Sullivan, Jennifer A; McConkie-Rosell, Allyn; Kansagra, Sujay; Smith, Edward C; El-Dairi, Mays; Bellet, Jane; Keels, Martha Ann; Jasien, Joan; Kranz, Peter G; Noel, Richard; Nagaraj, Shashi K; Lark, Robert K; Wechsler, Daniel S G; Del Gaudio, Daniela; Leung, Marco L; Hendon, Laura G; Parker, Collette C; Jones, Kelly L; Goldstein, David B; Shashi, Vandana

    2018-04-01

    Purpose: To describe examples of missed pathogenic variants on whole-exome sequencing (WES) and the importance of deep phenotyping for further diagnostic testing. Methods: Guided by phenotypic information, three children with negative WES underwent targeted single-gene testing. Results: Individual 1 had a clinical diagnosis consistent with infantile systemic hyalinosis, although WES and a next-generation sequencing (NGS)-based ANTXR2 test were negative. Sanger sequencing of ANTXR2 revealed a homozygous single base pair insertion, previously missed by the WES variant caller software. Individual 2 had neurodevelopmental regression and cerebellar atrophy, with no diagnosis on WES. New clinical findings prompted Sanger sequencing and copy number testing of PLA2G6. A novel homozygous deletion of the noncoding exon 1 (not included in the WES capture kit) was detected, with extension into the promoter, confirming the clinical suspicion of infantile neuroaxonal dystrophy. Individual 3 had progressive ataxia, spasticity, and magnetic resonance image changes of vanishing white matter leukoencephalopathy. An NGS leukodystrophy gene panel and WES showed a heterozygous pathogenic variant in EIF2B5; no deletions/duplications were detected. Sanger sequencing of EIF2B5 showed a frameshift indel, probably missed owing to failure of alignment. Conclusion: These cases illustrate potential pitfalls of WES/NGS testing and the importance of phenotype-guided molecular testing in yielding diagnoses.

  13. DNMT1-interacting RNAs block gene specific DNA methylation

    PubMed Central

    Di Ruscio, Annalisa; Ebralidze, Alexander K.; Benoukraf, Touati; Amabile, Giovanni; Goff, Loyal A.; Terragni, Joylon; Figueroa, Maria Eugenia; De Figureido Pontes, Lorena Lobo; Alberich-Jorda, Meritxell; Zhang, Pu; Wu, Mengchu; D’Alò, Francesco; Melnick, Ari; Leone, Giuseppe; Ebralidze, Konstantin K.; Pradhan, Sriharsa; Rinn, John L.; Tenen, Daniel G.

    2013-01-01

    Summary: DNA methylation was described almost a century ago. However, the rules governing its establishment and maintenance remain elusive. Here, we present data demonstrating that active transcription regulates levels of genomic methylation. We identified a novel RNA arising from the CEBPA gene locus critical in regulating the local DNA methylation profile. This RNA binds to DNMT1 and prevents CEBPA gene locus methylation. Deep sequencing of transcripts associated with DNMT1 combined with genome-scale methylation and expression profiling extended the generality of this finding to numerous gene loci. Collectively, these results delineate the nature of DNMT1-RNA interactions and suggest strategies for gene selective demethylation of therapeutic targets in disease. PMID:24107992

  14. The Deep Space Network

    NASA Technical Reports Server (NTRS)

    1975-01-01

    The primary objectives during this portion of the extended mission were to assure survival of the spacecraft for a third Mercury encounter through conservation of attitude control gas and to conduct trajectory correction maneuvers (TCMs) as necessary to target the spacecraft for a solar occultation zone pass. Special support activities included TCMs 6 and 7 conducted on October 30, 1974 and on February 12-13, 1975, respectively. This period also saw the DSN interface organization involved in (1) the allocation of sufficient coverage to assure accurate orbit determination solutions, (2) monitoring of DSN implementation for Viking to assure maintenance of compatible interfaces and capabilities required for Mariner 10, and (3) the development of encounter coverage, sequences, and readiness test plans.

  15. Rapidly evolving homing CRISPR barcodes

    PubMed Central

    Kalhor, Reza; Mali, Prashant; Church, George M.

    2017-01-01

    We present here an approach for engineering evolving DNA barcodes in living cells. The methodology entails using a homing guide RNA (hgRNA) scaffold that directs the Cas9-hgRNA complex to target the DNA locus of the hgRNA itself. We show that this homing CRISPR-Cas9 system acts as an expressed genetic barcode that diversifies its sequence and that the rate of diversification can be controlled in cultured cells. We further evaluate these barcodes in cell populations and show the barcode RNAs can be assayed as single molecules in situ. This integrated approach will have wide ranging applications, such as in deep lineage tracing, cellular barcoding, molecular recording, dissecting cancer biology, and connectome mapping. PMID:27918539

  16. Characterization of Microbial Communities Associated With Deep-Sea Hydrothermal Vent Animals of the East Pacific Rise and the Galápagos Rift

    NASA Astrophysics Data System (ADS)

    Ward, N.; Page, S.; Heidelberg, J.; Eisen, J. A.; Fraser, C. M.

    2002-12-01

    The composition of microbial communities associated with deep-sea hydrothermal vent animals is of interest because of the key role of bacterial symbionts in driving the chemosynthetic food chain of the vent system, and also because bacterial biofilms attached to animal exterior surfaces may play a part in settlement of larval forms. Sequence analysis of 16S ribosomal RNA (rRNA) genes from such communities provides a snapshot of community structure, as this gene is present in all Bacteria and Archaea, and a useful phylogenetic marker for both cultivated microbial species, and uncultivated species such as many of those found in the deep-sea environment. Specimens of giant tube worms (Riftia pachyptila), mussels (Bathymodiolus thermophilus), and clams (Calyptogena magnifica) were collected during the 2002 R/V Atlantis research cruises to the East Pacific Rise (9N) and Galápagos Rift. Microbial biofilms attached to the exterior surfaces of individual animals were sampled, as were tissues known to harbor chemosynthetic bacterial endosymbionts. Genomic DNA was extracted from the samples using a commercially available kit, and 16S rRNA genes amplified from the mixed bacterial communities using the polymerase chain reaction (PCR) and oligonucleotide primers targeting conserved terminal regions of the 16S rRNA gene. The PCR products obtained were cloned into a plasmid vector and the recombinant plasmids transformed into cells of Escherichia coli. Individual cloned 16S rRNA genes were sequenced at the 5' end of the gene (the most phylogenetically informative region in most taxa) and the sequence data compared to publicly available gene sequence databases, to allow a preliminary assignment of clones to taxonomic groups within the Bacteria and Archaea, and to determine the overall composition and phylogenetic diversity of the animal-associated microbial communities. 
    Analysis of Riftia pachyptila exterior biofilm samples revealed the presence of members of the delta and epsilon proteobacteria, low GC Gram positive bacteria (firmicutes), spirochetes, CFB (Cytophaga-Flavobacterium-Bacteroides) group, green nonsulfur bacteria, acidobacteria, verrucomicrobia, and planctomycetes. The presence of the latter three taxonomic groups is of special interest, as they represent phylogenetically distinct groups within the Bacteria for which specific ecological functions have not yet been identified, but which have been found to be widely distributed and often numerically significant in diverse terrestrial and aquatic habitats. Although further sequencing is required to demonstrate the presence of a Riftia-associated microbial population distinct from that of the surrounding seawater, results available from three Riftia individuals from the East Pacific Rise suggest this to be the case. Analysis of microbial communities associated with the gill tissue of the mussel Bathymodiolus thermophilus shows a population dominated by gamma-Proteobacterial chemoautotrophic symbionts, although lower frequency novel phylotypes have been detected. Representatives of specific taxonomic groups have been selected for sequencing of the complete 16S rRNA gene, and the sequences used to reconstruct phylogenetic trees to more accurately determine the evolutionary relationships between the novel sequences, and available sequences for both cultured and non-cultured bacteria.

  17. CRISPRdirect: software for designing CRISPR/Cas guide RNA with reduced off-target sites

    PubMed Central

    Naito, Yuki; Hino, Kimihiro; Bono, Hidemasa; Ui-Tei, Kumiko

    2015-01-01

    Summary: CRISPRdirect is a simple and functional web server for selecting rational CRISPR/Cas targets from an input sequence. The CRISPR/Cas system is a promising technique for genome engineering which allows target-specific cleavage of genomic DNA guided by Cas9 nuclease in complex with a guide RNA (gRNA), that complementarily binds to a ∼20 nt targeted sequence. The target sequence requirements are twofold. First, the 5′-NGG protospacer adjacent motif (PAM) sequence must be located adjacent to the target sequence. Second, the target sequence should be specific within the entire genome in order to avoid off-target editing. CRISPRdirect enables users to easily select rational target sequences with minimized off-target sites by performing exhaustive searches against genomic sequences. The server currently incorporates the genomic sequences of human, mouse, rat, marmoset, pig, chicken, frog, zebrafish, Ciona, fruit fly, silkworm, Caenorhabditis elegans, Arabidopsis, rice, Sorghum and budding yeast. Availability: Freely available at http://crispr.dbcls.jp/. Contact: y-naito@dbcls.rois.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25414360
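
    The first target requirement described above (a ~20 nt sequence with an adjacent NGG PAM) can be illustrated with a short sketch. This is not CRISPRdirect's implementation: the function names are hypothetical, the PAM is assumed to sit immediately 3' of the target, and the genome-wide specificity search (the second requirement) is not attempted.

```python
import re

# Complement table; any other character raises KeyError in revcomp().
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def find_targets(seq, target_len=20):
    """List (strand, start, target, pam) for each target followed by NGG.

    'start' is the 0-based offset in the scanned strand; for the '-'
    strand it refers to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", revcomp(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    for hit in find_targets("A" * 20 + "TGG"):
        print(hit)
```

    Each candidate returned this way would still have to be checked against the whole genome for near-identical sites before it could be considered specific.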

  18. A deep learning approach for real time prostate segmentation in freehand ultrasound guided biopsy.

    PubMed

    Anas, Emran Mohammad Abu; Mousavi, Parvin; Abolmaesumi, Purang

    2018-06-01

    Targeted prostate biopsy, incorporating multi-parametric magnetic resonance imaging (mp-MRI) and its registration with ultrasound, is currently the state-of-the-art in prostate cancer diagnosis. The registration process in most targeted biopsy systems today relies heavily on accurate segmentation of ultrasound images. Automatic or semi-automatic segmentation is typically performed offline prior to the start of the biopsy procedure. In this paper, we present a deep neural network based real-time prostate segmentation technique during the biopsy procedure, hence paving the way for dynamic registration of mp-MRI and ultrasound data. In addition to using convolutional networks for extracting spatial features, the proposed approach employs recurrent networks to exploit the temporal information among a series of ultrasound images. One of the key contributions in the architecture is to use residual convolution in the recurrent networks to improve optimization. We also exploit recurrent connections within and across different layers of the deep networks to maximize the utilization of the temporal information. Furthermore, we perform dense and sparse sampling of the input ultrasound sequence to make the network robust to ultrasound artifacts. Our architecture is trained on 2,238 labeled transrectal ultrasound images, with an additional 637 and 1,017 unseen images used for validation and testing, respectively. We obtain a mean Dice similarity coefficient of 93%, a mean surface distance error of 1.10 mm and a mean Hausdorff distance error of 3.0 mm. A comparison of the reported results with those of a state-of-the-art technique indicates statistically significant improvement achieved by the proposed approach. Copyright © 2018 Elsevier B.V. All rights reserved.
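
    The Dice similarity coefficient reported above is twice the overlap of two masks divided by their combined size. A minimal sketch follows, assuming each segmentation is given as a flat sequence of 0/1 pixel labels; the function name and the convention for two empty masks are illustrative choices, not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity between two equally sized binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (assumption).
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```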

  19. Investigation of a Canine Parvovirus Outbreak using Next Generation Sequencing.

    PubMed

    Parker, Jayme; Murphy, Molly; Hueffer, Karsten; Chen, Jack

    2017-08-29

    Canine parvovirus (CPV) outbreaks can have a devastating effect in communities with dense dog populations. The interior region of Alaska experienced a CPV outbreak in the winter of 2016 leading to the further investigation of the virus due to reports of increased morbidity and mortality occurring at dog mushing kennels in the area. Twelve rectal-swab specimens from dogs displaying clinical signs consistent with parvoviral-associated disease were processed using next-generation sequencing (NGS) methodologies by targeting RNA transcripts, and therefore detecting only replicating virus. All twelve specimens demonstrated the presence of the CPV transcriptome, with read depths ranging from 2.2X - 12,381X, genome coverage ranging from 44.8-96.5%, and representation of CPV sequencing reads to those of the metagenome background ranging from 0.0015-6.7%. Using the data generated by NGS, the presence of newly evolved, yet known, strains of both CPV-2a and CPV-2b were identified and grouped geographically. Deep-sequencing data provided additional diagnostic information in terms of investigating novel CPV in this outbreak. NGS data in addition to limited serological data provided strong diagnostic evidence that this outbreak most likely arose from unvaccinated or under-vaccinated canines, not from a novel CPV strain incapable of being neutralized by current vaccination efforts.
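
    Read depth and genome coverage, as quoted above, are usually summarized from a per-base depth profile. The sketch below assumes such a profile is already available as a list of integers (one per reference position); the function name and the minimum-depth cutoff of 1 are assumptions, since the abstract does not state how coverage was defined.

```python
def coverage_stats(depths, min_depth=1):
    """Summarize a per-base depth profile.

    Returns the fraction of positions covered by at least min_depth
    reads ("breadth") and the mean depth over all positions.
    """
    if not depths:
        raise ValueError("depth profile is empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return {
        "breadth": covered / len(depths),
        "mean_depth": sum(depths) / len(depths),
    }


if __name__ == "__main__":
    print(coverage_stats([0, 0, 4, 4]))
```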

  20. Use of sequence-independent-single-primer-amplification (SISPA) for whole genome sequencing using illumina MiSeq platform for avian influenza virus, Newcastle disease virus, and infectious bronchitis virus

    USDA-ARS's Scientific Manuscript database

    Over the past decade, Next Generation Sequencing (NGS) technologies, also called deep sequencing, have continued to evolve, increasing capacity and lowering the cost necessary for large genome sequencing projects. One of the advantages of NGS platforms is the possibility to sequence the samples with...

  1. A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification.

    PubMed

    Yildirim, Özal

    2018-05-01

    Long-short term memory networks (LSTMs), which have recently emerged in sequential data analysis, are the most widely used type of recurrent neural networks (RNNs) architecture. Progress on the topic of deep learning includes successful adaptations of deep versions of these architectures. In this study, a new model for deep bidirectional LSTM network-based wavelet sequences called DBLSTM-WS was proposed for classifying electrocardiogram (ECG) signals. For this purpose, a new wavelet-based layer is implemented to generate ECG signal sequences. The ECG signals were decomposed into frequency sub-bands at different scales in this layer. These sub-bands are used as sequences for the input of LSTM networks. New network models that include unidirectional (ULSTM) and bidirectional (BLSTM) structures are designed for performance comparisons. Experimental studies have been performed for five different types of heartbeats obtained from the MIT-BIH arrhythmia database. These five types are Normal Sinus Rhythm (NSR), Ventricular Premature Contraction (VPC), Paced Beat (PB), Left Bundle Branch Block (LBBB), and Right Bundle Branch Block (RBBB). The results show that the DBLSTM-WS model gives a high recognition performance of 99.39%. It has been observed that the wavelet-based layer proposed in the study significantly improves the recognition performance of conventional networks. This proposed network structure is an important approach that can be applied to similar signal processing problems. Copyright © 2018 Elsevier Ltd. All rights reserved.
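
    The idea of decomposing a signal into sub-bands at several scales and feeding those sub-bands to a sequence model can be sketched with a plain Haar transform. The abstract does not say which wavelet family the DBLSTM-WS layer uses, so Haar is an assumption made only for illustration, and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail).

    A trailing sample of an odd-length signal is dropped.
    """
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Return detail bands from fine to coarse, then the final approximation."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


if __name__ == "__main__":
    for band in haar_decompose([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 2):
        print(band)
```

    Each returned band is a shorter sequence than the one before it, which is what allows the bands to be treated as inputs at different temporal scales.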

  2. Protein Solvent-Accessibility Prediction by a Stacked Deep Bidirectional Recurrent Neural Network.

    PubMed

    Zhang, Buzhong; Li, Linqing; Lü, Qiang

    2018-05-25

    Residue solvent accessibility is closely related to the spatial arrangement and packing of residues. Predicting the solvent accessibility of a protein is an important step to understand its structure and function. In this work, we present a deep learning method to predict residue solvent accessibility, which is based on a stacked deep bidirectional recurrent neural network applied to sequence profiles. To capture more long-range sequence information, a merging operator was proposed when bidirectional information from hidden nodes was merged for outputs. Three types of merging operators were used in our improved model, with a long short-term memory network performing as a hidden computing node. The trained database was constructed from 7361 proteins extracted from the PISCES server using a cut-off of 25% sequence identity. Sequence-derived features including position-specific scoring matrix, physical properties, physicochemical characteristics, conservation score and protein coding were used to represent a residue. Using this method, predictive values of continuous relative solvent-accessible area were obtained, and then, these values were transformed into binary states with predefined thresholds. Our experimental results showed that our deep learning method improved prediction quality relative to current methods, with mean absolute error and Pearson's correlation coefficient values of 8.8% and 74.8%, respectively, on the CB502 dataset and 8.2% and 78%, respectively, on the Manesh215 dataset.
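
    The two evaluation measures quoted above, and the conversion of continuous predictions into binary buried/exposed states, can be written out as follows. This is a generic sketch: the 0.25 threshold is an illustrative assumption (the abstract mentions only "predefined thresholds"), and the function names are not from the paper.

```python
import math


def mean_absolute_error(pred, obs):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(pred)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def binarize(values, threshold=0.25):
    """Map relative accessibility to 1 (exposed) or 0 (buried)."""
    return [1 if v >= threshold else 0 for v in values]


if __name__ == "__main__":
    predicted = [0.1, 0.4, 0.8]
    observed = [0.2, 0.4, 0.6]
    print(mean_absolute_error(predicted, observed))
    print(pearson_r(predicted, observed))
    print(binarize(predicted))
```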

  3. MicroRNA and Transcription Factor: Key Players in Plant Regulatory Network.

    PubMed

    Samad, Abdul F A; Sajad, Muhammad; Nazaruddin, Nazaruddin; Fauzi, Izzat A; Murad, Abdul M A; Zainal, Zamri; Ismail, Ismanizan

    2017-01-01

    Recent achievements in plant microRNA (miRNA), a large class of small and non-coding RNAs, are very exciting. A wide array of techniques involving forward genetics, molecular cloning, bioinformatic analysis, and the latest technology, deep sequencing, have greatly advanced miRNA discovery. A tiny miRNA sequence has the ability to target single or multiple mRNAs. Most of the miRNA targets are transcription factors (TFs), which have paramount importance in regulating plant growth and development. Various families of TFs, which regulate a range of regulatory networks, may assist plants to grow under normal and stress environmental conditions. This review focuses on the regulatory relationships between miRNAs and different families of TFs, such as NF-Y, MYB, AP2, TCP, WRKY, NAC, GRF, and SPL. For instance, NF-Y plays an important role in drought tolerance and flower development; MYB is involved in signal transduction and biosynthesis of secondary metabolites; AP2 regulates floral development and nodule formation; and TCP directs leaf development and growth hormone signaling. WRKY has known roles in multiple stress tolerances, NAC regulates lateral root formation, GRF is involved in root growth and in flower and seed development, and SPL regulates the plant transition from juvenile to adult. We also studied the relation between miRNAs and TFs by consolidating the research findings from different plant species, which will help plant scientists in understanding the mechanism of action and interaction between these regulators in plant growth and development under normal and stress environmental conditions.

  4. Curcumin may serve an anticancer role in human osteosarcoma cell line U-2 OS by targeting ITPR1.

    PubMed

    Luo, Zhanpeng; Li, Dawei; Luo, Xiaobo; Li, Litao; Gu, Suxi; Yu, Long; Ma, Yuanzheng

    2018-04-01

    The present study aimed to determine the mechanisms of action of curcumin in osteosarcoma. Human osteosarcoma U-2 OS cells were purchased from the Cell Bank of the Chinese Academy of Sciences. RNA sequencing analysis was performed for 2 curcumin-treated samples and 2 control samples using Illumina deep sequencing technology. The differentially expressed genes (DEGs) were identified using Cufflinks software. Enrichment and protein-protein interaction (PPI) network analyses were performed separately using the clusterProfiler package and Cytoscape software to identify key genes. Then, the mRNA levels of key genes were detected by quantitative reverse transcription polymerase chain reaction (RT-qPCR) in U-2 OS cells. Finally, cell apoptosis, proliferation, migration and invasion arrays were performed. In total, 201 DEGs were identified in the curcumin-treated group. EEF1A1 (degree=88), ATF7IP, HIF1A, SMAD7, CLTC, MCM10, ITPR1, ADAM15, WWP2 and ATP5C1, which were enriched in 'biological process', exhibited higher degrees than other genes in the PPI network. RT-qPCR demonstrated that treatment with curcumin was able to significantly increase the levels of CLTC and ITPR1 mRNA in curcumin-treated cells compared with control. In addition, targeting ITPR1 with curcumin significantly promoted apoptosis and suppressed proliferation, migration and invasion. Targeting ITPR1 via curcumin may serve an anticancer role by mediating apoptosis, proliferation, migration and invasion in U-2 OS cells.

  5. Population genomics of C. melanopterus using target gene capture data: demographic inferences and conservation perspectives

    PubMed Central

    Maisano Delser, Pierpaolo; Corrigan, Shannon; Hale, Matthew; Li, Chenhong; Veuille, Michel; Planes, Serge; Naylor, Gavin; Mona, Stefano

    2016-01-01

    Population genetics studies on non-model organisms typically involve sampling few markers from multiple individuals. Next-generation sequencing approaches open up the possibility of sampling many more markers from fewer individuals to address the same questions. Here, we applied a target gene capture method to deep sequence ~1000 independent autosomal regions of a non-model organism, the blacktip reef shark (Carcharhinus melanopterus). We devised a sampling scheme based on the predictions of theoretical studies of metapopulations to show that sampling few individuals, but many loci, can be extremely informative to reconstruct the evolutionary history of species. We collected data from a single deme (SID) from Northern Australia and from a scattered sampling representing various locations throughout the Indian Ocean (SCD). We explored the genealogical signature of population dynamics detected from both sampling schemes using an ABC algorithm. We then contrasted these results with those obtained by fitting the data to a non-equilibrium finite island model. Both approaches supported an Nm value ~40, consistent with philopatry in this species. Finally, we demonstrate through simulation that metapopulations exhibit greater resilience to recent changes in effective size compared to unstructured populations. We propose an empirical approach to detect recent bottlenecks based on our sampling scheme. PMID:27651217

  6. Identification and Characterization of microRNAs during Maize Grain Filling

    PubMed Central

    Lv, Panqing; Peng, Qian; Ding, Dong; Li, Weihua; Tang, Jihua

    2015-01-01

    The grain filling rate is closely associated with final grain yield of maize during the period of maize grain filling. To identify the key microRNAs (miRNAs) and miRNA-dependent gene regulation networks of grain filling in maize, a deep-sequencing technique was used to research the dynamic expression patterns of miRNAs at four distinct developmental grain filling stages in Zhengdan 958, which is an elite hybrid and cultivated widely in China. The sequencing result showed that the expression amount of almost all miRNAs was changing with the development of the grain filling and formed in seven groups. After normalization, 77 conserved miRNAs and 74 novel miRNAs were co-detected in these four samples. Eighty-one out of 162 targets of the conserved miRNAs belonged to transcriptional regulation (81, 50%), followed by oxidoreductase activity (18, 11%), signal transduction (16, 10%) and development (15, 9%). The result showed that miRNA 156, 393, 396 and 397, with their respective targets, might play key roles in the grain filling rate by regulating maize growth, development and environment stress response. The result also offered novel insights into the dynamic change of miRNAs during the developing process of maize kernels and assisted in the understanding of how miRNAs are functioning about the grain filling rate. PMID:25951054

  7. Identification and Characterization of microRNAs during Maize Grain Filling.

    PubMed

    Jin, Xining; Fu, Zhiyuan; Lv, Panqing; Peng, Qian; Ding, Dong; Li, Weihua; Tang, Jihua

    2015-01-01

    The grain filling rate is closely associated with final grain yield of maize during the period of maize grain filling. To identify the key microRNAs (miRNAs) and miRNA-dependent gene regulation networks of grain filling in maize, a deep-sequencing technique was used to research the dynamic expression patterns of miRNAs at four distinct developmental grain filling stages in Zhengdan 958, which is an elite hybrid and cultivated widely in China. The sequencing result showed that the expression amount of almost all miRNAs was changing with the development of the grain filling and formed in seven groups. After normalization, 77 conserved miRNAs and 74 novel miRNAs were co-detected in these four samples. Eighty-one out of 162 targets of the conserved miRNAs belonged to transcriptional regulation (81, 50%), followed by oxidoreductase activity (18, 11%), signal transduction (16, 10%) and development (15, 9%). The result showed that miRNA 156, 393, 396 and 397, with their respective targets, might play key roles in the grain filling rate by regulating maize growth, development and environment stress response. The result also offered novel insights into the dynamic change of miRNAs during the developing process of maize kernels and assisted in the understanding of how miRNAs are functioning about the grain filling rate.

  8. Population genomics of C. melanopterus using target gene capture data: demographic inferences and conservation perspectives.

    PubMed

    Maisano Delser, Pierpaolo; Corrigan, Shannon; Hale, Matthew; Li, Chenhong; Veuille, Michel; Planes, Serge; Naylor, Gavin; Mona, Stefano

    2016-09-21

    Population genetics studies on non-model organisms typically involve sampling few markers from multiple individuals. Next-generation sequencing approaches open up the possibility of sampling many more markers from fewer individuals to address the same questions. Here, we applied a target gene capture method to deep sequence ~1000 independent autosomal regions of a non-model organism, the blacktip reef shark (Carcharhinus melanopterus). We devised a sampling scheme based on the predictions of theoretical studies of metapopulations to show that sampling few individuals, but many loci, can be extremely informative to reconstruct the evolutionary history of species. We collected data from a single deme (SID) from Northern Australia and from a scattered sampling representing various locations throughout the Indian Ocean (SCD). We explored the genealogical signature of population dynamics detected from both sampling schemes using an ABC algorithm. We then contrasted these results with those obtained by fitting the data to a non-equilibrium finite island model. Both approaches supported an Nm value ~40, consistent with philopatry in this species. Finally, we demonstrate through simulation that metapopulations exhibit greater resilience to recent changes in effective size compared to unstructured populations. We propose an empirical approach to detect recent bottlenecks based on our sampling scheme.

  9. Prognostic value of deep sequencing method for minimal residual disease detection in multiple myeloma

    PubMed Central

    Lahuerta, Juan J.; Pepin, François; González, Marcos; Barrio, Santiago; Ayala, Rosa; Puig, Noemí; Montalban, María A.; Paiva, Bruno; Weng, Li; Jiménez, Cristina; Sopena, María; Moorhead, Martin; Cedena, Teresa; Rapado, Immaculada; Mateos, María Victoria; Rosiñol, Laura; Oriol, Albert; Blanchard, María J.; Martínez, Rafael; Bladé, Joan; San Miguel, Jesús; Faham, Malek; García-Sanz, Ramón

    2014-01-01

    We assessed the prognostic value of minimal residual disease (MRD) detection in multiple myeloma (MM) patients using a sequencing-based platform in bone marrow samples from 133 MM patients in at least very good partial response (VGPR) after front-line therapy. Deep sequencing was carried out in patients in whom a high-frequency myeloma clone was identified and MRD was assessed using the IGH-VDJH, IGH-DJH, and IGK assays. The results were contrasted with those of multiparametric flow cytometry (MFC) and allele-specific oligonucleotide polymerase chain reaction (ASO-PCR). The applicability of deep sequencing was 91%. Concordance between sequencing and MFC and ASO-PCR was 83% and 85%, respectively. Patients who were MRD– by sequencing had a significantly longer time to tumor progression (TTP) (median 80 vs 31 months; P < .0001) and overall survival (median not reached vs 81 months; P = .02), compared with patients who were MRD+. When stratifying patients by different levels of MRD, the respective TTP medians were: MRD ≥10⁻³, 27 months; MRD 10⁻³ to 10⁻⁵, 48 months; and MRD <10⁻⁵, 80 months (P = .003 to .0001). Ninety-two percent of VGPR patients were MRD+. In complete response patients, the TTP remained significantly longer for MRD– compared with MRD+ patients (131 vs 35 months; P = .0009). PMID:24646471
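
    The three MRD strata used above (at or above one clonal sequence per thousand, between one per thousand and one per hundred thousand, and below one per hundred thousand) amount to a simple binning rule. A minimal sketch follows; the stratum labels and the assignment of a value exactly at the lower boundary to the middle stratum are assumptions, as the abstract does not spell out boundary handling.

```python
def mrd_stratum(level):
    """Assign an MRD level (fraction of clonal sequences) to a stratum."""
    if level < 0:
        raise ValueError("MRD level cannot be negative")
    if level >= 1e-3:
        return "high"          # MRD at or above 10^-3
    if level >= 1e-5:
        return "intermediate"  # MRD from 10^-5 up to, not including, 10^-3
    return "low"               # MRD below 10^-5


if __name__ == "__main__":
    for value in (5e-3, 1e-4, 1e-6):
        print(value, mrd_stratum(value))
```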

  10. Deep sequencing and flow cytometric characterization of expanded effector memory CD8+CD57+ T cells frequently reveals T-cell receptor Vβ oligoclonality and CDR3 homology in acquired aplastic anemia.

    PubMed

    Giudice, Valentina; Feng, Xingmin; Lin, Zenghua; Hu, Wei; Zhang, Fanmao; Qiao, Wangmin; Ibanez, Maria Del Pilar Fernandez; Rios, Olga; Young, Neal S

    2018-05-01

    Oligoclonal expansion of CD8+CD28- lymphocytes has been considered indirect evidence for a pathogenic immune response in acquired aplastic anemia. A subset of CD8+CD28- cells with CD57 expression, termed effector memory cells, is expanded in several immune-mediated diseases and may have a role in immune surveillance. We hypothesized that effector memory CD8+CD28-CD57+ cells may drive aberrant oligoclonal expansion in aplastic anemia. We found CD8+CD57+ cells frequently expanded in the blood of aplastic anemia patients, with oligoclonal characteristics by flow cytometric Vβ usage analysis: skewing in 1-5 Vβ families and frequencies of immunodominant clones ranging from 1.98% to 66.5%. Oligoclonal characteristics were also observed in total CD8+ cells from aplastic anemia patients with CD8+CD57+ cell expansion by T-cell receptor deep sequencing, as well as the presence of 1-3 immunodominant clones. Oligoclonality was confirmed by T-cell receptor repertoire deep sequencing of enriched CD8+CD57+ cells, which also showed decreased diversity compared to total CD4+ and CD8+ cell pools. From analysis of complementarity-determining region 3 sequences in the CD8+ cell pool, a total of 29 sequences were shared between patients and controls, but these sequences were highly expressed in aplastic anemia subjects and also present in their immunodominant clones. In summary, expansion of effector memory CD8+ T cells is frequent in aplastic anemia and mirrors Vβ oligoclonal expansion. Flow cytometric Vβ usage analysis combined with deep sequencing technologies allows high resolution characterization of the T-cell receptor repertoire, and might represent a useful tool in the diagnosis and periodic evaluation of aplastic anemia patients. (Registered at clinicaltrials.gov identifiers: 00001620, 01623167, 00001397, 00071045, 00081523, 00961064). Copyright © 2018 Ferrata Storti Foundation.
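
    Clone frequencies and repertoire diversity of the kind discussed above can be derived from a list of CDR3 sequences, one per read. The sketch below uses the Shannon index as the diversity measure, which is an assumption since the abstract does not name the metric used; the function names and the example sequences are made up for illustration.

```python
import math
from collections import Counter


def clone_frequencies(cdr3_reads):
    """Map each distinct CDR3 sequence to its fraction of all reads."""
    counts = Counter(cdr3_reads)
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}


def shannon_diversity(frequencies):
    """Shannon index (natural log) of a clone-frequency mapping."""
    return -sum(p * math.log(p) for p in frequencies.values() if p > 0)


if __name__ == "__main__":
    # Hypothetical reads: one dominant clone and one minor clone.
    reads = ["CASSLGQ"] * 3 + ["CASSPDR"]
    freqs = clone_frequencies(reads)
    dominant = max(freqs, key=freqs.get)
    print(dominant, freqs[dominant], shannon_diversity(freqs))
```

    A repertoire dominated by a few clones gives a low index, while an even repertoire gives a high one, which is the sense in which oligoclonal samples show decreased diversity.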

  11. Complete genome sequence of a tomato infecting tomato mottle mosaic virus in New York

    USDA-ARS's Scientific Manuscript database

    Complete genome sequence of an emerging isolate of tomato mottle mosaic virus (ToMMV) infecting experimental Nicotiana benthamiana plants in upstate New York was obtained using small RNA deep sequencing. ToMMV_NY-13 shared 99% sequence identity to ToMMV isolates from Mexico and Florida. Broader d...

  12. Method to amplify variable sequences without imposing primer sequences

    DOEpatents

    Bradbury, Andrew M.; Zeytun, Ahmet

    2006-11-14

    The present invention provides methods of amplifying target sequences without including regions flanking the target sequence in the amplified product or imposing amplification primer sequences on the amplified product. Also provided are methods of preparing a library from such amplified target sequences.

  13. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA

    PubMed Central

    Zill, Oliver A.; Sebisanovic, Dragan; Lopez, Rene; Blau, Sibel; Collisson, Eric A.; Divers, Stephen G.; Hoon, Dave S. B.; Kopetz, E. Scott; Lee, Jeeyun; Nikolinakos, Petros G.; Baca, Arthur M.; Kermani, Bahram G.; Eltoukhy, Helmy; Talasaz, AmirAli

    2015-01-01

    Next-generation sequencing of cell-free circulating solid tumor DNA addresses two challenges in contemporary cancer care. First, this method of massively parallel and deep sequencing enables assessment of a comprehensive panel of genomic targets from a single sample, and second, it obviates the need for repeat invasive tissue biopsies. Digital Sequencing™ is a novel method for high-quality sequencing of circulating tumor DNA simultaneously across a comprehensive panel of over 50 cancer-related genes with a simple blood test. Here we report the analytic and clinical validation of the gene panel. Analytic sensitivity down to 0.1% mutant allele fraction is demonstrated via serial dilution studies of known samples. Near-perfect analytic specificity (> 99.9999%) enables complete coverage of many genes without the false positives typically seen with traditional sequencing assays at mutant allele frequencies or fractions below 5%. We compared digital sequencing of plasma-derived cell-free DNA to tissue-based sequencing on 165 consecutive matched samples from five outside centers in patients with stage III-IV solid tumor cancers. Clinical sensitivity of plasma-derived NGS was 85.0%, comparable to 80.7% sensitivity for tissue. The assay success rate on 1,000 consecutive samples in clinical practice was 99.8%. Digital sequencing of plasma-derived DNA is indicated in advanced cancer patients to prevent repeated invasive biopsies when the initial biopsy is inadequate, unobtainable for genomic testing, or uninformative, or when the patient’s cancer has progressed despite treatment. Its clinical utility is derived from reduction in the costs, complications and delays associated with invasive tissue biopsies for genomic testing. PMID:26474073
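
    Mutant allele fraction, the quantity in which the sensitivity above is expressed, is the share of reads at a position that carry the variant. The sketch below is only arithmetic on read counts: the function names are hypothetical, the 0.001 cutoff mirrors the 0.1% figure quoted above, and a real assay would rely on an error model rather than a bare threshold.

```python
def mutant_allele_fraction(alt_reads, total_reads):
    """Fraction of reads at a position that support the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def meets_limit(alt_reads, total_reads, limit=0.001):
    """True when the observed fraction is at or above the stated limit."""
    return mutant_allele_fraction(alt_reads, total_reads) >= limit


if __name__ == "__main__":
    print(mutant_allele_fraction(5, 5000), meets_limit(5, 5000))
```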

  14. Middle East Respiratory Syndrome Coronavirus Intra-Host Populations Are Characterized by Numerous High Frequency Variants

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Borucki, Monica K.; Lao, Victoria; Hwang, Mona

    Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging human pathogen related to SARS virus. In vitro studies indicate this virus may have a broad host range suggesting an increased pandemic potential. Genetic and epidemiological evidence indicate camels serve as a reservoir for MERS virus but the mechanism of cross species transmission is unclear and many questions remain regarding the susceptibility of humans to infection. Deep sequencing data was obtained from the nasal samples of three camels that had been experimentally infected with a human MERS-CoV isolate. A majority of the genome was covered and average coverage was greater than 12,000x depth. Although only 5 mutations were detected in the consensus sequences, 473 intrahost single nucleotide variants were identified. Lastly, many of these variants were present at high frequencies and could potentially influence viral phenotype and the sensitivity of detection assays that target these regions for primer or probe binding.

  15. Jellyfish Bioactive Compounds: Methods for Wet-Lab Work

    PubMed Central

    Frazão, Bárbara; Antunes, Agostinho

    2016-01-01

    The study of bioactive compounds from marine animals has provided, over time, an endless source of interesting molecules. Jellyfish are commonly targets of study due to their toxic proteins. However, there is a gap in reviewing successful wet-lab methods employed in these animals, which compromises the fast progress in the detection of related biomolecules. Here, we provide a compilation of the most effective wet-lab methodologies for jellyfish venom extraction prior to proteomic analysis—separation, identification and toxicity assays. This includes SDS-PAGE, 2DE, gel chromatography, HPLC, DEAE, LC-MS, MALDI, Western blot, hemolytic assay, antimicrobial assay and protease activity assay. For a more comprehensive approach, jellyfish toxicity studies should further consider transcriptome sequencing. We reviewed such methodologies and other genomic techniques used prior to the deep sequencing of transcripts, including RNA extraction, construction of cDNA libraries and RACE. Overall, we provide an overview of the most promising methods and their successful implementation for optimizing time and effort when studying jellyfish. PMID:27077869

  16. Middle East Respiratory Syndrome Coronavirus Intra-Host Populations Are Characterized by Numerous High Frequency Variants

    DOE PAGES

    Borucki, Monica K.; Lao, Victoria; Hwang, Mona; ...

    2016-01-20

    Middle East respiratory syndrome coronavirus (MERS-CoV) is an emerging human pathogen related to SARS virus. In vitro studies indicate this virus may have a broad host range suggesting an increased pandemic potential. Genetic and epidemiological evidence indicate camels serve as a reservoir for MERS virus but the mechanism of cross species transmission is unclear and many questions remain regarding the susceptibility of humans to infection. Deep sequencing data was obtained from the nasal samples of three camels that had been experimentally infected with a human MERS-CoV isolate. A majority of the genome was covered and average coverage was greater than 12,000x depth. Although only 5 mutations were detected in the consensus sequences, 473 intrahost single nucleotide variants were identified. Lastly, many of these variants were present at high frequencies and could potentially influence viral phenotype and the sensitivity of detection assays that target these regions for primer or probe binding.

  17. Jellyfish Bioactive Compounds: Methods for Wet-Lab Work.

    PubMed

    Frazão, Bárbara; Antunes, Agostinho

    2016-04-12

    The study of bioactive compounds from marine animals has provided, over time, an endless source of interesting molecules. Jellyfish are commonly targets of study due to their toxic proteins. However, there is a gap in reviewing successful wet-lab methods employed in these animals, which compromises the fast progress in the detection of related biomolecules. Here, we provide a compilation of the most effective wet-lab methodologies for jellyfish venom extraction prior to proteomic analysis-separation, identification and toxicity assays. This includes SDS-PAGE, 2DE, gel chromatography, HPLC, DEAE, LC-MS, MALDI, Western blot, hemolytic assay, antimicrobial assay and protease activity assay. For a more comprehensive approach, jellyfish toxicity studies should further consider transcriptome sequencing. We reviewed such methodologies and other genomic techniques used prior to the deep sequencing of transcripts, including RNA extraction, construction of cDNA libraries and RACE. Overall, we provide an overview of the most promising methods and their successful implementation for optimizing time and effort when studying jellyfish.

  18. Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life

    PubMed Central

    Sheik, Cody S.; Reese, Brandi Kiel; Twing, Katrina I.; Sylvan, Jason B.; Grim, Sharon L.; Schrenk, Matthew O.; Sogin, Mitchell L.; Colwell, Frederick S.

    2018-01-01

    Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter. While the top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process. 
Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset were identified as contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be taken at every step of the collection and processing procedure when working with low biomass environments such as, but not limited to, portions of Earth’s deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination derived sequences must be taken prior to using this dataset. PMID:29780369

  19. Low-abundance HIV drug-resistant viral variants in treatment-experienced persons correlate with historical antiretroviral use.

    PubMed

    Le, Thuy; Chiarella, Jennifer; Simen, Birgitte B; Hanczaruk, Bozena; Egholm, Michael; Landry, Marie L; Dieckhaus, Kevin; Rosen, Marc I; Kozal, Michael J

    2009-06-29

    It is largely unknown how frequently low-abundance HIV drug-resistant variants at levels under limit of detection of conventional genotyping (<20% of quasi-species) are present in antiretroviral-experienced persons experiencing virologic failure. Further, the clinical implications of low-abundance drug-resistant variants at time of virologic failure are unknown. Plasma samples from 22 antiretroviral-experienced subjects collected at time of virologic failure (viral load 1380 to 304,000 copies/mL) were obtained from a specimen bank (from 2004-2007). The prevalence and profile of drug-resistant mutations were determined using Sanger sequencing and ultra-deep pyrosequencing. Genotypes were interpreted using Stanford HIV database algorithm. Antiretroviral treatment histories were obtained by chart review and correlated with drug-resistant mutations. Low-abundance drug-resistant mutations were detected in all 22 subjects by deep sequencing and only in 3 subjects by Sanger sequencing. In total they accounted for 90 of 247 mutations (36%) detected by deep sequencing; the majority of these (95%) were not detected by standard genotyping. A mean of 4 additional mutations per subject were detected by deep sequencing (p<0.0001, 95%CI: 2.85-5.53). The additional low-abundance drug-resistant mutations increased a subject's genotypic resistance to one or more antiretrovirals in 17 of 22 subjects (77%). When correlated with subjects' antiretroviral treatment histories, the additional low-abundance drug-resistant mutations correlated with the failing antiretroviral drugs in 21% subjects and correlated with historical antiretroviral use in 79% subjects (OR, 13.73; 95% CI, 2.5-74.3, p = 0.0016). Low-abundance HIV drug-resistant mutations in antiretroviral-experienced subjects at time of virologic failure can increase a subject's overall burden of resistance, yet commonly go unrecognized by conventional genotyping. 
The majority of unrecognized resistant mutations correlate with historical antiretroviral use. Ultra-deep sequencing can provide important historical resistance information for clinicians when planning subsequent antiretroviral regimens for highly treatment-experienced patients, particularly when their prior treatment histories and longitudinal genotypes are not available.

  20. Low-Abundance HIV Drug-Resistant Viral Variants in Treatment-Experienced Persons Correlate with Historical Antiretroviral Use

    PubMed Central

    Le, Thuy; Chiarella, Jennifer; Simen, Birgitte B.; Hanczaruk, Bozena; Egholm, Michael; Landry, Marie L.; Dieckhaus, Kevin; Rosen, Marc I.; Kozal, Michael J.

    2009-01-01

    Background It is largely unknown how frequently low-abundance HIV drug-resistant variants at levels under limit of detection of conventional genotyping (<20% of quasi-species) are present in antiretroviral-experienced persons experiencing virologic failure. Further, the clinical implications of low-abundance drug-resistant variants at time of virologic failure are unknown. Methodology/Principal Findings Plasma samples from 22 antiretroviral-experienced subjects collected at time of virologic failure (viral load 1380 to 304,000 copies/mL) were obtained from a specimen bank (from 2004–2007). The prevalence and profile of drug-resistant mutations were determined using Sanger sequencing and ultra-deep pyrosequencing. Genotypes were interpreted using Stanford HIV database algorithm. Antiretroviral treatment histories were obtained by chart review and correlated with drug-resistant mutations. Low-abundance drug-resistant mutations were detected in all 22 subjects by deep sequencing and only in 3 subjects by Sanger sequencing. In total they accounted for 90 of 247 mutations (36%) detected by deep sequencing; the majority of these (95%) were not detected by standard genotyping. A mean of 4 additional mutations per subject were detected by deep sequencing (p<0.0001, 95%CI: 2.85–5.53). The additional low-abundance drug-resistant mutations increased a subject's genotypic resistance to one or more antiretrovirals in 17 of 22 subjects (77%). When correlated with subjects' antiretroviral treatment histories, the additional low-abundance drug-resistant mutations correlated with the failing antiretroviral drugs in 21% subjects and correlated with historical antiretroviral use in 79% subjects (OR, 13.73; 95% CI, 2.5–74.3, p = 0.0016). Conclusions/Significance Low-abundance HIV drug-resistant mutations in antiretroviral-experienced subjects at time of virologic failure can increase a subject's overall burden of resistance, yet commonly go unrecognized by conventional genotyping. 
The majority of unrecognized resistant mutations correlate with historical antiretroviral use. Ultra-deep sequencing can provide important historical resistance information for clinicians when planning subsequent antiretroviral regimens for highly treatment-experienced patients, particularly when their prior treatment histories and longitudinal genotypes are not available. PMID:19562031

  1. Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life.

    PubMed

    Sheik, Cody S; Reese, Brandi Kiel; Twing, Katrina I; Sylvan, Jason B; Grim, Sharon L; Schrenk, Matthew O; Sogin, Mitchell L; Colwell, Frederick S

    2018-01-01

    Earth's subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter. While the top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process. 
Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset were identified as contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be taken at every step of the collection and processing procedure when working with low biomass environments such as, but not limited to, portions of Earth's deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination derived sequences must be taken prior to using this dataset.

  2. Deep, noninvasive imaging and surgical guidance of submillimeter tumors using targeted M13-stabilized single-walled carbon nanotubes

    PubMed Central

    Ghosh, Debadyuti; Bagley, Alexander F.; Na, Young Jeong; Birrer, Michael J.; Bhatia, Sangeeta N.; Belcher, Angela M.

    2014-01-01

    Highly sensitive detection of small, deep tumors for early diagnosis and surgical interventions remains a challenge for conventional imaging modalities. Second-window near-infrared light (NIR2, 950–1,400 nm) is promising for in vivo fluorescence imaging due to deep tissue penetration and low tissue autofluorescence. With their intrinsic fluorescence in the NIR2 regime and lack of photobleaching, single-walled carbon nanotubes (SWNTs) are potentially attractive contrast agents to detect tumors. Here, targeted M13 virus-stabilized SWNTs are used to visualize deep, disseminated tumors in vivo. This targeted nanoprobe, which uses M13 to stably display both tumor-targeting peptides and an SWNT imaging probe, demonstrates excellent tumor-to-background uptake and exhibits higher signal-to-noise performance compared with visible and near-infrared (NIR1) dyes for delineating tumor nodules. Detection and excision of tumors by a gynecological surgeon improved with SWNT image guidance and led to the identification of submillimeter tumors. Collectively, these findings demonstrate the promise of targeted SWNT nanoprobes for noninvasive disease monitoring and guided surgery. PMID:25214538

  3. Deep, noninvasive imaging and surgical guidance of submillimeter tumors using targeted M13-stabilized single-walled carbon nanotubes.

    PubMed

    Ghosh, Debadyuti; Bagley, Alexander F; Na, Young Jeong; Birrer, Michael J; Bhatia, Sangeeta N; Belcher, Angela M

    2014-09-23

    Highly sensitive detection of small, deep tumors for early diagnosis and surgical interventions remains a challenge for conventional imaging modalities. Second-window near-infrared light (NIR2, 950-1,400 nm) is promising for in vivo fluorescence imaging due to deep tissue penetration and low tissue autofluorescence. With their intrinsic fluorescence in the NIR2 regime and lack of photobleaching, single-walled carbon nanotubes (SWNTs) are potentially attractive contrast agents to detect tumors. Here, targeted M13 virus-stabilized SWNTs are used to visualize deep, disseminated tumors in vivo. This targeted nanoprobe, which uses M13 to stably display both tumor-targeting peptides and an SWNT imaging probe, demonstrates excellent tumor-to-background uptake and exhibits higher signal-to-noise performance compared with visible and near-infrared (NIR1) dyes for delineating tumor nodules. Detection and excision of tumors by a gynecological surgeon improved with SWNT image guidance and led to the identification of submillimeter tumors. Collectively, these findings demonstrate the promise of targeted SWNT nanoprobes for noninvasive disease monitoring and guided surgery.

  4. Graphical classification of DNA sequences of HLA alleles by deep learning.

    PubMed

    Miyake, Jun; Kaneshita, Yuhei; Asatani, Satoshi; Tagawa, Seiichi; Niioka, Hirohiko; Hirano, Takashi

    2018-04-01

    Alleles of human leukocyte antigen (HLA)-A DNAs are classified and expressed graphically by using artificial intelligence "Deep Learning (Stacked autoencoder)". Nucleotide sequence data corresponding to the length of 822 bp, collected from the Immuno Polymorphism Database, were compressed to 2-dimensional representation and were plotted. Profiles of the two-dimensional plots indicate that the alleles can be classified as clusters are formed. The two-dimensional plot of HLA-A DNAs gives a clear outlook for characterizing the various alleles.

  5. Enhanced arbovirus surveillance with deep sequencing: Identification of novel rhabdoviruses and bunyaviruses in Australian mosquitoes.

    PubMed

    Coffey, Lark L; Page, Brady L; Greninger, Alexander L; Herring, Belinda L; Russell, Richard C; Doggett, Stephen L; Haniotis, John; Wang, Chunlin; Deng, Xutao; Delwart, Eric L

    2014-01-05

    Viral metagenomics characterizes known and identifies unknown viruses based on sequence similarities to any previously sequenced viral genomes. A metagenomics approach was used to identify virus sequences in Australian mosquitoes causing cytopathic effects in inoculated mammalian cell cultures. Sequence comparisons revealed strains of Liao Ning virus (Reovirus, Seadornavirus), previously detected only in China, livestock-infecting Stretch Lagoon virus (Reovirus, Orbivirus), two novel dimarhabdoviruses, named Beaumont and North Creek viruses, and two novel orthobunyaviruses, named Murrumbidgee and Salt Ash viruses. The novel virus proteomes diverged by ≥ 50% relative to their closest previously genetically characterized viral relatives. Deep sequencing also generated genomes of Warrego and Wallal viruses, orbiviruses linked to kangaroo blindness, whose genomes had not been fully characterized. This study highlights viral metagenomics in concert with traditional arbovirus surveillance to characterize known and new arboviruses in field-collected mosquitoes. Follow-up epidemiological studies are required to determine whether the novel viruses infect humans. © 2013 Elsevier Inc. All rights reserved.

  6. 3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

    PubMed

    Goldfarb, Katherine C; Cech, Thomas R

    2013-09-21

    Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.

  7. Multiplicity and molecular epidemiology of Plasmodium vivax and Plasmodium falciparum infections in East Africa.

    PubMed

    Zhong, Daibin; Lo, Eugenia; Wang, Xiaoming; Yewhalaw, Delenasaw; Zhou, Guofa; Atieli, Harrysone E; Githeko, Andrew; Hemming-Schroeder, Elizabeth; Lee, Ming-Chieh; Afrane, Yaw; Yan, Guiyun

    2018-05-02

    Parasite genetic diversity and multiplicity of infection (MOI) affect clinical outcomes, response to drug treatment and naturally-acquired or vaccine-induced immunity. Traditional methods often underestimate the frequency and diversity of multiclonal infections due to technical sensitivity and specificity. Next-generation sequencing techniques provide a novel opportunity to study complexity of parasite populations and molecular epidemiology. Symptomatic and asymptomatic Plasmodium vivax samples were collected from health centres/hospitals and schools, respectively, from 2011 to 2015 in Ethiopia. Similarly, both symptomatic and asymptomatic Plasmodium falciparum samples were collected, respectively, from hospitals and schools in 2005 and 2015 in Kenya. Finger-pricked blood samples were collected and dried on filter paper. Long amplicon (> 400 bp) deep sequencing of merozoite surface protein 1 (msp1) gene was conducted to determine multiplicity and molecular epidemiology of P. vivax and P. falciparum infections. The results were compared with those based on short amplicon (117 bp) deep sequencing. A total of 139 P. vivax and 222 P. falciparum samples were pyro-sequenced for pvmsp1 and pfmsp1, yielding a total of 21 P. vivax and 99 P. falciparum predominant haplotypes. The average MOI for P. vivax and P. falciparum were 2.16 and 2.68, respectively, which were significantly higher than that of microsatellite markers and short amplicon (117 bp) deep sequencing. Multiclonal infections were detected in 62.2% of the samples for P. vivax and 74.8% of the samples for P. falciparum. Four out of the five subjects with recurrent P. vivax malaria were found to be a relapse 44-65 days after clearance of parasites. No difference was observed in MOI among P. vivax patients of different symptoms, ages and genders. Similar patterns were also observed in P. falciparum except for one study site in Kenyan lowland areas with significantly higher MOI. 
The study used a novel method to evaluate Plasmodium MOI and molecular epidemiological patterns by long amplicon ultra-deep sequencing. The complexity of infections was similar among age groups, symptoms, genders, transmission settings (spatial heterogeneity), as well as over years (pre- vs. post-scale-up interventions). This study demonstrated that long amplicon deep sequencing is a useful tool to investigate multiplicity and molecular epidemiology of Plasmodium parasite infections.
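
    As a rough illustration of the two summary statistics quoted in this abstract (average MOI and the share of multiclonal infections), the Python sketch below counts distinct haplotypes per sample. It is not the authors' pipeline: the sample names, haplotype labels, read counts and the 1% minimum-frequency cutoff are all hypothetical values chosen for the example.

```python
from statistics import mean

# Hypothetical per-sample haplotype read counts (illustrative only,
# not data from the study).
SAMPLES = {
    "sample_A": {"hap1": 900, "hap2": 100},
    "sample_B": {"hap1": 1000},
    "sample_C": {"hap1": 500, "hap3": 300, "hap4": 200, "hap5": 2},
}


def moi(hap_counts, min_freq=0.01):
    """Count haplotypes whose within-sample read frequency is >= min_freq.

    min_freq is an assumed noise cutoff, not a value from the abstract.
    """
    total = sum(hap_counts.values())
    if total == 0:
        return 0
    return sum(1 for n in hap_counts.values() if n / total >= min_freq)


def summarize(samples, min_freq=0.01):
    """Return (mean MOI, fraction of samples with MOI > 1)."""
    mois = [moi(counts, min_freq) for counts in samples.values()]
    multiclonal = sum(1 for m in mois if m > 1) / len(mois)
    return mean(mois), multiclonal


if __name__ == "__main__":
    avg, frac = summarize(SAMPLES)
    print(f"mean MOI = {avg:.2f}, multiclonal = {frac:.1%}")
```

    A sample counts as multiclonal here when more than one haplotype passes the cutoff, so the result depends directly on the threshold chosen to separate real minority haplotypes from sequencing noise.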

  8. In-depth comparison of somatic point mutation callers based on different tumor next-generation sequencing depth data

    NASA Astrophysics Data System (ADS)

    Cai, Lei; Yuan, Wei; Zhang, Zhou; He, Lin; Chou, Kuo-Chen

    2016-11-01

    Four popular somatic single nucleotide variant (SNV) calling methods (Varscan, SomaticSniper, Strelka and MuTect2) were carefully evaluated on the real whole exome sequencing (WES, depth of ~50X) and ultra-deep targeted sequencing (UDT-Seq, depth of ~370X) data. The four tools returned poor consensus on candidates (only 20% of calls were with multiple hits by the callers). For both WES and UDT-Seq, MuTect2 and Strelka obtained the largest proportion of COSMIC entries as well as the lowest rate of dbSNP presence and high-alternative-alleles-in-control calls, demonstrating their superior sensitivity and accuracy. Combining different callers does increase reliability of candidates, but narrows the list down to very limited range of tumor read depth and variant allele frequency. Calling SNV on UDT-Seq data, which were of much higher read-depth, discovered additional true-positive variations, despite an even more tremendous growth in false positive predictions. Our findings not only provide valuable benchmark for state-of-the-art SNV calling methods, but also shed light on the access to more accurate SNV identification in the future.

  9. omiRas: a Web server for differential expression analysis of miRNAs derived from small RNA-Seq data.

    PubMed

    Müller, Sören; Rycak, Lukas; Winter, Peter; Kahl, Günter; Koch, Ina; Rotter, Björn

    2013-10-15

    Small RNA deep sequencing is widely used to characterize non-coding RNAs (ncRNAs) differentially expressed between two conditions, e.g. healthy and diseased individuals and to reveal insights into molecular mechanisms underlying condition-specific phenotypic traits. The ncRNAome is composed of a multitude of RNAs, such as transfer RNA, small nucleolar RNA and microRNA (miRNA), to name a few. Here we present omiRas, a Web server for the annotation, comparison and visualization of interaction networks of ncRNAs derived from next-generation sequencing experiments of two different conditions. The Web tool allows the user to submit raw sequencing data and results are presented as: (i) static annotation results including length distribution, mapping statistics, alignments and quantification tables for each library as well as lists of differentially expressed ncRNAs between conditions and (ii) an interactive network visualization of user-selected miRNAs and their target genes based on the combination of several miRNA-mRNA interaction databases. The omiRas Web server is implemented in Python, PostgreSQL, R and can be accessed at: http://tools.genxpro.net/omiras/.

  10. A comprehensive framework for functional diversity patterns of marine chromophytic phytoplankton using rbcL phylogeny

    PubMed Central

    Samanta, Brajogopal; Bhadury, Punyasloke

    2016-01-01

    Marine chromophytes are a taxonomically diverse group of algae and contribute approximately half of the total oceanic primary production. To understand the global patterns of functional diversity of chromophytic phytoplankton, robust bioinformatics and statistical analyses including deep phylogeny based on 2476 form ID rbcL gene sequences representing seven ecologically significant oceanographic ecoregions were undertaken. In addition, 12 form ID rbcL clone libraries were generated and analyzed (148 sequences) from Sundarbans Biosphere Reserve representing the world’s largest mangrove ecosystem as part of this study. Global phylogenetic analyses recovered 11 major clades of chromophytic phytoplankton in varying proportions with several novel rbcL sequences in each of the seven targeted ecoregions. The majority of OTUs were found to be exclusive to each ecoregion, whereas some were shared by two or more ecoregions based on beta-diversity analysis. The present phylogenetic and bioinformatics analyses provide strong statistical support for the hypothesis that different oceanographic regimes harbor distinct and coherent groups of chromophytic phytoplankton. It has also been shown as part of this study that varying natural selection pressure on form ID rbcL gene under different environmental conditions could lead to functional differences and overall fitness of chromophytic phytoplankton populations. PMID:26861415

  11. Small RNA deep sequencing identifies novel and salt-stress-regulated microRNAs from roots of Medicago sativa and Medicago truncatula.

    PubMed

    Long, Rui-Cai; Li, Ming-Na; Kang, Jun-Mei; Zhang, Tie-Jun; Sun, Yan; Yang, Qing-Chuan

    2015-05-01

    Small 21- to 24-nucleotide (nt) ribonucleic acids (RNAs), notably the microRNA (miRNA), are emerging as a posttranscriptional regulation mechanism. Salt stress is one of the primary abiotic stresses that cause the crop losses worldwide. In saline lands, root growth and function of plant are determined by the action of environmental salt stress through specific genes that adapt root development to the restrictive condition. To elucidate the role of miRNAs in salt stress regulation in Medicago, we used a high-throughput sequencing approach to analyze four small RNA libraries from roots of Zhongmu-1 (Medicago sativa) and Jemalong A17 (Medicago truncatula), which were treated with 300 mM NaCl for 0 and 8 h. Each library generated about 20 million short sequences and contained predominantly small RNAs of 24-nt length, followed by 21-nt and 22-nt small RNAs. Using sequence analysis, we identified 385 conserved miRNAs from 96 families, along with 68 novel candidate miRNAs. Of all the 68 predicted novel miRNAs, 15 miRNAs were identified to have miRNA*. Statistical analysis on abundance of sequencing read revealed specific miRNA showing contrasting expression patterns between M. sativa and M. truncatula roots, as well as between roots treated for 0 and 8 h. The expression of 10 conserved and novel miRNAs was also quantified by quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR). The miRNA precursor and target genes were predicted by bioinformatics analysis. We concluded that the salt stress related conserved and novel miRNAs may have a large variety of target mRNAs, some of which might play key roles in salt stress regulation of Medicago. © 2014 Scandinavian Plant Physiology Society.

  12. Contribution of crenarchaeal autotrophic ammonia oxidizers to the dark primary production in Tyrrhenian deep waters (Central Mediterranean Sea)

    PubMed Central

    Yakimov, Michail M; Cono, Violetta La; Smedile, Francesco; DeLuca, Thomas H; Juárez, Silvia; Ciordia, Sergio; Fernández, Marisol; Albar, Juan Pablo; Ferrer, Manuel; Golyshin, Peter N; Giuliano, Laura

    2011-01-01

    Mesophilic Crenarchaeota have recently been thought to be significant contributors to nitrogen (N) and carbon (C) cycling. In this study, we examined the vertical distribution of ammonia-oxidizing Crenarchaeota at an offshore site in the Southern Tyrrhenian Sea. The median value of the crenarchaeal cell to amoA gene ratio was close to one suggesting that virtually all deep-sea Crenarchaeota possess the capacity to oxidize ammonia. Crenarchaea-specific genes, nirK and ureC, for nitrite reductase and urease were identified and their affiliation demonstrated the presence of 'deep-sea' clades distinct from 'shallow' representatives. Measured deep-sea dark CO2 fixation estimates were comparable to the median value of photosynthetic biomass production calculated for this area of the Tyrrhenian Sea, pointing to the significance of this process in the C cycle of aphotic marine ecosystems. To elucidate the pivotal organisms in this process, we targeted known marine crenarchaeal autotrophy-related genes, coding for acetyl-CoA carboxylase (accA) and 4-hydroxybutyryl-CoA dehydratase (4-hbd). As in the case of nirK and ureC, these genes are grouped with deep-sea sequences being distantly related to those retrieved from the epipelagic zone. To pair the molecular data with specific functional attributes we performed [14C]HCO3 incorporation experiments followed by analyses of radiolabeled proteins using a shotgun proteomics approach. More than 100 oligopeptides were attributed to 40 marine crenarchaeal-specific proteins that are involved in 10 different metabolic processes, including autotrophy. The results obtained provided clear proof of the chemolithoautotrophic physiology of bathypelagic crenarchaeota and indicated that this numerically predominant group of microorganisms facilitates a hitherto unrecognized sink for inorganic C of global importance. PMID:21209665

  13. Cultivation and diversity of fungi buried in the Baltic Sea sediments

    NASA Astrophysics Data System (ADS)

    Xiao, N.

    2015-12-01

    Molecular biological and cultivation studies have been carried out on the prokaryotic microbial community in the deep biosphere. Compared to the prokaryotic community, few attempts have been made for the eukaryotic microbial community. Here we report a study on fungi buried in deep-subsurface sediments using both cultivation and a molecular diversity survey. Cultivation targeting fungi was carried out using sequential sediment samples obtained from the Baltic Sea, Landsort Deep site, during IODP Expedition 347. Six culture media with different nutrition and salt concentrations were tried for fungal cultivation. Fifty fungal isolates were obtained from the sediment samples. The surface sediments were rich in fungal strains, but the deep sediments were not. Internal Transcribed Spacer (ITS) regions of RNA genes were amplified for the identification of the isolates. The isolates were classified into 11 different genera. Pseudeurotium bakeri was the dominant strain throughout the glacial and interglacial sediments. We also found different representative fungal strains in glacial and interglacial sediments, suggesting that the cultivated strains were buried from different sources. The survey of fungal diversity was done by sequencing the 18S RNA genes in the total DNA extracted from selected sediment samples. The fungal community formed different clusters in the glacial and interglacial sediments. Our results revealed the presence and activity of fungi in the deep biosphere of the Baltic Sea and provided evidence of a fungal community response to climate change.

  14. Toward a real-time system for temporal enhanced ultrasound-guided prostate biopsy.

    PubMed

    Azizi, Shekoofeh; Van Woudenberg, Nathan; Sojoudi, Samira; Li, Ming; Xu, Sheng; Abu Anas, Emran M; Yan, Pingkun; Tahmasebi, Amir; Kwak, Jin Tae; Turkbey, Baris; Choyke, Peter; Pinto, Peter; Wood, Bradford; Mousavi, Parvin; Abolmaesumi, Purang

    2018-03-27

    We have previously proposed temporal enhanced ultrasound (TeUS) as a new paradigm for tissue characterization. TeUS is based on analyzing a sequence of ultrasound data with deep learning and has been demonstrated to be successful for detection of cancer in ultrasound-guided prostate biopsy. Our aim is to enable the dissemination of this technology to the community for large-scale clinical validation. In this paper, we present a unified software framework demonstrating near-real-time analysis of ultrasound data stream using a deep learning solution. The system integrates ultrasound imaging hardware, visualization and a deep learning back-end to build an accessible, flexible and robust platform. A client-server approach is used in order to run computationally expensive algorithms in parallel. We demonstrate the efficacy of the framework using two applications as case studies. First, we show that prostate cancer detection using near-real-time analysis of RF and B-mode TeUS data and deep learning is feasible. Second, we present real-time segmentation of ultrasound prostate data using an integrated deep learning solution. The system is evaluated for cancer detection accuracy on ultrasound data obtained from a large clinical study with 255 biopsy cores from 157 subjects. It is further assessed with an independent dataset with 21 biopsy targets from six subjects. In the first study, we achieve area under the curve, sensitivity, specificity and accuracy of 0.94, 0.77, 0.94 and 0.92, respectively, for the detection of prostate cancer. In the second study, we achieve an AUC of 0.85. Our results suggest that TeUS-guided biopsy can be potentially effective for the detection of prostate cancer.
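
    The abstract above reports area under the ROC curve (AUC) as its headline metric. As a minimal sketch of what that number measures, the Python snippet below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the TeUS study or its software.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 ground-truth values (hypothetical biopsy outcomes).
    scores: iterable of classifier scores, higher meaning "more likely cancer".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairwise wins; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

    The pairwise form is quadratic in the number of cases, which is acceptable for a few hundred biopsy cores; a rank-based formulation would be the usual choice for larger datasets.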

  15. Contribution of crenarchaeal autotrophic ammonia oxidizers to the dark primary production in Tyrrhenian deep waters (Central Mediterranean Sea).

    PubMed

    Yakimov, Michail M; Cono, Violetta La; Smedile, Francesco; DeLuca, Thomas H; Juárez, Silvia; Ciordia, Sergio; Fernández, Marisol; Albar, Juan Pablo; Ferrer, Manuel; Golyshin, Peter N; Giuliano, Laura

    2011-06-01

    Mesophilic Crenarchaeota have recently been thought to be significant contributors to nitrogen (N) and carbon (C) cycling. In this study, we examined the vertical distribution of ammonia-oxidizing Crenarchaeota at an offshore site in the Southern Tyrrhenian Sea. The median value of the crenarchaeal cell to amoA gene ratio was close to one, suggesting that virtually all deep-sea Crenarchaeota possess the capacity to oxidize ammonia. Crenarchaea-specific genes, nirK and ureC, for nitrite reductase and urease were identified and their affiliation demonstrated the presence of 'deep-sea' clades distinct from 'shallow' representatives. Measured deep-sea dark CO2 fixation estimates were comparable to the median value of photosynthetic biomass production calculated for this area of the Tyrrhenian Sea, pointing to the significance of this process in the C cycle of aphotic marine ecosystems. To elucidate the pivotal organisms in this process, we targeted known marine crenarchaeal autotrophy-related genes, coding for acetyl-CoA carboxylase (accA) and 4-hydroxybutyryl-CoA dehydratase (4-hbd). As in the case of nirK and ureC, these genes grouped with deep-sea sequences and were distantly related to those retrieved from the epipelagic zone. To pair the molecular data with specific functional attributes, we performed [14C]HCO3 incorporation experiments followed by analyses of radiolabeled proteins using a shotgun proteomics approach. More than 100 oligopeptides were attributed to 40 marine crenarchaeal-specific proteins that are involved in 10 different metabolic processes, including autotrophy. The results provided clear proof of the chemolithoautotrophic physiology of bathypelagic Crenarchaeota and indicated that this numerically predominant group of microorganisms facilitates a hitherto unrecognized sink for inorganic C of global importance.

  16. Comparison of Travel-Time and Amplitude Measurements for Deep-Focusing Time-Distance Helioseismology

    NASA Astrophysics Data System (ADS)

    Pourabdian, Majid; Fournier, Damien; Gizon, Laurent

    2018-04-01

    The purpose of deep-focusing time-distance helioseismology is to construct seismic measurements that have a high sensitivity to the physical conditions at a desired target point in the solar interior. With this technique, pairs of points on the solar surface are chosen such that acoustic ray paths intersect at this target (focus) point. Considering acoustic waves in a homogeneous medium, we compare travel-time and amplitude measurements extracted from the deep-focusing cross-covariance functions. Using a single-scattering approximation, we find that the spatial sensitivity of deep-focusing travel times to sound-speed perturbations is zero at the target location and maximum in a surrounding shell. This is unlike the deep-focusing amplitude measurements, which have maximum sensitivity at the target point. We compare the signal-to-noise ratio for travel-time and amplitude measurements for different types of sound-speed perturbations, under the assumption that noise is solely due to the random excitation of the waves. We find that, for highly localized perturbations in sound speed, the signal-to-noise ratio is higher for amplitude measurements than for travel-time measurements. We conclude that amplitude measurements are a useful complement to travel-time measurements in time-distance helioseismology.

  17. Dosimetric comparison of moderate deep inspiration breath-hold and free-breathing intensity-modulated radiotherapy for left-sided breast cancer.

    PubMed

    Chi, F; Wu, S; Zhou, J; Li, F; Sun, J; Lin, Q; Lin, H; Guan, X; He, Z

    2015-05-01

    This study compared the dosimetry of moderate deep inspiration breath-hold using active breathing control with that of free-breathing intensity-modulated radiotherapy (IMRT) after breast-conserving surgery for left-sided breast cancer. Thirty-one patients were enrolled. One free-breathing and two moderate deep inspiration breath-hold images were obtained. A field-in-field-IMRT free-breathing plan and two field-in-field-IMRT moderate deep inspiration breath-holding plans were compared for each patient in terms of target volume coverage of the glandular breast tissue and dose to organs at risk. The breath-holding time under moderate deep inspiration was significantly extended after breathing training (P<0.05). There was no significant difference between free-breathing and moderate deep inspiration breath-holding in the target volume coverage. The volume of the ipsilateral lung in the free-breathing technique was significantly smaller than in the moderate deep inspiration breath-holding techniques (P<0.05); however, there was no significant difference between the two moderate deep inspiration breath-holding plans. There were no significant differences in target volume coverage between the three plans for the field-in-field-IMRT (all P>0.05). The dose to ipsilateral lung, coronary artery and heart in the field-in-field-IMRT were significantly lower for the free-breathing plan than for the two moderate deep inspiration breath-holding plans (all P<0.05); however, there was no significant difference between the two moderate deep inspiration breath-holding plans. The whole-breast field-in-field-IMRT under moderate deep inspiration breath-hold with active breathing control after breast-conserving surgery in left-sided breast cancer can reduce the irradiation volume and dose to organs at risk. There are no significant differences between various moderate deep inspiration breath-holding states in the dosimetry of irradiation to the field-in-field-IMRT target volume coverage and organs at risk. Copyright © 2015 Société française de radiothérapie oncologique (SFRO). Published by Elsevier SAS. All rights reserved.

  18. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.

  19. Complete genome sequence of a novel genotype of squash mosaic virus

    USDA-ARS's Scientific Manuscript database

    Complete genome sequence of a novel genotype of Squash mosaic virus (SqMV) infecting squash plants in Spain was obtained using deep sequencing of small ribonucleic acids and assembly. The low nucleotide sequence identities, with 87-88% on RNA1 and 84-86% on RNA2 to known SqMV isolates, suggest a new...

  20. First complete genome sequence of an emerging cucumber green mottle mosaic virus isolate in North America

    USDA-ARS's Scientific Manuscript database

    The complete genome sequence (6,423 nt) of an emerging Cucumber green mottle mosaic virus (CGMMV) isolate on cucumber in North America was determined through deep sequencing of sRNA and rapid amplification of cDNA ends. It shares 99% nucleotide sequence identity to the Asian genotype, but only 90% t...

  1. Combined hairpin-antisense compositions and methods for modulating expression

    DOEpatents

    Shanklin, John; Nguyen, Tam

    2014-08-05

    A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.

  2. Combined hairpin-antisense compositions and methods for modulating expression

    DOEpatents

    Shanklin, John; Nguyen, Tam Huu

    2015-11-24

    A nucleotide construct comprising a nucleotide sequence that forms a stem and a loop, wherein the loop comprises a nucleotide sequence that modulates expression of a target, wherein the stem comprises a nucleotide sequence that modulates expression of a target, and wherein the target modulated by the nucleotide sequence in the loop and the target modulated by the nucleotide sequence in the stem may be the same or different. Vectors, methods of regulating target expression, methods of providing a cell, and methods of treating conditions comprising the nucleotide sequence are also disclosed.

  3. A Comprehensive Phylogenetic Analysis of the Scleractinia (Cnidaria, Anthozoa) Based on Mitochondrial CO1 Sequence Data

    PubMed Central

    Kitahara, Marcelo V.; Cairns, Stephen D.; Stolarski, Jarosław; Blair, David; Miller, David J.

    2010-01-01

    Background Classical morphological taxonomy places the approximately 1400 recognized species of Scleractinia (hard corals) into 27 families, but many aspects of coral evolution remain unclear despite the application of molecular phylogenetic methods. In part, this may be a consequence of such studies focusing on the reef-building (shallow water and zooxanthellate) Scleractinia, and largely ignoring the large number of deep-sea species. To better understand broad patterns of coral evolution, we generated molecular data for a broad and representative range of deep sea scleractinians collected off New Caledonia and Australia during the last decade, and conducted the most comprehensive molecular phylogenetic analysis to date of the order Scleractinia. Methodology Partial (595 bp) sequences of the mitochondrial cytochrome oxidase subunit 1 (CO1) gene were determined for 65 deep-sea (azooxanthellate) scleractinians and 11 shallow-water species. These new data were aligned with 158 published sequences, generating a 234 taxon dataset representing 25 of the 27 currently recognized scleractinian families. Principal Findings/Conclusions There was a striking discrepancy between the taxonomic validity of coral families consisting predominantly of deep-sea or shallow-water species. Most families composed predominantly of deep-sea azooxanthellate species were monophyletic in both maximum likelihood and Bayesian analyses but, by contrast (and consistent with previous studies), most families composed predominantly of shallow-water zooxanthellate taxa were polyphyletic, although Acroporidae, Poritidae, Pocilloporidae, and Fungiidae were exceptions to this general pattern. One factor contributing to this inconsistency may be the greater environmental stability of deep-sea environments, effectively removing taxonomic “noise” contributed by phenotypic plasticity. 
Our phylogenetic analyses imply that the most basal extant scleractinians are azooxanthellate solitary corals from deep-water, their divergence predating that of the robust and complex corals. Deep-sea corals are likely to be critical to understanding anthozoan evolution and the origins of the Scleractinia. PMID:20628613

  4. Comprehensive discovery of noncoding RNAs in acute myeloid leukemia cell transcriptomes.

    PubMed

    Zhang, Jin; Griffith, Malachi; Miller, Christopher A; Griffith, Obi L; Spencer, David H; Walker, Jason R; Magrini, Vincent; McGrath, Sean D; Ly, Amy; Helton, Nichole M; Trissal, Maria; Link, Daniel C; Dang, Ha X; Larson, David E; Kulkarni, Shashikant; Cordes, Matthew G; Fronick, Catrina C; Fulton, Robert S; Klco, Jeffery M; Mardis, Elaine R; Ley, Timothy J; Wilson, Richard K; Maher, Christopher A

    2017-11-01

    To detect diverse and novel RNA species comprehensively, we compared deep small RNA and RNA sequencing (RNA-seq) methods applied to a primary acute myeloid leukemia (AML) sample. We were able to discover previously unannotated small RNAs using deep sequencing of a library method using broader insert size selection. We analyzed the long noncoding RNA (lncRNA) landscape in AML by comparing deep sequencing from multiple RNA-seq library construction methods for the sample that we studied and then integrating RNA-seq data from 179 AML cases. This identified lncRNAs that are completely novel, differentially expressed, and associated with specific AML subtypes. Our study revealed the complexity of the noncoding RNA transcriptome through a combined strategy of strand-specific small RNA and total RNA-seq. This dataset will serve as an invaluable resource for future RNA-based analyses. Copyright © 2017 ISEH – Society for Hematology and Stem Cells. Published by Elsevier Inc. All rights reserved.

  5. Analysis of Ribosome Stalling and Translation Elongation Dynamics by Deep Learning.

    PubMed

    Zhang, Sai; Hu, Hailin; Zhou, Jingtian; He, Xuan; Jiang, Tao; Zeng, Jianyang

    2017-09-27

    Ribosome stalling is manifested by the local accumulation of ribosomes at specific codon positions of mRNAs. Here, we present ROSE, a deep learning framework to analyze high-throughput ribosome profiling data and estimate the probability of a ribosome stalling event occurring at each genomic location. Extensive validation tests on independent data demonstrated that ROSE possessed higher prediction accuracy than conventional prediction models, with an increase in the area under the receiver operating characteristic curve by up to 18.4%. In addition, genome-wide statistical analyses showed that ROSE predictions can be well correlated with diverse putative regulatory factors of ribosome stalling. Moreover, the genome-wide ribosome stalling landscapes of both human and yeast computed by ROSE recovered the functional interplays between ribosome stalling and cotranslational events in protein biogenesis, including protein targeting by the signal recognition particles and protein secondary structure formation. Overall, our study provides a novel method to complement the ribosome profiling techniques and further decipher the complex regulatory mechanisms underlying translation elongation dynamics encoded in the mRNA sequence. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Reactive Sequencing for Autonomous Navigation Evolving from Phoenix Entry, Descent, and Landing

    NASA Technical Reports Server (NTRS)

    Grasso, Christopher A.; Riedel, Joseph E.; Vaughan, Andrew T.

    2010-01-01

    Virtual Machine Language (VML) is an award-winning advanced procedural sequencing language in use on NASA deep-space missions since 1997, and was used for the successful entry, descent, and landing (EDL) of the Phoenix spacecraft onto the surface of Mars. Phoenix EDL utilized a state-oriented operations architecture which executed within the constraints of the existing VML 2.0 flight capability, compatible with the linear "land or die" nature of the mission. The intricacies of Phoenix EDL included the planned discarding of portions of the vehicle, the complex communications management for relay through on-orbit assets, the presence of temporally indeterminate physical events, and the need to rapidly catch up four days of sequencing should a reboot of the spacecraft flight computer occur shortly before atmospheric entry. These formidable operational challenges led to new techniques for packaging and coordinating reusable sequences called blocks using one-way synchronization via VML sequencing global variable events. The coordinated blocks acted as an ensemble to land the spacecraft, while individually managing various elements in as simple a fashion as possible. This paper outlines prototype VML 2.1 flight capabilities that have evolved from the one-way synchronization techniques in order to implement even more ambitious autonomous mission capabilities. Target missions for these new capabilities include autonomous touch-and-go sampling of cometary and asteroidal bodies, lunar landing of robotic missions, and ultimately landing of crewed lunar vehicles. Close proximity guidance, navigation, and control operations, on-orbit rendezvous, and descent and landing events featured in these missions require elaborate abort capability, manifesting highly non-linear scenarios that are so complex as to overtax traditional sequencing, or even the sort of one-way coordinated sequencing used during EDL. 
Foreseeing advanced command and control needs for small body and lunar landing guidance, navigation and control scenarios, work began three years ago on substantial upgrades to VML that are now being exercised in scenarios for lunar landing and comet/asteroid rendezvous. The advanced state-based approach includes coordinated state transition machines with distributed decision-making logic. These state machines are not merely sequences - they are reactive logic constructs capable of autonomous decision making within a well-defined domain. Combined with the JPL's AutoNav software used on Deep Space 1 and Deep Impact, the system allows spacecraft to autonomously navigate to an unmapped surface, soft-contact, and either land or ascend. The state machine architecture enabled by VML 2.1 has successfully performed sampling missions and lunar descent missions in a simulated environment, and is progressing toward flight capability. The authors are also investigating using the VML 2.1 flight director architecture to perform autonomous activities like rendezvous with a passive hypothetical Mars sample return capsule. The approach being pursued is similar to the touch-and-go sampling state machines, with the added complications associated with the search for, physical capture of, and securing of a separate spacecraft. Complications include optically finding and tracking the Orbiting Sample Capsule (OSC), keeping the OSC illuminated, making orbital adjustments, and physically capturing the OSC. Other applications could include autonomous science collection and fault compensation.

  7. Comparative miRNAs analysis of Two contrasting broccoli inbred lines with divergent head-forming capacity under temperature stress.

    PubMed

    Chen, Chi-Chien; Fu, Shih-Feng; Norikazu, Monma; Yang, Yau-Wen; Liu, Yu-Ju; Ikeo, Kazuho; Gojobori, Takashi; Huang, Hao-Jen

    2015-12-01

    MicroRNAs (miRNAs) play a vital role in growth, development, and stress response at the post-transcriptional level. Broccoli (Brassica oleracea L. var. italica) is an important vegetable crop, and the yield and quality of broccoli are decreased by heat stress. The broccoli inbred lines that are capable of producing heads at high temperatures in summer are unique varieties in Taiwan. However, knowledge of miRNAomes during broccoli head formation under heat stress is limited. In this study, two nearly isogenic lines with contrasting head-forming capacity were characterized at the molecular level. Head-forming capacity was better for heat-tolerant (HT) than heat-sensitive (HS) broccoli under heat stress. By deep sequencing and computational analysis, 20 known miRNAs showed significant differential expression between HT and HS genotypes. According to the criteria for annotation of new miRNAs, 24 novel miRNA sequences with differential expression between the two genotypes were identified. To gain insight into functional significance, 213 unique potential targets of these 44 differentially expressed miRNAs were predicted. These targets were implicated in shoot apical development, phase change, response to temperature stimulus, and hormone and energy metabolism. The head-forming capacity of the unique HT line was related to autonomous regulation of Bo-FT genes and lower expression levels of heat shock protein genes compared with HS. For the genotypic comparison, a set of miRNAs and their targets had consistent expression patterns in various HT genotypes. This large-scale characterization of broccoli miRNAs and their potential targets aims to unravel the regulatory roles of miRNAs underlying heat-tolerant head-forming capacity.

  8. Investigating uncultured microbes and their role in a deep subseafloor ammonium sink

    NASA Astrophysics Data System (ADS)

    Kirkpatrick, J. B.; Spivack, A. J.; Smith, D. C.; D'Hondt, S. L.

    2013-12-01

    The marine deep biosphere is thought to hold a large reservoir of both microbial cells and untapped genetic diversity. One potential driving force behind the vast number of uncultured organisms is unconventional redox pairs, which may not be favorable at benchtop conditions but can support life in other circumstances. One instance of this is the previously documented thermodynamic favorability of ammonium oxidation with sulfate in sediments such as those investigated here from the Indian Ocean. Using 454 tag sequencing of 16S DNA, we identified uncultured archaea and bacteria potentially playing key roles at the sulfate and ammonium interface. First, the phylogenetic identity of organisms potentially involved in this reaction is inferred, and the thermodynamics of potential pathways are considered. Several novel phyla, as well as Clostridiales, appear over-represented at the reaction zone. Second, to understand the metabolic capability of these target organisms, these sequences have been cross-referenced with assemblies from metagenomic data sets, and connections to functional genes are being elucidated. Finally, we discuss parallels with near-shore coastal sediment from Narragansett Bay, Rhode Island, where geochemical similarities have been found. The thermodynamic regime there is similar to that of the Indian Ocean, suggesting the potential for a broad geographic distribution, and the site's accessibility provides the opportunity to construct bioreactors to test rates and pathways of ammonium and sulfate fluxes. Iron content may be a key factor in determining reaction favorability. We present ongoing work in this area and the pros and cons of different bioreactor designs.

  9. Intelligent fault diagnosis of rolling bearings using an improved deep recurrent neural network

    NASA Astrophysics Data System (ADS)

    Jiang, Hongkai; Li, Xingqiu; Shao, Haidong; Zhao, Ke

    2018-06-01

    Traditional intelligent fault diagnosis methods for rolling bearings heavily depend on manual feature extraction and feature selection. For this purpose, an intelligent deep learning method, named the improved deep recurrent neural network (DRNN), is proposed in this paper. Firstly, frequency spectrum sequences are used as inputs to reduce the input size and ensure good robustness. Secondly, DRNN is constructed by the stacks of the recurrent hidden layer to automatically extract the features from the input spectrum sequences. Thirdly, an adaptive learning rate is adopted to improve the training performance of the constructed DRNN. The proposed method is verified with experimental rolling bearing data, and the results confirm that the proposed method is more effective than traditional intelligent fault diagnosis methods.

  10. Position-specific binding of FUS to nascent RNA regulates mRNA length

    PubMed Central

    Masuda, Akio; Takeda, Jun-ichi; Okuno, Tatsuya; Okamoto, Takaaki; Ohkawara, Bisei; Ito, Mikako; Ishigaki, Shinsuke; Sobue, Gen

    2015-01-01

    More than half of all human genes produce prematurely terminated polyadenylated short mRNAs. However, the underlying mechanisms remain largely elusive. CLIP-seq (cross-linking immunoprecipitation [CLIP] combined with deep sequencing) of FUS (fused in sarcoma) in neuronal cells showed that FUS is frequently clustered around an alternative polyadenylation (APA) site of nascent RNA. ChIP-seq (chromatin immunoprecipitation [ChIP] combined with deep sequencing) of RNA polymerase II (RNAP II) demonstrated that FUS stalls RNAP II and prematurely terminates transcription. When an APA site is located upstream of an FUS cluster, FUS enhances polyadenylation by recruiting CPSF160 and up-regulates the alternative short transcript. In contrast, when an APA site is located downstream from an FUS cluster, polyadenylation is not activated, and the RNAP II-suppressing effect of FUS leads to down-regulation of the alternative short transcript. CAGE-seq (cap analysis of gene expression [CAGE] combined with deep sequencing) and PolyA-seq (a strand-specific and quantitative method for high-throughput sequencing of 3' ends of polyadenylated transcripts) revealed that position-specific regulation of mRNA lengths by FUS is operational in two-thirds of transcripts in neuronal cells, with enrichment in genes involved in synaptic activities. PMID:25995189

  11. Deep sequencing detects very-low-grade somatic mosaicism in the unaffected mother of siblings with nemaline myopathy.

    PubMed

    Miyatake, Satoko; Koshimizu, Eriko; Hayashi, Yukiko K; Miya, Kazushi; Shiina, Masaaki; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Miyake, Noriko; Saitsu, Hirotomo; Ogata, Kazuhiro; Nishino, Ichizo; Matsumoto, Naomichi

    2014-07-01

    When an expected mutation in a particular disease-causing gene is not identified in a suspected carrier, it is usually assumed to be due to germline mosaicism. We report here very-low-grade somatic mosaicism in ACTA1 in an unaffected mother of two siblings affected with a neonatal form of nemaline myopathy. The mosaicism was detected by deep resequencing using a next-generation sequencer. We identified a novel heterozygous mutation in ACTA1, c.448A>G (p.Thr150Ala), in the affected siblings. Three-dimensional structural modeling suggested that this mutation may affect polymerization and/or actin's interactions with other proteins. In this family, we expected autosomal dominant inheritance with either parent demonstrating germline or somatic mosaicism. Sanger sequencing identified no mutation. However, further deep resequencing of this mutation on a next-generation sequencer identified very-low-grade somatic mosaicism in the mother: 0.4%, 1.1%, and 8.3% in the saliva, blood leukocytes, and nails, respectively. Our study demonstrates the possibility of very-low-grade somatic mosaicism in suspected carriers, rather than germline mosaicism. Copyright © 2014 Elsevier B.V. All rights reserved.
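
    The mosaicism levels quoted above (0.4%, 1.1%, and 8.3%) are variant allele fractions: the share of reads at a position that carry the mutant base. As a rough sketch of why deep coverage is needed to call a fraction that low, the Python snippet below computes the fraction and a one-sided binomial tail probability that the same number of variant reads could arise from sequencing error alone. The read counts, the per-base error rate, and the function names are hypothetical values chosen for illustration; the abstract does not report them, and this is not the authors' analysis pipeline.

```python
from math import exp, lgamma, log, log1p


def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def binomial_upper_tail(alt_reads: int, total_reads: int, error_rate: float) -> float:
    """P(X >= alt_reads) for X ~ Binomial(total_reads, error_rate), 0 < error_rate < 1.

    Terms are evaluated in log space so large read depths do not overflow.
    """
    total = 0.0
    for k in range(alt_reads, total_reads + 1):
        log_pmf = (
            lgamma(total_reads + 1) - lgamma(k + 1) - lgamma(total_reads - k + 1)
            + k * log(error_rate)
            + (total_reads - k) * log1p(-error_rate)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


# Hypothetical numbers: 40 variant reads out of 10,000, with an assumed
# per-base error rate of 0.1%. None of these values come from the study.
vaf = variant_allele_fraction(40, 10_000)
p_error_only = binomial_upper_tail(40, 10_000, 0.001)
```

    Under these assumed numbers the variant fraction is 0.4%, and the chance of seeing that many variant reads from error alone is very small, whereas at Sanger-like sensitivity the same signal would be indistinguishable from background.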

  12. Insights about minority HIV-1 strains in transmitted drug resistance mutation dynamics and disease progression.

    PubMed

    Leda, Ana Rachel; Hunter, James; Oliveira, Ursula Castro; Azevedo, Inacio Junqueira; Sucupira, Maria Cecilia Araripe; Diaz, Ricardo Sobhie

    2018-04-19

    The presence of minority transmitted drug resistance mutations was assessed using ultra-deep sequencing and correlated with disease progression among recently HIV-1-infected individuals from Brazil. Samples at baseline during recent infection and 1 year after the establishment of the infection were analysed. Viral RNA and proviral DNA from 25 individuals were subjected to ultra-deep sequencing of the reverse transcriptase and protease regions of HIV-1. Viral strains carrying transmitted drug resistance mutations were detected in 9 out of the 25 patients, for all major antiretroviral classes, ranging from one to five mutations per patient. Ultra-deep sequencing detected strains with frequencies as low as 1.6% and only strains with frequencies >20% were detected by population plasma sequencing (three patients). Transmitted drug resistance strains with frequencies <14.8% did not persist upon established infection. The presence of transmitted drug resistance mutations was negatively correlated with the viral load and with CD4+ T cell count decay. Transmitted drug resistance mutations representing small percentages of the viral population do not persist during infection because they are negatively selected in the first year after HIV-1 seroconversion.

  13. GenomeGems: evaluation of genetic variability from deep sequencing data

    PubMed Central

    2012-01-01

    Background Detection of disease-causing mutations using Deep Sequencing technologies poses great challenges. In particular, organizing the large number of sequences generated so that mutations that might be biologically relevant are easily identified is a difficult task. Yet only a limited number of accessible automatic tools exist for this task. Findings We developed GenomeGems to fill this gap by enabling the user to view and compare Single Nucleotide Polymorphisms (SNPs) from multiple datasets and to load the data onto the UCSC Genome Browser for an expanded and familiar visualization. As such, via automatic, clear and accessible presentation of processed Deep Sequencing data, our tool aims to facilitate ranking of genomic SNP calling. GenomeGems runs on a local Personal Computer (PC) and is freely available at http://www.tau.ac.il/~nshomron/GenomeGems. Conclusions GenomeGems enables researchers to identify potential disease-causing SNPs in an efficient manner. This enables rapid turnover of information and leads to further experimental SNP validation. The tool allows the user to compare and visualize SNPs from multiple experiments and to easily load SNP data onto the UCSC Genome Browser for further detailed information. PMID:22748151

  14. Feasibility of 3.0T pelvic MR imaging in the evaluation of endometriosis.

    PubMed

    Manganaro, L; Fierro, F; Tomei, A; Irimia, D; Lodise, P; Sergi, M E; Vinci, V; Sollazzo, P; Porpora, M G; Delfini, R; Vittori, G; Marini, M

    2012-06-01

    Endometriosis represents an important clinical problem in women of reproductive age, with a high impact on quality of life, work productivity and health care management. The aim of this study is to define the role of 3T magnetom system MRI in the evaluation of endometriosis. Forty-six women with a transvaginal (TV) ultrasound examination positive for endometriosis, with pelvic pain, or with infertility underwent an MR 3.0T examination with the following protocol: T2 weighted FRFSE HR sequences, T2 weighted FRFSE HR CUBE 3D sequences, T1 w FSE sequences, LAVA-flex sequences. Pelvic anatomy, macroscopic endometriosis implants, deep endometriosis implants, fallopian tube involvement, presence of adhesions, fluid effusion in the Douglas pouch, associated uterus and kidney pathologies or anomalies, and sacral nervous routes were assessed by two radiologists in consensus. Laparoscopy was considered the gold standard. MRI diagnosed deep endometriosis in 22/46 patients and endometriomas not associated with deep implants in 9/46 patients; 15/46 patients were negative for endometriosis, and 11 of the 22 patients with deep endometriosis also had an ovarian endometriotic cyst. We obtained high sensitivity (96.97%), specificity (100.00%), positive predictive value (PPV, 100.00%) and negative predictive value (NPV, 92.86%). Pelvic MRI performed with a 3T system guarantees high spatial and contrast resolution, providing accurate information about endometriosis implants, with good pre-surgery mapping of the lesions involving both bowel and bladder surfaces and recto-uterine ligaments. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
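
The four accuracy figures quoted above follow from a 2x2 confusion matrix against the laparoscopic gold standard. The sketch below shows the arithmetic only. The abstract reports percentages, not counts, so the counts used here (tp=32, fp=0, fn=1, tn=13) are hypothetical values chosen because they sum to 46 and reproduce the reported percentages; they are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # true positives among diseased
        "specificity": 100.0 * tn / (tn + fp),  # true negatives among healthy
        "ppv": 100.0 * tp / (tp + fp),          # positive predictive value
        "npv": 100.0 * tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts (assumption, not from the abstract).
metrics = diagnostic_metrics(tp=32, fp=0, fn=1, tn=13)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```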

  15. Deep magnetic capture of magnetically loaded cells for spatially targeted therapeutics.

    PubMed

    Huang, Zheyong; Pei, Ning; Wang, Yanyan; Xie, Xinxing; Sun, Aijun; Shen, Li; Zhang, Shuning; Liu, Xuebo; Zou, Yunzeng; Qian, Juying; Ge, Junbo

    2010-03-01

    Magnetic targeting has recently demonstrated potential in promoting the delivery of magnetically loaded cells to target lesions, but its application is limited by magnetic attenuation. For deep magnetic capture of cells for spatially targeted therapeutics, we designed a magnetic pole in which the magnetic field density can be focused at a distance from the pole. When flowing through a tube that served as a model of blood vessels, the magnetically loaded mesenchymal stem cells (MagMSCs) were highly enriched at the site distant from the magnetic pole. The cell capture efficiency was positively influenced by the magnetic flux density and inversely influenced by the flow velocity, and fitted well with the values deduced from theoretical considerations. The spatially focused property of the magnetic apparatus promises a new deep targeting strategy to promote homing and engraftment for cellular therapy. Copyright (c) 2009 Elsevier Ltd. All rights reserved.

  16. Occurrence Prospect of HDR and Target Site Selection Study in Southeastern of China

    NASA Astrophysics Data System (ADS)

    Lin, W.; Gan, H.

    2017-12-01

    Hot dry rock (HDR) geothermal resources are among the most important clean energy sources for the future. Site selection is fundamental to exploring HDR resources. This paper compiles HDR development projects in China and abroad and summarizes the geothermal geological indicators used to locate HDR. After comparing the geological background of HDR in the southeast coastal area of China, Yangjiang Xinzhou in Guangdong province, the Leizhou Peninsula area, Lingshui in Hainan province and Huangshadong in Guangzhou were selected from key potential target areas along the southeast coast of China. A deep geothermal field model of the study area is established based on comprehensive analysis of the deep geothermal geological background and deep thermal anomalies of the target areas. This paper also compares the HDR target locations and proposes suggestions for the priority exploration target area and exploration scheme.

  17. Identification and characterization of microRNAs in white and brown alpaca skin

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are small, non-coding 21–25 nt RNA molecules that play an important role in regulating gene expression. Little is known about the expression profiles and functions of miRNAs in skin and their role in pigmentation. Alpacas have more than 22 natural coat colors, more than any other fiber producing species. To better understand the role of miRNAs in control of coat color we performed a comprehensive analysis of miRNA expression profiles in skin of white versus brown alpacas. Results Two small RNA libraries from white alpaca (WA) and brown alpaca (BA) skin were sequenced with the aid of Illumina sequencing technology. 272 and 267 conserved miRNAs were obtained from the WA and BA skin libraries, respectively. Of these conserved miRNAs, 35 and 13 were more abundant in WA and BA skin, respectively. The targets of these miRNAs were predicted and grouped based on Gene Ontology and KEGG pathway analysis. Many predicted target genes for these miRNAs are involved in the melanogenesis pathway controlling pigmentation. In addition to the conserved miRNAs, we also obtained 22 potentially novel miRNAs from the WA and BA skin libraries. Conclusion This study represents the first comprehensive survey of miRNAs expressed in skin of animals of different coat colors by deep sequencing analysis. We discovered a collection of miRNAs that are differentially expressed in WA and BA skin. The results suggest important potential functions of miRNAs in coat color regulation. PMID:23067000

  18. Deep sequencing leads to the identification of eukaryotic translation initiation factor 5A as a key element in Rsv1-mediated lethal systemic hypersensitive response to Soybean mosaic virus infection in soybean.

    PubMed

    Chen, Hui; Adam Arsovski, Andrej; Yu, Kangfu; Wang, Aiming

    2017-04-01

    Rsv1, a single dominant resistance locus in soybean, confers extreme resistance to the majority of Soybean mosaic virus (SMV) strains, but is susceptible to the G7 strain. In Rsv1-genotype soybean, G7 infection provokes a lethal systemic hypersensitive response (LSHR), a delayed host defence response. The Rsv1-mediated LSHR signalling pathway remains largely unknown. In this study, we employed a genome-wide investigation to gain an insight into the molecular interplay between SMV G7 and Rsv1-genotype soybean. Small RNA (sRNA), degradome and transcriptome sequencing analyses were used to identify differentially expressed genes (DEGs) and microRNAs (DEMs) in response to G7 infection. A number of DEGs, DEMs and microRNA targets, and the interaction network of DEMs and their target mRNAs responsive to G7 infection, were identified. Knock-down of one of the identified DEGs, the eukaryotic translation initiation factor 5A (eIF5A), diminished the LSHR and enhanced viral accumulation, suggesting the essential role of eIF5A in the G7-induced, Rsv1-mediated LSHR signalling pathway. This work provides an in-depth genome-wide analysis of high-throughput sequencing data, and identifies multiple genes and microRNA signatures that are associated with the Rsv1-mediated LSHR. © 2016 HER MAJESTY THE QUEEN IN RIGHT OF CANADA MOLECULAR PLANT PATHOLOGY © 2016 BSPP AND JOHN WILEY & SONS LTD.

  19. Estimating Exceptionally Rare Germline and Somatic Mutation Frequencies via Next Generation Sequencing

    PubMed Central

    Yoon, Song-Ro; Arnheim, Norman; Calabrese, Peter

    2016-01-01

    We used targeted next generation deep-sequencing (Safe Sequencing System) to measure ultra-rare de novo mutation frequencies in the human male germline by attaching a unique identifier code to each target DNA molecule. Segments from three different human genes (FGFR3, MECP2 and PTPN11) were studied. Regardless of the gene segment, the particular testis donor or the 73 different testis pieces used, the frequencies for any one of the six different mutation types were consistent. Averaging over the C>T/G>A and G>T/C>A mutation types the background mutation frequency was 2.6 × 10⁻⁵ per base pair, while for the four other mutation types the average background frequency was lower at 1.5 × 10⁻⁶ per base pair. These rates far exceed the well documented human genome average frequency per base pair (~10⁻⁸) suggesting a non-biological explanation for our data. By computational modeling and a new experimental procedure to distinguish between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. Finally, we looked at a previously studied disease mutation in the PTPN11 gene and could easily distinguish true mutations from the SSS background. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments. PMID:27341568
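
The unique-identifier idea described above can be illustrated with a small sketch: reads sharing an identifier are grouped into a family, a consensus base is called per family, and the mutation frequency is the fraction of families whose consensus differs from the reference. This is not the authors' pipeline. The function names, the minimum family size of 2 and the 95% agreement threshold are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def uid_consensus(reads, min_family_size=2, min_agreement=0.95):
    """Call one consensus base per UID family at a single position.

    reads: iterable of (uid, base) pairs.
    Both thresholds are illustrative assumptions, not values from the abstract.
    """
    families = defaultdict(list)
    for uid, base in reads:
        families[uid].append(base)

    calls = {}
    for uid, bases in families.items():
        if len(bases) < min_family_size:
            continue  # too few reads to trust this family
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            calls[uid] = base  # family members agree on one base
    return calls


def mutation_frequency(calls, reference_base):
    """Fraction of consensus-called families that differ from the reference."""
    if not calls:
        return 0.0
    mutants = sum(1 for base in calls.values() if base != reference_base)
    return mutants / len(calls)


reads = [
    ("u1", "C"), ("u1", "C"), ("u1", "C"),  # wild-type family
    ("u2", "T"), ("u2", "T"),               # mutant family
    ("u3", "C"), ("u3", "T"),               # discordant family, discarded
    ("u4", "C"),                            # singleton, discarded
]
calls = uid_consensus(reads)
freq = mutation_frequency(calls, reference_base="C")
print(calls, freq)
```

Discarding discordant families is what lets this kind of scheme suppress errors introduced in later PCR cycles; errors arising in the first cycles can still dominate a family, which matches the explanation the abstract gives for its background frequencies.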

  20. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    PubMed

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  1. Mixed heterolobosean and novel gregarine lineage genes from culture ATCC 50646: Long-branch artefacts, not lateral gene transfer, distort α-tubulin phylogeny.

    PubMed

    Cavalier-Smith, Thomas

    2015-04-01

    Contradictory and confusing results can arise if sequenced 'monoprotist' samples really contain DNA of very different species. Eukaryote-wide phylogenetic analyses using five genes from the amoeboflagellate culture ATCC 50646 previously implied it was an undescribed percolozoan related to percolatean flagellates (Stephanopogon, Percolomonas). Contrastingly, three phylogenetic analyses of 18S rRNA alone, did not place it within Percolozoa, but as an isolated deep-branching excavate. I resolve that contradiction by sequence phylogenies for all five genes individually, using up to 652 taxa. Its 18S rRNA sequence (GQ377652) is near-identical to one from stained-glass windows, somewhat more distant from one from cooling-tower water, all three related to terrestrial actinocephalid gregarines Hoplorhynchus and Pyxinia. All four protein-gene sequences (Hsp90; α-tubulin; β-tubulin; actin) are from an amoeboflagellate heterolobosean percolozoan, not especially deeply branching. Contrary to previous conclusions from trees combining protein and rRNA sequences or rDNA trees including Eozoa only, this culture does not represent a major novel deep-branching eukaryote lineage distinct from Heterolobosea, and thus lacks special significance for deep eukaryote phylogeny, though the rDNA sequence is important for gregarine phylogeny. α-Tubulin trees for over 250 eukaryotes refute earlier suggestions of lateral gene transfer within eukaryotes, being largely congruent with morphology and other gene trees. Copyright © 2015. Published by Elsevier GmbH.

  2. Identification and Expression Analyses of miRNAs from Two Contrasting Flower Color Cultivars of Canna by Deep Sequencing.

    PubMed

    Roy, Sribash; Tripathi, Abhinandan Mani; Yadav, Amrita; Mishra, Parneeta; Nautiyal, Chandra Shekhar

    2016-01-01

    miRNAs are endogenous small RNA (sRNA) that play critical roles in plant development processes. Canna is an ornamental plant belonging to family Cannaceae. Here, we report for the first time the identification and differential expression of miRNAs in two contrasting flower color cultivars of Canna, Tropical sunrise and Red president. A total of 313 known miRNAs belonging to 78 miRNA families were identified from both the cultivars. Thirty one miRNAs (17 miRNA families) were specific to Tropical sunrise and 43 miRNAs (10 miRNA families) were specific to Red president. Thirty two and 18 putative new miRNAs were identified from Tropical sunrise and Red president, respectively. One hundred and nine miRNAs were differentially expressed in the two cultivars targeting 1343 genes. Among these, 16 miRNAs families targeting 60 genes were involved in flower development related traits and five miRNA families targeting five genes were involved in phenyl propanoid and pigment metabolic processes. We further validated the expression analysis of a few miRNA and their target genes by qRT-PCR. Transcription factors were the major miRNA targets identified. Target validation of a few randomly selected miRNAs by RLM-RACE was performed but was successful with only miR162. These findings will help in understanding flower development processes, particularly the color development in Canna.

  3. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, CY; Yang, H; Wei, CL

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximate 20 times increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
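
The abstract quotes an N50 of 506 bp for the assembled unigenes. As a reminder of what that statistic means, the sketch below computes N50 under the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name and the toy lengths are illustrative and unrelated to the tea dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length  # first length at which half the total is covered
    raise ValueError("n50() requires at least one sequence length")


toy_lengths = [100, 200, 300, 400, 500]  # toy values, not real unigene lengths
print(n50(toy_lengths))
```

Because N50 weights long sequences more heavily than short ones, it typically exceeds the mean length when an assembly contains many short singletons, which is consistent with the reported mean of 355 bp against an N50 of 506 bp.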

  4. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    PubMed Central

    2011-01-01

    Background Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximate 20 times increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). Conclusions An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis. PMID:21356090

  5. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    PubMed

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  6. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
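
Steps (i) and (ii) above, cleanup and clustering of a tab-delimited tag/count file, can be sketched in a few lines. This is not DSAP's code. The adaptor sequence, the function names and the rule of dropping single-nucleotide homopolymer tags are assumptions made for illustration, and the input layout (sequence, tab, copy number) follows the description in the abstract.

```python
import csv
import io
from collections import Counter

ADAPTOR = "TCGTATGCCG"  # hypothetical 3' adaptor fragment, for illustration only


def clean_tag(seq, adaptor=ADAPTOR):
    """Trim the adaptor and reject empty or single-nucleotide homopolymer tags."""
    idx = seq.find(adaptor)
    if idx != -1:
        seq = seq[:idx]  # keep only the insert upstream of the adaptor
    if not seq or len(set(seq)) == 1:
        return None      # nothing left, or a poly-A/T/C/G/N run
    return seq


def cluster_tags(handle):
    """Sum copy numbers of tags that are identical after cleanup."""
    clusters = Counter()
    for seq, count in csv.reader(handle, delimiter="\t"):
        cleaned = clean_tag(seq.upper())
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters


example = io.StringIO(
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t10\n"  # tag with adaptor attached
    "TGAGGTAGTAGGTTGTATAGTT\t5\n"             # same tag, adaptor already removed
    "AAAAAAAAAAAAAAAAAAAA\t7\n"               # poly-A tag, discarded
)
clusters = cluster_tags(example)
print(clusters)
```

The resulting cluster counts would then be the input to the homology-matching steps (iii) and (iv), which depend on the Rfam and miRBase libraries and are not sketched here.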

  7. Construction of Pseudomolecule Sequences of the aus Rice Cultivar Kasalath for Comparative Genomics of Asian Cultivated Rice

    PubMed Central

    Sakai, Hiroaki; Kanamori, Hiroyuki; Arai-Kichise, Yuko; Shibata-Hatta, Mari; Ebana, Kaworu; Oono, Youko; Kurita, Kanako; Fujisawa, Hiroko; Katagiri, Satoshi; Mukai, Yoshiyuki; Hamada, Masao; Itoh, Takeshi; Matsumoto, Takashi; Katayose, Yuichi; Wakasa, Kyo; Yano, Masahiro; Wu, Jianzhong

    2014-01-01

    Having a deep genetic structure evolved during its domestication and adaptation, the Asian cultivated rice (Oryza sativa) displays considerable physiological and morphological variations. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath by using the advanced next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. We detected at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences in the Kasalath genome in comparison with the Nipponbare reference genome. Mapping of the publicly available NGS short reads from 50 rice accessions proved the necessity and the value of using the Kasalath whole-genome sequence as an additional reference to capture the sequence polymorphisms that cannot be discovered by using the Nipponbare sequence alone. PMID:24578372

  8. Unified Deep Learning Architecture for Modeling Biology Sequence.

    PubMed

    Wu, Hongjie; Cao, Chengyuan; Xia, Xiaoyan; Lu, Qiang

    2017-10-09

    Prediction of the spatial structure or function of biological macromolecules based on their sequence remains an important challenge in bioinformatics. When modeling biological sequences using traditional sequencing models, characteristics, such as long-range interactions between basic units, the complicated and variable output of labeled structures, and the variable length of biological sequences, usually lead to different solutions on a case-by-case basis. This study proposed the use of bidirectional recurrent neural networks based on long short-term memory or a gated recurrent unit to capture long-range interactions by designing the optional reshape operator to adapt to the diversity of the output labels and implementing a training algorithm to support the training of sequence models capable of processing variable-length sequences. Additionally, the merge and pooling operators enhanced the ability to capture short-range interactions between basic units of biological sequences. The proposed deep-learning model and its training algorithm might be capable of solving currently known biological sequence-modeling problems through the use of a unified framework. We validated our model on one of the most difficult biological sequence-modeling problems currently known, with our results indicating the ability of the model to obtain predictions of protein residue interactions that exceeded the accuracy of current popular approaches by 10% based on multiple benchmarks.

  9. Deep feature extraction and combination for synthetic aperture radar target classification

    NASA Astrophysics Data System (ADS)

    Amrani, Moussa; Jiang, Feng

    2017-10-01

    Feature extraction has always been a difficult problem in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). It is very important to select discriminative features to train a classifier, which is a prerequisite. Inspired by the great success of convolutional neural network (CNN), we address the problem of SAR target classification by proposing a feature extraction method, which takes advantage of exploiting the extracted deep features from CNNs on SAR images to introduce more powerful discriminative features and robust representation ability for them. First, the pretrained VGG-S net is fine-tuned on moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after a simple preprocessing is performed, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused by using a traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, K-nearest neighbors algorithm based on LogDet divergence-based metric learning triplet constraints is adopted as a baseline classifier. Experiments on MSTAR are conducted, and the classification accuracy results demonstrate that the proposed method outperforms the state-of-the-art methods.

  10. DeepGene: an advanced cancer type classifier based on deep learning and somatic point mutations.

    PubMed

    Yuan, Yuchen; Shi, Yi; Li, Changyang; Kim, Jinman; Cai, Weidong; Han, Zeguang; Feng, David Dagan

    2016-12-23

    With the developments of DNA sequencing technology, large amounts of sequencing data have become available in recent years and provide unprecedented opportunities for advanced association studies between somatic point mutations and cancer types/subtypes, which may contribute to more accurate somatic point mutation based cancer classification (SMCC). However in existing SMCC methods, issues like high data sparsity, small volume of sample size, and the application of simple linear classifiers, are major obstacles in improving the classification performance. To address the obstacles in existing SMCC studies, we propose DeepGene, an advanced deep neural network (DNN) based classifier, that consists of three steps: firstly, the clustered gene filtering (CGF) concentrates the gene data by mutation occurrence frequency, filtering out the majority of irrelevant genes; secondly, the indexed sparsity reduction (ISR) converts the gene data into indexes of its non-zero elements, thereby significantly suppressing the impact of data sparsity; finally, the data after CGF and ISR is fed into a DNN classifier, which extracts high-level features for accurate classification. Experimental results on our curated TCGA-DeepGene dataset, which is a reformulated subset of the TCGA dataset containing 12 selected types of cancer, show that CGF, ISR and DNN all contribute in improving the overall classification performance. We further compare DeepGene with three widely adopted classifiers and demonstrate that DeepGene has at least 24% performance improvement in terms of testing accuracy. Based on deep learning and somatic point mutation data, we devise DeepGene, an advanced cancer type classifier, which addresses the obstacles in existing SMCC studies. 
Experiments indicate that DeepGene outperforms three widely adopted existing classifiers, which is mainly attributed to its deep learning module that is able to extract the high level features between combinatorial somatic point mutations and cancer types.
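
The indexed sparsity reduction (ISR) step is described above only as converting the gene data into the indexes of its non-zero elements. A minimal sketch of that idea follows. The fixed output length, the zero padding and the 1-based indexing are assumptions added so that every sample yields a vector of equal size; the abstract does not specify these details.

```python
def indexed_sparsity_reduction(vector, max_len, pad=0):
    """Replace a sparse vector by the (1-based) indexes of its non-zero entries.

    Indexes are 1-based so that the pad value 0 cannot be confused with a
    real position; the output is truncated or padded to max_len.
    """
    indexes = [i + 1 for i, value in enumerate(vector) if value != 0]
    indexes = indexes[:max_len]
    return indexes + [pad] * (max_len - len(indexes))


# Toy per-gene mutation indicator for one sample (not real data).
sample = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reduced = indexed_sparsity_reduction(sample, max_len=4)
print(reduced)
```

The dense output is much shorter than the original vector when mutations are rare, which is the property the abstract credits with suppressing the impact of data sparsity.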

  11. Transposable elements in TDP-43-mediated neurodegenerative disorders.

    PubMed

    Li, Wanhe; Jin, Ying; Prazak, Lisa; Hammell, Molly; Dubnau, Josh

    2012-01-01

    Elevated expression of specific transposable elements (TEs) has been observed in several neurodegenerative disorders. TEs also can be active during normal neurogenesis. By mining a series of deep sequencing datasets of protein-RNA interactions and of gene expression profiles, we uncovered extensive binding of TE transcripts to TDP-43, an RNA-binding protein central to amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD). Second, we find that association between TDP-43 and many of its TE targets is reduced in FTLD patients. Third, we discovered that a large fraction of the TEs to which TDP-43 binds become de-repressed in mouse TDP-43 disease models. We propose the hypothesis that TE mis-regulation contributes to TDP-43 related neurodegenerative diseases.

  12. Single molecule targeted sequencing for cancer gene mutation detection.

    PubMed

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

    2016-05-19

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduce an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combines targeted capture and sequencing in one step. We demonstrate that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no biases or errors are introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.

  13. Cross-species identification of genomic drivers of squamous cell carcinoma development across preneoplastic intermediates

    PubMed Central

    Chitsazzadeh, Vida; Coarfa, Cristian; Drummond, Jennifer A.; Nguyen, Tri; Joseph, Aaron; Chilukuri, Suneel; Charpiot, Elizabeth; Adelmann, Charles H.; Ching, Grace; Nguyen, Tran N.; Nicholas, Courtney; Thomas, Valencia D.; Migden, Michael; MacFarlane, Deborah; Thompson, Erika; Shen, Jianjun; Takata, Yoko; McNiece, Kayla; Polansky, Maxim A.; Abbas, Hussein A.; Rajapakshe, Kimal; Gower, Adam; Spira, Avrum; Covington, Kyle R.; Xiao, Weimin; Gunaratne, Preethi; Pickering, Curtis; Frederick, Mitchell; Myers, Jeffrey N.; Shen, Li; Yao, Hui; Su, Xiaoping; Rapini, Ronald P.; Wheeler, David A.; Hawk, Ernest T.; Flores, Elsa R.; Tsai, Kenneth Y.

    2016-01-01

    Cutaneous squamous cell carcinoma (cuSCC) comprises 15–20% of all skin cancers, accounting for over 700,000 cases in the USA annually. Most cuSCC arise in association with a distinct precancerous lesion, the actinic keratosis (AK). To identify potential targets for molecularly targeted chemoprevention, here we perform integrated cross-species genomic analysis of cuSCC development through the preneoplastic AK stage using matched human samples and a solar ultraviolet radiation-driven Hairless mouse model. We identify the major transcriptional drivers of this progression sequence, showing that the key genomic changes in cuSCC development occur in the normal skin to AK transition. Our data validate the use of this ultraviolet radiation-driven mouse cuSCC model for cross-species analysis and demonstrate that cuSCC bears deep molecular similarities to multiple carcinogen-driven SCCs from diverse sites, suggesting that cuSCC may serve as an effective, accessible model for multiple SCC types and that common treatment and prevention strategies may be feasible. PMID:27574101

  14. RNA splicing regulated by RBFOX1 is essential for cardiac function in zebrafish.

    PubMed

    Frese, Karen S; Meder, Benjamin; Keller, Andreas; Just, Steffen; Haas, Jan; Vogel, Britta; Fischer, Simon; Backes, Christina; Matzas, Mark; Köhler, Doreen; Benes, Vladimir; Katus, Hugo A; Rottbauer, Wolfgang

    2015-08-15

    Alternative splicing is one of the major mechanisms through which the proteomic and functional diversity of eukaryotes is achieved. However, the complex nature of the splicing machinery, its associated splicing regulators and the functional implications of alternatively spliced transcripts are only poorly understood. Here, we investigated the functional role of the splicing regulator rbfox1 in vivo using the zebrafish as a model system. We found that loss of rbfox1 led to progressive cardiac contractile dysfunction and heart failure. By using deep-transcriptome sequencing and quantitative real-time PCR, we show that depletion of rbfox1 in zebrafish results in an altered isoform expression of several crucial target genes, such as actn3a and hug. This study underlines that tightly regulated splicing is necessary for unconstrained cardiac function and renders the splicing regulator rbfox1 an interesting target for investigation in human heart failure and cardiomyopathy. © 2015. Published by The Company of Biologists Ltd.

  15. Next-generation libraries for robust RNA interference-based genome-wide screens

    PubMed Central

    Kampmann, Martin; Horlbeck, Max A.; Chen, Yuwen; Tsai, Jordan C.; Bassik, Michael C.; Gilbert, Luke A.; Villalta, Jacqueline E.; Kwon, S. Chul; Chang, Hyeshik; Kim, V. Narry; Weissman, Jonathan S.

    2015-01-01

    Genetic screening based on loss-of-function phenotypes is a powerful discovery tool in biology. Although the recent development of clustered regularly interspaced short palindromic repeats (CRISPR)-based screening approaches in mammalian cell culture has enormous potential, RNA interference (RNAi)-based screening remains the method of choice in several biological contexts. We previously demonstrated that ultracomplex pooled short-hairpin RNA (shRNA) libraries can largely overcome the problem of RNAi off-target effects in genome-wide screens. Here, we systematically optimize several aspects of our shRNA library, including the promoter and microRNA context for shRNA expression, selection of guide strands, and features relevant for postscreen sample preparation for deep sequencing. We present next-generation high-complexity libraries targeting human and mouse protein-coding genes, which we grouped into 12 sublibraries based on biological function. A pilot screen suggests that our next-generation RNAi library performs comparably to current CRISPR interference (CRISPRi)-based approaches and can yield complementary results with high sensitivity and high specificity. PMID:26080438

  16. Reconnaissance of Young M Dwarfs: Locating the Elusive Majority of Nearby Moving Groups

    NASA Astrophysics Data System (ADS)

    Bowler, Brendan; Liu, Michael; Riaz, Basmah; Gizis, John; Shkolnik, Evgenya

    2013-08-01

    With ages between ~8-120 Myr and distances ≲80 pc, young moving group members make excellent targets for detailed studies of pre-main sequence evolution and exoplanet imaging surveys. We propose a multi-semester spectroscopic program to confirm our sample of ~1300 X-ray-selected active M dwarfs, about one-third of which are expected to be members of young moving groups. Our program consists of three parts: a reconnaissance phase of low-resolution spectroscopy to vet unlikely association members, radial velocity observations to confirm group membership, and deep adaptive optics imaging to study the architecture and demographics of giant planets around low-mass stars. We will also exploit our rich sample to study the evolution of chromospheric and coronal activity in low-mass stars with unprecedented precision. Altogether, this program will roughly double the population of M dwarfs in young moving groups, providing new targets for a broad range of star and planet formation studies in the near-future.

  17. The Revolution Continues: Newly Discovered Systems Expand the CRISPR-Cas Toolkit.

    PubMed

    Murugan, Karthik; Babu, Kesavan; Sundaresan, Ramya; Rajan, Rakhi; Sashital, Dipali G

    2017-10-05

    CRISPR-Cas systems defend prokaryotes against bacteriophages and mobile genetic elements and serve as the basis for revolutionary tools for genetic engineering. Class 2 CRISPR-Cas systems use single Cas endonucleases paired with guide RNAs to cleave complementary nucleic acid targets, enabling programmable sequence-specific targeting with minimal machinery. Recent discoveries of previously unidentified CRISPR-Cas systems have uncovered a deep reservoir of potential biotechnological tools beyond the well-characterized Type II Cas9 systems. Here we review the current mechanistic understanding of newly discovered single-protein Cas endonucleases. Comparison of these Cas effectors reveals substantial mechanistic diversity, underscoring the phylogenetic divergence of related CRISPR-Cas systems. This diversity has enabled further expansion of CRISPR-Cas biotechnological toolkits, with wide-ranging applications from genome editing to diagnostic tools based on various Cas endonuclease activities. These advances highlight the exciting prospects for future tools based on the continually expanding set of CRISPR-Cas systems. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Targeted Re-Sequencing Emulsion PCR Panel for Myopathies: Results in 94 Cases.

    PubMed

    Punetha, Jaya; Kesari, Akanchha; Uapinyoying, Prech; Giri, Mamta; Clarke, Nigel F; Waddell, Leigh B; North, Kathryn N; Ghaoui, Roula; O'Grady, Gina L; Oates, Emily C; Sandaradura, Sarah A; Bönnemann, Carsten G; Donkervoort, Sandra; Plotz, Paul H; Smith, Edward C; Tesi-Rocha, Carolina; Bertorini, Tulio E; Tarnopolsky, Mark A; Reitter, Bernd; Hausmanowa-Petrusewicz, Irena; Hoffman, Eric P

    2016-05-27

    Molecular diagnostics in the genetic myopathies often requires testing of the largest and most complex transcript units in the human genome (DMD, TTN, NEB). Iteratively targeting single genes for sequencing has traditionally entailed high costs and long turnaround times. Exome sequencing has begun to supplant single targeted genes, but there are concerns regarding coverage and needed depth of the very large and complex genes that frequently cause myopathies. Our objective was to evaluate the efficiency of next-generation sequencing technologies in providing molecular diagnostics for patients with previously undiagnosed myopathies. We tested a targeted re-sequencing approach, using a 45 gene emulsion PCR myopathy panel, with subsequent sequencing on the Illumina platform in 94 undiagnosed patients. We compared the targeted re-sequencing approach to exome sequencing for 10 of these patients. We detected likely pathogenic mutations in 33 out of 94 patients, a molecular diagnostic rate of approximately 35%. The remaining patients showed variants of unknown significance (35/94 patients) or no mutations detected in the 45 genes tested (26/94 patients). Mutation detection rates for targeted re-sequencing and whole exome sequencing were similar; however, exome sequencing showed better distribution of reads and fewer exon dropouts. Given that costs of highly parallel re-sequencing and whole exome sequencing are similar, and that exome sequencing now takes considerably less laboratory processing time than targeted re-sequencing, we recommend exome sequencing as the standard approach for molecular diagnostics of myopathies.
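The outcome breakdown reported in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `outcome_rates` and the dictionary keys are hypothetical, and the three counts are the ones stated in the abstract.

```python
def outcome_rates(counts):
    """Return each outcome's share of the cohort as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Patient counts as reported in the abstract (94 patients in total).
panel_outcomes = {
    "likely_pathogenic": 33,
    "variant_of_unknown_significance": 35,
    "no_mutation_detected": 26,
}

rates = outcome_rates(panel_outcomes)
for name, pct in rates.items():
    print(f"{name}: {pct:.1f}%")
```

The three categories add up to the full cohort of 94, and 33/94 rounds to the reported diagnostic rate of roughly 35%.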

  19. Phylogenetic and Genome-Wide Deep-Sequencing Analyses of Canine Parvovirus Reveal Co-Infection with Field Variants and Emergence of a Recent Recombinant Strain

    PubMed Central

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity. PMID:25365348

  20. Deep-sequencing to resolve complex diversity of apicomplexan parasites in platypuses and echidnas: Proof of principle for wildlife disease investigation.

    PubMed

    Šlapeta, Jan; Saverimuttu, Stefan; Vogelnest, Larry; Sangster, Cheryl; Hulst, Frances; Rose, Karrie; Thompson, Paul; Whittington, Richard

    2017-11-01

    The short-beaked echidna (Tachyglossus aculeatus) and the platypus (Ornithorhynchus anatinus) are iconic egg-laying monotremes (Mammalia: Monotremata) from Australasia. The aim of this study was to demonstrate the utility of diversity profiles in disease investigations of monotremes. Using small subunit (18S) rDNA amplicon deep-sequencing we demonstrated the presence of apicomplexan parasites and confirmed by direct and cloned amplicon gene sequencing Theileria ornithorhynchi, Theileria tachyglossi, Eimeria echidnae and Cryptosporidium fayeri. Using a combination of samples from healthy and diseased animals, we show a close evolutionary relationship between species of coccidia (Eimeria) and piroplasms (Theileria) from the echidna and platypus. The presence of E. echidnae was demonstrated in faeces and tissues affected by disseminated coccidiosis. Moreover, the presence of E. echidnae DNA in the blood of echidnas was associated with atoxoplasma-like stages in white blood cells, suggesting Hepatozoon tachyglossi blood stages are disseminated E. echidnae stages. These next-generation DNA sequencing technologies are suited to material and organisms that have not been previously characterised and for which the material is scarce. The deep sequencing approach supports traditional diagnostic methods, including microscopy, clinical pathology and histopathology, to better define the status quo. This approach is particularly suitable for wildlife disease investigation. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Evidence for thermal convection in the deep carbonate aquifer of the eastern sector of the Po Plain, Italy

    NASA Astrophysics Data System (ADS)

    Pasquale, V.; Chiozzi, P.; Verdoya, M.

    2013-05-01

    Temperatures recorded in wells as deep as 6 km drilled for hydrocarbon prospecting were used together with geological information to depict the thermal regime of the sedimentary sequence of the eastern sector of the Po Plain. After correction for drilling disturbance, temperature data were analyzed through an inversion technique based on a laterally constant thermal gradient model. The obtained thermal gradient is quite low within the deep carbonate unit (14 mK m⁻¹), while it is larger (53 mK m⁻¹) in the overlying impermeable formations. In the uppermost sedimentary layers, the thermal gradient is close to the regional average (21 mK m⁻¹). We argue that such a vertical change cannot be ascribed to thermal conductivity variation within the sedimentary sequence, but to deep groundwater flow. Since the hydrogeological characteristics (including litho-stratigraphic sequence and structural setting) hardly permit forced convection, we suggest that thermal convection might occur within the deep carbonate aquifer. The potential of this mechanism was evaluated by means of the Rayleigh number analysis. It turned out that permeability required for convection to occur must be larger than 3 × 10⁻¹⁵ m². The average over-heat ratio is 0.45. The lateral variation of hydrothermal regime was tested by using temperature data representing the aquifer thermal conditions. We found that thermal convection might be more developed and variable at the Ferrara High and its surroundings, where widespread fracturing may have increased permeability.
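A Rayleigh number analysis of this kind can be sketched as follows. This is not the authors' calculation: it assumes the textbook porous-medium form Ra = ρ g α k H ΔT / (μ κ) and the classical critical value 4π² for a horizontal layer with impermeable, isothermal boundaries. The fluid properties, the function names, and the layer thickness in the usage note are hypothetical, so the result is not expected to reproduce the threshold quoted in the abstract.

```python
import math

# Critical Rayleigh number for the onset of convection in a horizontal
# porous layer with impermeable, isothermal boundaries (4*pi^2, about 39.5).
RA_CRITICAL = 4.0 * math.pi ** 2


def rayleigh_number(k, dT, H, rho=1000.0, g=9.81, alpha=5e-4, mu=3e-4, kappa=1e-6):
    """Porous-medium Rayleigh number Ra = rho*g*alpha*k*H*dT / (mu*kappa).

    k: permeability (m^2); dT: temperature difference across the layer (K);
    H: layer thickness (m). The defaults are assumed, illustrative values for
    fluid density, gravity, thermal expansivity, viscosity and diffusivity.
    """
    return rho * g * alpha * k * H * dT / (mu * kappa)


def min_permeability(dT, H, **fluid):
    """Smallest permeability (m^2) at which Ra reaches the critical value."""
    # Ra is linear in k, so divide the critical value by Ra per unit permeability.
    return RA_CRITICAL / rayleigh_number(1.0, dT, H, **fluid)
```

For instance, `min_permeability(dT=28.0, H=2000.0)` corresponds to the 14 mK m⁻¹ gradient from the abstract applied across a hypothetical 2 km thick layer. Because Ra scales linearly with permeability, a thicker layer or a larger temperature difference lowers the permeability needed for convection.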

  2. Acquired mutations associated with ibrutinib resistance in Waldenström macroglobulinemia.

    PubMed

    Xu, Lian; Tsakmaklis, Nicholas; Yang, Guang; Chen, Jiaji G; Liu, Xia; Demos, Maria; Kofides, Amanda; Patterson, Christopher J; Meid, Kirsten; Gustine, Joshua; Dubeau, Toni; Palomba, M Lia; Advani, Ranjana; Castillo, Jorge J; Furman, Richard R; Hunter, Zachary R; Treon, Steven P

    2017-05-04

    Ibrutinib produces high response rates and durable remissions in Waldenström macroglobulinemia (WM) that are impacted by MYD88 and CXCR4 WHIM mutations. Disease progression can develop on ibrutinib, although the molecular basis remains to be clarified. We sequenced sorted CD19+ lymphoplasmacytic cells from 6 WM patients who progressed after achieving major responses on ibrutinib using Sanger, TA cloning and sequencing, and highly sensitive and allele-specific polymerase chain reaction (AS-PCR) assays that we developed for Bruton tyrosine kinase (BTK) mutations. AS-PCR assays were used to screen patients with and without progressive disease on ibrutinib, and ibrutinib-naïve disease. Targeted next-generation sequencing was used to validate AS-PCR findings, assess for other BTK mutations, and other targets in B-cell receptor and MYD88 signaling. Among the 6 progressing patients, 3 had BTK Cys481 variants that included BTK Cys481Ser(c.1635G>C and c.1634T>A) and BTK Cys481Arg(c.1634T>C). Two of these patients had multiple BTK mutations. Screening of 38 additional patients on ibrutinib without clinical progression identified BTK Cys481 mutations in 2 (5.1%) individuals, both of whom subsequently progressed. BTK Cys481 mutations were not detected in baseline samples or in 100 ibrutinib-naive WM patients. Using mutated MYD88 as a tumor marker, BTK Cys481 mutations were subclonal, with a highly variable clonal distribution. Targeted deep-sequencing confirmed AS-PCR findings, and identified an additional BTK Cys481Tyr(c.1634G>A) mutation in the 2 patients with multiple other BTK Cys481 mutations, as well as CARD11 Leu878Phe(c.2632C>T) and PLCγ2 Tyr495His(c.1483T>C) mutations. Four of the 5 patients with BTK C481 variants were CXCR4 mutated. BTK Cys481 mutations are common in WM patients with clinical progression on ibrutinib, and are associated with mutated CXCR4. © 2017 by The American Society of Hematology.

  3. De Novo Deep Transcriptome Analysis of Medicinal Plants for Gene Discovery in Biosynthesis of Plant Natural Products.

    PubMed

    Han, R; Rai, A; Nakamura, M; Suzuki, H; Takahashi, H; Yamazaki, M; Saito, K

    2016-01-01

    The study of the transcriptome, the entire pool of transcripts in an organism or in single cells at a certain physiological or pathological stage, is indispensable in unraveling the connection and regulation between DNA and protein. Before the advent of deep sequencing, the microarray was the main approach for handling transcripts. Despite obvious shortcomings, including limited dynamic range and difficulty in comparing results from distinct experiments, microarrays were widely applied. During the past decade, next-generation sequencing (NGS) has revolutionized our understanding of genomics in a fast, high-throughput, cost-effective, and tractable manner. The adoption of NGS has profoundly improved the efficiency and outcomes of efforts to elucidate the genes responsible for producing active compounds in medicinal plants. The whole process runs from plant material sampling, to cDNA library preparation, to deep sequencing, after which bioinformatics takes over to assemble enormous yet fragmentary data from which to comb and extract information. The unprecedentedly rapid development of such technologies provides many choices to facilitate the task, which can cause confusion when choosing a suitable methodology for a specific purpose. Here, we review the general approaches for deep transcriptome analysis and then focus on their application in discovering biosynthetic pathways of medicinal plants that produce important secondary metabolites. © 2016 Elsevier Inc. All rights reserved.

  4. Dendritic cells are early cellular targets of Listeria monocytogenes after intestinal delivery and are involved in bacterial spread in the host.

    PubMed

    Pron, B; Boumaila, C; Jaubert, F; Berche, P; Milon, G; Geissmann, F; Gaillard, J L

    2001-05-01

    We studied the sequence of cellular events leading to the dissemination of Listeria monocytogenes from the gut to draining mesenteric lymph nodes (MLNs) by confocal microscopy of immunostained tissue sections from a rat ligated ileal loop system. OX-62-positive cells beneath the epithelial lining of Peyer's patches (PPs) were the first Listeria targets identified after intestinal inoculation. These cells had other features typical of dendritic cells (DCs): they were large, pleiomorphic and major histocompatibility complex class II(hi). Listeria were detected by microscopy in draining MLNs as early as 6 h after inoculation. Some 80-90% of bacteria were located in the deep paracortical regions, and 100% of the bacteria were present in OX-62-positive cells. Most infected cells contained more than five bacteria each, suggesting that they had arrived already loaded with bacteria. At later stages, the bacteria in these areas were mostly present in ED1-positive mononuclear phagocytes. These cells were also infected by an actA mutant defective in cell-to-cell spreading. This suggests that Listeria are transported by DCs from PPs to the deep paracortical regions of draining MLNs and are then transmitted to other cell populations by mechanisms independent of ActA. Another pathway of dissemination to MLNs was identified, probably involving free Listeria and leading to the infection of ED3-positive mononuclear phagocytes in the subcapsular sinus and adjacent paracortical areas. This study provides evidence that DCs are major cellular targets of L. monocytogenes in PPs and that DCs may be involved in the early dissemination of this pathogen. DCs were not sites of active bacterial replication, making these cells ideal vectors of infection.

  5. Deep-targeted exon sequencing reveals renal polymorphisms associate with postexercise hypotension among African Americans.

    PubMed

    Pescatello, Linda S; Schifano, Elizabeth D; Ash, Garrett I; Panza, Gregory A; Lamberti, Lauren; Chen, Ming-Hui; Deshpande, Ved; Zaleski, Amanda; Farinatti, Paulo; Taylor, Beth A; Thompson, Paul D

    2016-10-01

    We found variants from the Angiotensin-Converting Enzyme (ACE), Angiotensin Type 1 Receptor (AGTR1), Aldosterone Synthase (CYP11B2), and Adducin (ADD1) genes exhibited intensity-dependent associations with the ambulatory blood pressure (BP) response following acute exercise, or postexercise hypotension (PEH). In a validation cohort, we sequenced exons from these genes for their associations with PEH. Obese (30.9 ± 3.6 kg m⁻²) adults (n = 23; 61% African Americans [AF], 39% Caucasian) 42.0 ± 9.8 years with hypertension (139.8 ± 10.4/84.6 ± 6.2 mmHg) completed three random experiments: bouts of vigorous and moderate intensity cycling and control. Subjects wore an ambulatory BP monitor for 19 h. We performed deep-targeted exon sequencing using the Illumina TruSeq Custom Amplicon kit. Variant genotypes were coded as number of minor alleles (#MA) and selected for further statistical analysis based upon Bonferroni or Benjamini-Yekutieli multiple testing corrected p-values under time adjusted linear models for 19 hourly BP measurements per subject. After vigorous intensity over 19 h among ACE, AGTR1, CYP11B2, and ADD1 variants passing multiple testing thresholds, as the #MA increased, systolic (SBP) and/or diastolic BP decreased 12 mmHg (P = 4.5E-05) to 30 mmHg (P = 6.4E-04) among AF only. In contrast, after moderate intensity over 19 h among ACE and CYP11B2 variants passing multiple testing thresholds, as the #MA increased, SBP increased 21 mmHg (P = 8.0E-04) to 22 mmHg (P = 8.2E-04) among AF only. In this replication study, ACE, AGTR1, CYP11B2, and ADD1 variants exhibited associations with PEH after vigorous, but not moderate intensity exercise among AF only. Renal variants should be explored further with a multi-level "omics" approach for associations with PEH among a large, ethnically diverse sample of adults with hypertension. © 2016 The Authors. Physiological Reports published by Wiley Periodicals, Inc. on behalf of the American Physiological Society and The Physiological Society.
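The two multiple-testing corrections named in this abstract can be sketched in a few lines. This is a generic illustration of the Bonferroni and Benjamini-Yekutieli procedures, not the authors' analysis pipeline; the function names and the p-values in the usage note are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each p-value at or below alpha / m (family-wise control)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_yekutieli_reject(p_values, alpha=0.05):
    """Step-up FDR control that remains valid under arbitrary dependence."""
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at or below k * alpha / (m * c_m).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject
```

With the hypothetical p-values `[0.001, 0.008, 0.012, 0.03, 0.2]` and `alpha=0.05`, Bonferroni rejects the first two tests, while Benjamini-Yekutieli also rejects the third because it controls the false discovery rate rather than the family-wise error rate.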

  6. Discovery and characterization of miRNA genes in atlantic salmon (Salmo salar) by use of a deep sequencing approach

    PubMed Central

    2013-01-01

    Background: MicroRNAs (miRNAs) are an abundant class of endogenous small RNA molecules that downregulate gene expression at the posttranscriptional level. They play important roles in multiple biological processes by regulating genes that control developmental timing, growth, stem cell division and apoptosis by binding to the mRNA of target genes. Despite the position Atlantic salmon (Salmo salar) has as an economically important domesticated animal, there has been little research on miRNAs in this species. Knowledge about miRNAs and their target genes may be used to control health and to improve performance of economically important traits. However, before their biological function can be unravelled they must be identified and annotated. The aims of this study were to identify and characterize miRNA genes in Atlantic salmon by deep sequencing analysis of small RNA libraries from nine different tissues. Results: A total of 180 distinct mature miRNAs belonging to 106 families of evolutionarily conserved miRNAs, and 13 distinct novel mature miRNAs were discovered and characterized. The mature miRNAs corresponded to 521 putative precursor sequences located at unique genome locations. About 40% of these precursors were part of gene clusters, and the majority of the Salmo salar gene clusters discovered were conserved across species. Comparison of expression levels in samples from different tissues applying DESeq indicated that there were tissue specific expression differences in three conserved and one novel miRNA. Ssa-miR 736 was detected in heart tissue only, while two other clustered miRNAs (ssa-miR 212 and 132) seem to be at a higher expression level in brain tissue. These observations correlate well with their expected functions as regulators of signal pathways in cardiac and neuronal cells, respectively. Ssa-miR 8163 is one of the novel miRNAs discovered and its function remains unknown. However, differential expression analysis using DESeq suggests that this miRNA is enriched in liver tissue and the precursor was mapped to intron 7 of the transferrin gene. Conclusions: The identification and annotation of evolutionarily conserved and novel Salmo salar miRNAs as well as the characterization of miRNA gene clusters provide biological knowledge that will greatly facilitate further functional studies on miRNAs in this species. PMID:23865519

  7. A map of human genome variation from population-scale sequencing.

    PubMed

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

  8. Selective Phylogenetic Analysis Targeted at 16S rRNA Genes of Thermophiles and Hyperthermophiles in Deep-Subsurface Geothermal Environments

    PubMed Central

    Kimura, Hiroyuki; Sugihara, Maki; Kato, Kenji; Hanada, Satoshi

    2006-01-01

    Deep-subsurface samples obtained by deep drilling are likely to be contaminated with mesophilic microorganisms in the drilling fluid, and this could affect determination of the community structure of the geothermal microflora using 16S rRNA gene clone library analysis. To eliminate possible contamination by PCR-amplified 16S rRNA genes from mesophiles, a combined thermal denaturation and enzyme digestion method, based on a strong correlation between the G+C content of the 16S rRNA gene and the optimum growth temperatures of most known prokaryotic cultures, was used prior to clone library construction. To validate this technique, hot spring fluid (76°C) and river water (14°C) were used to mimic a deep-subsurface sample contaminated with drilling fluid. After DNA extraction and PCR amplification of the 16S rRNA genes from individual samples separately, the amplified products from river water were observed to be denatured at 82°C and completely digested by exonuclease I (Exo I), while the amplified products from hot spring fluid remained intact after denaturation at 84°C and enzyme digestion with Exo I. DNAs extracted from the two samples were mixed and used as a template for amplification of the 16S rRNA genes. The amplified rRNA genes were denatured at 84°C and digested with Exo I before clone library construction. The results indicated that the 16S rRNA gene sequences from the river water were almost completely eliminated, whereas those from the hot spring fluid remained. PMID:16391020
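The selection step in this abstract rests on the G+C content of the 16S rRNA gene, since amplicons richer in G+C denature at higher temperatures. A minimal sketch of computing that fraction for a sequence is given below; the function name `gc_content` and the example sequences in the usage note are hypothetical and are not taken from the study.

```python
def gc_content(sequence):
    """Fraction of G and C bases among unambiguous A/C/G/T(U) positions."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "ATU")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    # Ambiguity codes such as N are ignored in both numerator and denominator.
    return gc / total
```

For example, `gc_content("ATGC")` gives 0.5 and `gc_content("GGCC")` gives 1.0. Comparing this fraction between amplicons from a hot spring and from river water would be one way to illustrate why a denaturation temperature can separate the two pools.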

  9. Deep sequencing of the viral phoH gene reveals temporal variation, depth-specific composition, and persistent dominance of the same viral phoH genes in the Sargasso Sea

    PubMed Central

    Goldsmith, Dawn B.; Parsons, Rachel J.; Beyene, Damitu; Salamon, Peter

    2015-01-01

    Deep sequencing of the viral phoH gene, a host-derived auxiliary metabolic gene, was used to track viral diversity throughout the water column at the Bermuda Atlantic Time-series Study (BATS) site in the summer (September) and winter (March) of three years. Viral phoH sequences reveal differences in the viral communities throughout a depth profile and between seasons in the same year. Variation was also detected between the same seasons in subsequent years, though these differences were not as great as the summer/winter distinctions. Over 3,600 phoH operational taxonomic units (OTUs; 97% sequence identity) were identified. Despite high richness, most phoH sequences belong to a few large, common OTUs whereas the majority of the OTUs are small and rare. While many OTUs make sporadic appearances at just a few times or depths, a small number of OTUs dominate the community throughout the seasons, depths, and years. PMID:26157645

  10. Comparison of magnetic resonance imaging sequences for depicting the subthalamic nucleus for deep brain stimulation.

    PubMed

    Nagahama, Hiroshi; Suzuki, Kengo; Shonai, Takaharu; Aratani, Kazuki; Sakurai, Yuuki; Nakamura, Manami; Sakata, Motomichi

    2015-01-01

    Electrodes are surgically implanted into the subthalamic nucleus (STN) of Parkinson's disease patients to provide deep brain stimulation. For ensuring correct positioning, the anatomic location of the STN must be determined preoperatively. Magnetic resonance imaging has been used for pinpointing the location of the STN. To identify the optimal imaging sequence for identifying the STN, we compared images produced with T2 star-weighted angiography (SWAN), gradient echo T2*-weighted imaging, and fast spin echo T2-weighted imaging in 6 healthy volunteers. Our comparison involved measurement of the contrast-to-noise ratio (CNR) for the STN and substantia nigra and a radiologist's interpretations of the images. Of the sequences examined, the CNR and qualitative scores were significantly higher on SWAN images than on other images (p < 0.01) for STN visualization. The kappa value on SWAN images (0.74) was the highest among the three sequences for visualizing the STN. SWAN is the sequence best suited for identifying the STN at the present time.
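The contrast-to-noise ratio used in this comparison can be illustrated with one common definition: the absolute difference between the mean signals of two regions of interest, divided by the standard deviation of a background region. The abstract does not state the exact formula the authors used, so the definition, the function name, and the argument names below are assumptions.

```python
import statistics


def contrast_to_noise(roi_a, roi_b, background):
    """CNR = |mean(A) - mean(B)| / sample standard deviation of the background."""
    noise_sd = statistics.stdev(background)
    if noise_sd == 0:
        raise ValueError("background region has zero variance")
    return abs(statistics.mean(roi_a) - statistics.mean(roi_b)) / noise_sd
```

Here `roi_a` and `roi_b` would hold pixel intensities sampled from the two structures being compared, and `background` would hold intensities from a signal-free region. A higher value means the two structures are easier to tell apart relative to image noise.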

  11. Pure Perceptual-Based Sequence Learning: A Role for Visuospatial Attention

    ERIC Educational Resources Information Center

    Remillard, Gilbert

    2009-01-01

    Learning the structure of a sequence of target locations when target location is not the response dimension and the sequence of target locations is uncorrelated with the sequence of responses is called pure perceptual-based sequence learning. The paradigm introduced by G. Remillard (2003) was used to determine whether orienting of visuospatial…

  12. MicroRNA and Transcription Factor: Key Players in Plant Regulatory Network

    PubMed Central

    Samad, Abdul F. A.; Sajad, Muhammad; Nazaruddin, Nazaruddin; Fauzi, Izzat A.; Murad, Abdul M. A.; Zainal, Zamri; Ismail, Ismanizan

    2017-01-01

    Recent achievements in plant microRNA (miRNA), a large class of small and non-coding RNAs, are very exciting. A wide array of techniques involving forward genetics, molecular cloning, bioinformatic analysis, and the latest technology, deep sequencing, have greatly advanced miRNA discovery. A tiny miRNA sequence has the ability to target single or multiple mRNAs. Most of the miRNA targets are transcription factors (TFs), which have paramount importance in regulating plant growth and development. Various families of TFs, which regulate a range of regulatory networks, may assist plants to grow under normal and stress environmental conditions. This review focuses on the regulatory relationships between miRNAs and different families of TFs such as NF-Y, MYB, AP2, TCP, WRKY, NAC, GRF, and SPL. For instance, NF-Y play an important role in drought tolerance and flower development, MYB are involved in signal transduction and biosynthesis of secondary metabolites, AP2 regulate floral development and nodule formation, and TCP direct leaf development and growth hormone signaling. WRKY have known roles in multiple stress tolerances, NAC regulate lateral root formation, GRF are involved in root growth and in flower and seed development, and SPL regulate the plant transition from juvenile to adult. We also studied the relation between miRNAs and TFs by consolidating the research findings from different plant species, which will help plant scientists in understanding the mechanism of action and interaction between these regulators in plant growth and development under normal and stress environmental conditions. PMID:28446918

  13. Characterization and differential expression of microRNAs elicited by sulfur deprivation in Chlamydomonas reinhardtii

    PubMed Central

    2012-01-01

    Background microRNAs (miRNAs) have been found to play an essential role in the modulation of numerous biological processes in eukaryotes. Chlamydomonas reinhardtii is an ideal model organism for the study of many metabolic processes including responses to sulfur-deprivation. We used a deep sequencing platform to extensively profile and identify changes in miRNA expression that occurred under sulfur-replete and sulfur-deprived conditions. The aim of our research was to characterize the differential expression of Chlamydomonas miRNAs under sulfur-deprived conditions, and subsequently to predict and analyze the target genes of miRNAs involved in sulfur-deprivation. Results By using high-throughput sequencing, we characterized the microRNA transcriptomes under sulfur-replete and sulfur-deprived conditions in Chlamydomonas reinhardtii. We predicted a total of 310 miRNAs, which included 85 known miRNAs and 225 novel miRNAs. Thirteen miRNAs were specific to the sulfur-deprived conditions. Forty-seven miRNAs showed significant differential expression in response to sulfur-deprivation, and most were up-regulated in the small RNA libraries with sulfur-deprivation. Using a web-based integrated system (Web MicroRNAs Designer 3) and combining the earlier information from a transcriptome of Chlamydomonas reinhardtii, 22 miRNAs and their targets involved in metabolism regulation with sulfur-deprivation were verified. Conclusions Our results indicate that sulfur-deprivation may have a significant influence on small RNA expression patterns, and the differential expression of miRNAs and interactions between miRNAs and their targets might further reveal the molecular mechanism responding to sulfur-deprivation in Chlamydomonas reinhardtii. PMID:22439676

  14. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh].

    PubMed

    Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K

    2011-01-20

    Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
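
    The SSR identification step described in this record (perfect repeats of di- to hexanucleotide motifs, at least 18 bp long, homopolymers excluded) can be illustrated with a minimal sketch. The regex-based scan and the names `find_ssrs` and `is_primitive` are assumptions made for illustration; the abstract does not say which tool the authors used, and this sketch does not detect or exclude compound repeats.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit
    (rejects homopolymers such as 'AAA' and disguised motifs such as 'ATAT')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k))

def find_ssrs(seq, min_len=18):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs
    with 2-6 nt motifs spanning at least min_len bases (hypothetical helper)."""
    seq = seq.upper()
    hits = []
    for k in range(2, 7):
        min_repeats = -(-min_len // k)  # ceiling division
        # One captured motif followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy contig: a dinucleotide SSR, a homopolymer run, and a trinucleotide SSR.
contig = "GGC" + "AT" * 9 + "CCG" + "A" * 20 + "GCT" + "AGC" * 6 + "T"
ssrs = find_ssrs(contig)
```

    Scanning each motif length separately and requiring the minimum repeat count inside the pattern keeps short accidental repeats from masking a longer locus; the primitive-motif check then prevents a dinucleotide repeat from being reported again as a tetra- or hexanucleotide one.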

  15. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

    PubMed Central

    2011-01-01

    Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263

  16. Accurate and exact CNV identification from targeted high-throughput sequence data.

    PubMed

    Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

    2011-04-12

    Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
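
    The first stage of the method in this record, scanning for gains and losses by comparing normalized coverage between samples, can be sketched roughly as follows. This is only an illustration under stated assumptions: normalizing by each sample's total depth, using the cross-sample median as the reference, and the 1.4/0.6 ratio thresholds are all hypothetical choices, and the breakpoint-spanning read confirmation described in the abstract is not shown.

```python
from statistics import median

def normalized_coverage(depths):
    """Scale per-target read depths so they sum to 1 within a sample."""
    total = sum(depths)
    return [d / total for d in depths]

def call_cnv_candidates(samples, gain=1.4, loss=0.6):
    """samples maps sample name -> per-target depths (same target order).
    Returns (sample, target_index, 'gain' | 'loss', ratio) tuples.
    Thresholds are illustrative assumptions, not values from the paper."""
    norm = {name: normalized_coverage(d) for name, d in samples.items()}
    n_targets = len(next(iter(norm.values())))
    calls = []
    for i in range(n_targets):
        ref = median(v[i] for v in norm.values())  # cross-sample reference
        if ref == 0:
            continue  # no usable coverage at this target
        for name, v in norm.items():
            ratio = v[i] / ref
            if ratio >= gain:
                calls.append((name, i, "gain", round(ratio, 2)))
            elif ratio <= loss:
                calls.append((name, i, "loss", round(ratio, 2)))
    return calls

# Toy data: sample s4 has half the depth at target 2 (a candidate deletion).
depths = {
    "s1": [100, 100, 100, 100, 100],
    "s2": [100, 100, 100, 100, 100],
    "s3": [100, 100, 100, 100, 100],
    "s4": [100, 100, 50, 100, 100],
}
calls = call_cnv_candidates(depths)
```

    Taking the median across samples as the reference assumes that most samples are copy-neutral at any given target, which is plausible for a multiplexed screening batch but would need checking on real data.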

  17. Looking beyond the exome: a phenotype-first approach to molecular diagnostic resolution in rare and undiagnosed diseases

    PubMed Central

    Pena, Loren DM; Jiang, Yong-Hui; Schoch, Kelly; Spillmann, Rebecca C.; Walley, Nicole; Stong, Nicholas; Horn, Sarah Rapisardo; Sullivan, Jennifer A.; McConkie-Rosell, Allyn; Kansagra, Sujay; Smith, Edward C.; El-Dairi, Mays; Bellet, Jane; Ann Keels, Martha; Jasien, Joan; Kranz, Peter G.; Noel, Richard; Nagaraj, Shashi K.; Lark, Robert K.; Wechsler, Daniel SG; del Gaudio, Daniela; Leung, Marco L.; Hendon, Laura G.; Parker, Collette C.; Jones, Kelly L.; Goldstein, David B.; Shashi, Vandana

    2017-01-01

    Purpose To describe examples of missed pathogenic variants on whole exome sequencing (WES) and the importance of deep phenotyping for further diagnostic testing. Methods Guided by phenotypic information, three children with negative WES underwent targeted single gene testing. Results Individual 1 had a clinical diagnosis consistent with infantile systemic hyalinosis, although WES and an NGS-based ANTXR2 test were negative. Sanger sequencing of ANTXR2 revealed a homozygous single base pair insertion, previously missed by the WES variant caller software. Individual 2 had neurodevelopmental regression and cerebellar atrophy, with no diagnosis on WES. New clinical findings prompted Sanger sequencing and copy number testing of PLA2G6. A novel homozygous deletion of the non-coding exon 1 (not included in the WES capture kit) was detected, with extension into the promoter, confirming the clinical suspicion of infantile neuroaxonal dystrophy. Individual 3 had progressive ataxia, spasticity and MRI changes of vanishing white matter leukoencephalopathy. An NGS leukodystrophy gene panel and WES showed a heterozygous pathogenic variant in EIF2B5; no deletions/duplications were detected. Sanger sequencing of EIF2B5 showed a frameshift indel, likely missed due to failure of alignment. Conclusions These cases illustrate potential pitfalls of WES/NGS testing, and the importance of phenotype-guided molecular testing in yielding diagnoses. PMID:28914269

  18. Who Benefits from a Low versus High Guidance CSCL Script and Why?

    ERIC Educational Resources Information Center

    Mende, Stephan; Proske, Antje; Körndle, Hermann; Narciss, Susanne

    2017-01-01

    Computer-supported collaborative learning (CSCL) scripts can foster learners' deep text comprehension. However, this depends on (a) the extent to which the learning activities targeted by a script promote deep text comprehension and (b) whether the guidance level provided by the script is adequate to induce the targeted learning activities…

  19. Culturable prokaryotic diversity of deep, gas hydrate sediments: first use of a continuous high-pressure, anaerobic, enrichment and isolation system for subseafloor sediments (DeepIsoBUG)

    PubMed Central

    Parkes, R John; Sellek, Gerard; Webster, Gordon; Martin, Derek; Anders, Erik; Weightman, Andrew J; Sass, Henrik

    2009-01-01

    Deep subseafloor sediments may contain depressurization-sensitive, anaerobic, piezophilic prokaryotes. To test this we developed the DeepIsoBUG system, which when coupled with the HYACINTH pressure-retaining drilling and core storage system and the PRESS core cutting and processing system, enables deep sediments to be handled without depressurization (up to 25 MPa) and anaerobic prokaryotic enrichments and isolation to be conducted up to 100 MPa. Here, we describe the system and its first use with subsurface gas hydrate sediments from the Indian Continental Shelf, Cascadia Margin and Gulf of Mexico. Generally, highest cell concentrations in enrichments occurred close to in situ pressures (14 MPa) in a variety of media, although growth continued up to at least 80 MPa. Predominant sequences in enrichments were Carnobacterium, Clostridium, Marinilactibacillus and Pseudomonas, plus Acetobacterium and Bacteroidetes in Indian samples, largely independent of media and pressures. Related 16S rRNA gene sequences for all of these Bacteria have been detected in deep, subsurface environments, although isolated strains were piezotolerant, being able to grow at atmospheric pressure. Only the Clostridium and Acetobacterium were obligate anaerobes. No Archaea were enriched. It may be that these sediment samples were not deep enough (total depth 1126–1527 m) to obtain obligate piezophiles. PMID:19694787

  20. Evolution of coding and non-coding genes in HOX clusters of a marsupial.

    PubMed

    Yu, Hongshi; Lindsay, James; Feng, Zhi-Ping; Frankenberg, Stephen; Hu, Yanqiu; Carone, Dawn; Shaw, Geoff; Pask, Andrew J; O'Neill, Rachel; Papenfuss, Anthony T; Renfree, Marilyn B

    2012-06-18

    The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.

  1. Evolution of coding and non-coding genes in HOX clusters of a marsupial

    PubMed Central

    2012-01-01

    Background The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial. PMID:22708672

  2. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    PubMed Central

    2013-01-01

    Background Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  3. Comparative analysis of chimeric ZFP-, TALE- and Cas9-piggyBac transposases for integration into a single locus in human cells.

    PubMed

    Luo, Wentian; Galvan, Daniel L; Woodard, Lauren E; Dorset, Dan; Levy, Shawn; Wilson, Matthew H

    2017-08-21

    Integrating DNA delivery systems hold promise for many applications including treatment of diseases; however, targeted integration is needed for improved safety. The piggyBac (PB) transposon system is a highly active non-viral gene delivery system capable of integrating defined DNA segments into host chromosomes without requiring homologous recombination. We systematically compared four different engineered zinc finger proteins (ZFP), four transcription activator-like effector proteins (TALE), CRISPR associated protein 9 (SpCas9) and the catalytically inactive dSpCas9 protein fused to the amino-terminus of the transposase enzyme designed to target the hypoxanthine phosphoribosyltransferase (HPRT) gene located on human chromosome X. Chimeric transposases were evaluated for expression, transposition activity, chromatin immunoprecipitation at the target loci, and targeted knockout of the HPRT gene in human cells. One ZFP-PB and one TALE-PB chimera demonstrated notable HPRT gene targeting. In contrast, Cas9/dCas9-PB chimeras did not result in gene targeting. Instead, the HPRT locus appeared to be protected from transposon integration. Supplied separately, PB permitted highly efficient isolation of Cas9-mediated knockout of HPRT, with zero transposon integrations in HPRT by deep sequencing. In summary, these tools may allow isolation of 'targeted-only' cells, be utilized to protect a genomic locus from transposon integration, and enrich for Cas9-mutated cells. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  4. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
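
    Two of the hallmarks listed in this record (reproducible presence across biological replicates, and filtering against an unselected, amplified library) lend themselves to a small sketch. Everything concrete here is an assumption for illustration: the function names, the fivefold enrichment cutoff, the pseudo-frequency, and the placeholder peptide labels are hypothetical, and motif discovery is not shown.

```python
def frequencies(counts):
    """Convert raw read counts per sequence into within-library frequencies."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}

def candidate_binders(replicates, amplified, min_enrichment=5.0, pseudo=1e-6):
    """Keep sequences seen in every selected replicate whose mean frequency
    exceeds their frequency in the unselected amplified library by at least
    min_enrichment (an illustrative cutoff, not one from the paper)."""
    rep_freqs = [frequencies(r) for r in replicates]
    amp_freq = frequencies(amplified)
    shared = set(rep_freqs[0]).intersection(*rep_freqs[1:])
    keep = []
    for seq in shared:
        mean_freq = sum(f[seq] for f in rep_freqs) / len(rep_freqs)
        enrichment = mean_freq / (amp_freq.get(seq, 0.0) + pseudo)
        if enrichment >= min_enrichment:
            keep.append((seq, enrichment))
    return sorted(keep, key=lambda item: -item[1])

# Placeholder labels stand in for peptide sequences.
rep1 = {"PEPA": 60, "PEPB": 30, "PEPC": 10}
rep2 = {"PEPA": 50, "PEPB": 45, "PEPD": 5}
unselected_amplified = {"PEPB": 40, "PEPA": 1, "PEPE": 59}
binders = candidate_binders([rep1, rep2], unselected_amplified)
```

    In this toy input, PEPB is abundant after selection but equally abundant in the amplified, unselected library, so it is treated as a likely parasitic sequence and dropped, while PEPC and PEPD fail the replicate check.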

  5. Transposable element-associated microRNA hairpins produce 21-nt sRNAs integrated into typical microRNA pathways in rice

    PubMed Central

    Ou-Yang, Fangqian; Luo, Qing-Jun; Zhang, Yue; Richardson, Casey R.; Jiang, Yingwen; Rock, Christopher D.

    2013-01-01

    microRNAs (miRNAs) are a class of small RNAs (sRNAs) of ~21 nucleotides (nt) in length processed from foldback hairpins by DICER-LIKE1 (DCL1) or DCL4. They regulate the expression of target mRNAs by base pairing through RNA-Induced Silencing Complex (RISC). In the RISC, ARGONAUTE1 (AGO1) is the key protein that cleaves miRNA targets at position ten of a miRNA:target duplex. The authenticity of many annotated rice miRNA hairpins is under debate because of their homology to repeat sequences. Some of them, like miR1884b, have been removed from the current release of miRBase based on incomplete information. In this study, we investigated the association of transposable element (TE)-derived miRNAs with typical miRNA pathways (DCL1/4- and AGO1-dependent) using publicly available deep sequencing datasets. Seven miRNA hairpins with 13 unique sRNAs were specifically enriched in AGO1 immunoprecipitation samples and relatively reduced in DCL1/4 knockdown genotypes. Interestingly, these species are ~21-nt long, instead of 24-nt as annotated in miRBase and the literature. Their expression profiles meet current criteria for functional annotation of miRNAs. In addition, diagnostic cleavage tags were found in degradome datasets for predicted target mRNAs. Most of these miRNA hairpins share significant homology with miniature inverted-repeat transposable elements (MITEs), one type of abundant DNA transposons in rice. Finally, the root-specific production of a 24 nt miRNA-like sRNA was confirmed by RNA blot for a novel EST that maps to the 3'-UTR of a candidate pseudogene showing extensive sequence homology to miR1884b hairpin. Our data are consistent with the hypothesis that TEs can serve as a driving force for the evolution of some MIRNAs, where co-opting of DICER-LIKE1/4 processing and integration into AGO1 could exapt transcribed TE-associated hairpins into typical miRNA pathways. PMID:23420033

  6. Optical Communications Channel Combiner

    NASA Technical Reports Server (NTRS)

    Quirk, Kevin J.; Nguyen, Danh H.; Nguyen, Huy

    2012-01-01

    NASA has identified deep-space optical communications links as an integral part of a unified space communication network in order to provide data rates in excess of 100 Mb/s. The distances and limited power inherent in a deep-space optical downlink necessitate the use of photon-counting detectors and a power-efficient modulation such as pulse position modulation (PPM). For the output of each photodetector, whether from a separate telescope or a portion of the detection area, a communication receiver estimates a log-likelihood ratio (LLR) for each PPM slot. To realize the full effective aperture of these receivers, their outputs must be combined prior to information decoding. A channel combiner was developed to synchronize the LLR sequences of up to three receivers and then combine them into a single LLR sequence for information decoding. The channel combiner has three channel inputs, each of which takes as input a sequence of four-bit LLRs for each PPM slot in a codeword via a XAUI 10 Gb/s quad optical fiber interface. The cross-correlations between the channels' LLR time series are calculated and used to synchronize the sequences prior to combining. The output of the channel combiner is a sequence of four-bit LLRs for each PPM slot in a codeword via a XAUI 10 Gb/s quad optical fiber interface. The unit is controlled through a 1 Gb/s Ethernet UDP/IP interface. A deep-space optical communication link has not yet been demonstrated. This ground-station channel combiner was developed to demonstrate this capability and is unique in its ability to process such a signal.
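
    The synchronize-then-combine idea in this record can be sketched in a few lines. This is an illustration under assumptions, not the hardware design: the lag search by brute-force cross-correlation, the `max_lag` window, and the function names are hypothetical, and the combining step simply sums the aligned LLRs, which is the standard rule when the receivers' observations are conditionally independent. Four-bit quantization and the XAUI interface are not modeled.

```python
def best_lag(ref, other, max_lag):
    """Offset (in slots) into `other` that best aligns it with `ref`,
    chosen by maximizing the cross-correlation over the overlap."""
    def corr(lag):
        return sum(ref[i] * other[i + lag]
                   for i in range(len(ref))
                   if 0 <= i + lag < len(other))
    return max(range(-max_lag, max_lag + 1), key=corr)

def combine(channels, max_lag=8):
    """Align every channel to the first one, then add the LLRs slot by slot.
    Returns (lags, combined_llrs); slots with no overlap contribute nothing."""
    ref = channels[0]
    lags = [0] + [best_lag(ref, ch, max_lag) for ch in channels[1:]]
    combined = []
    for i in range(len(ref)):
        total = 0
        for ch, lag in zip(channels, lags):
            j = i + lag
            if 0 <= j < len(ch):
                total += ch[j]
        combined.append(total)
    return lags, combined

# Toy LLR streams: the second receiver sees the same slots two slots late.
rx_a = [1, -2, 3, -1, 4, -3, 2, -4, 1, 2]
rx_b = [0, 0] + rx_a[:-2]
lags, llrs = combine([rx_a, rx_b], max_lag=3)
```

    After alignment, the slots covered by both receivers carry twice the single-receiver LLR in this noiseless toy case, which is the sense in which combining recovers the full effective aperture.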

  7. The DEEP2 Galaxy Redshift Survey: Design, Observations, Data Reduction, and Redshifts

    NASA Technical Reports Server (NTRS)

    Newman, Jeffrey A.; Cooper, Michael C.; Davis, Marc; Faber, S. M.; Coil, Alison L; Guhathakurta, Puraga; Koo, David C.; Phillips, Andrew C.; Conroy, Charlie; Dutton, Aaron A.

    2013-01-01

    We describe the design and data analysis of the DEEP2 Galaxy Redshift Survey, the densest and largest high-precision redshift survey of galaxies at z approx. 1 completed to date. The survey was designed to conduct a comprehensive census of massive galaxies, their properties, environments, and large-scale structure down to absolute magnitude MB = -20 at z approx. 1 via approx.90 nights of observation on the Keck telescope. The survey covers an area of 2.8 Sq. deg divided into four separate fields observed to a limiting apparent magnitude of R(sub AB) = 24.1. Objects with z approx. < 0.7 are readily identifiable using BRI photometry and rejected in three of the four DEEP2 fields, allowing galaxies with z > 0.7 to be targeted approx. 2.5 times more efficiently than in a purely magnitude-limited sample. Approximately 60% of eligible targets are chosen for spectroscopy, yielding nearly 53,000 spectra and more than 38,000 reliable redshift measurements. Most of the targets that fail to yield secure redshifts are blue objects that lie beyond z approx. 1.45, where the [O ii] 3727 Ang. doublet lies in the infrared. The DEIMOS 1200 line mm(exp -1) grating used for the survey delivers high spectral resolution (R approx. 6000), accurate and secure redshifts, and unique internal kinematic information. Extensive ancillary data are available in the DEEP2 fields, particularly in the Extended Groth Strip, which has evolved into one of the richest multiwavelength regions on the sky. This paper is intended as a handbook for users of the DEEP2 Data Release 4, which includes all DEEP2 spectra and redshifts, as well as for the DEEP2 DEIMOS data reduction pipelines. 
Extensive details are provided on object selection, mask design, biases in target selection and redshift measurements, the spec2d two-dimensional data-reduction pipeline, the spec1d automated redshift pipeline, and the zspec visual redshift verification process, along with examples of instrumental signatures or other artifacts that in some cases remain after data reduction. Redshift errors and catastrophic failure rates are assessed through more than 2000 objects with duplicate observations. Sky subtraction is essentially photon-limited even under bright OH sky lines; we describe the strategies that permitted this, based on high image stability, accurate wavelength solutions, and powerful B-spline modeling methods. We also investigate the impact of targets that appear to be single objects in ground-based targeting imaging but prove to be composite in Hubble Space Telescope data; they constitute several percent of targets at z approx. 1, approaching approx. 5%-10% at z > 1.5. Summary data are given that demonstrate the superiority of DEEP2 over other deep high-precision redshift surveys at z approx. 1 in terms of redshift accuracy, sample number density, and amount of spectral information. We also provide an overview of the scientific highlights of the DEEP2 survey thus far.

  8. The 3-D aftershock distribution of three recent M5~5.5 earthquakes in the Anza region, California

    NASA Astrophysics Data System (ADS)

    Zhang, Q.; Wdowinski, S.; Lin, G.

    2011-12-01

    The San Jacinto fault zone (SJFZ) exhibits the highest level of seismicity of any region in southern California. On average, it produces four earthquakes per day, most of them at depths of 10-17 km. Over the past decade, increased seismic activity occurred in the Anza region, including three M5~5.5 events and their aftershock sequences. These events occurred in 2001, 2005, and 2010. In this research we map the 3-D distribution of these three events to evaluate their rupture geometry and better understand the unusual deep seismic pattern along the SJFZ, which was termed "deep creep" (Wdowinski, 2009). We relocated 97,562 events from 1981 to 2011 in the Anza region by applying the Source-Specific Station Term (SSST) method (Lin et al., 2006), using an accurate 1-D velocity model derived from the 3-D model of Lin et al. (2007). In order to separate the aftershock sequences from background seismicity, we characterized each of the three aftershock sequences using Omori's law. Preliminary results show that all three sequences had a similar geometry of deep, elongated aftershock distribution. Most aftershocks occurred at depths of 10-17 km and extended over a 70 km long segment of the SJFZ, centered at the mainshock hypocenters. A comparative study of other M5~5.5 mainshocks and their aftershock sequences in southern California reveals very different geometrical patterns, suggesting that the three Anza M5~5.5 events are unique and can be indicative of "deep creep" deformation processes. References: 1. Lin, G. and Shearer, P.M., 2006, The COMPLOC earthquake location package, Seism. Res. Lett. 77, pp. 440-444. 2. Lin, G., Shearer, P.M., Hauksson, E., and Thurber, C.H., 2007, A three-dimensional crustal seismic velocity model for southern California from a composite event method, J. Geophys. Res. 112, B12306, doi:10.1029/2007JB004977. 3. Wdowinski, S., 2009, Deep creep as a cause for the excess seismicity along the San Jacinto fault, Nat. Geosci., doi:10.1038/NGEO684.
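
    The abstract above characterizes aftershock sequences with Omori's law but does not say which form was fitted. As a minimal sketch, assuming the commonly used modified (Omori-Utsu) form n(t) = K / (t + c)^p, where p = 1 recovers the original law, the rate and the expected cumulative aftershock count can be computed as follows. The parameter values K, c, and p are hypothetical and are not taken from the study.

```python
import math


def omori_rate(t, K, c, p):
    """Aftershock rate n(t) = K / (t + c)**p at time t after the mainshock."""
    return K / (t + c) ** p


def omori_cumulative(t, K, c, p):
    """Expected number of aftershocks between time 0 and t (integral of the rate)."""
    if abs(p - 1.0) < 1e-12:
        # p = 1: the integral of K / (s + c) is logarithmic.
        return K * math.log(1.0 + t / c)
    return K * (c ** (1.0 - p) - (t + c) ** (1.0 - p)) / (p - 1.0)


# Hypothetical parameters for illustration only (time in days).
K, c, p = 100.0, 0.05, 1.1
for day in (1, 10, 100):
    rate = omori_rate(day, K, c, p)
    total = omori_cumulative(day, K, c, p)
    print(f"day {day}: rate {rate:.2f} per day, cumulative {total:.1f}")
```

    Events in excess of a fitted curve of this kind would be attributed to background seismicity; how the authors drew that line is not stated in the abstract.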

  9. Effects of hydrostatic pressure on yeasts isolated from deep-sea hydrothermal vents.

    PubMed

    Burgaud, Gaëtan; Hué, Nguyen Thi Minh; Arzur, Danielle; Coton, Monika; Perrier-Cornet, Jean-Marie; Jebbar, Mohamed; Barbier, Georges

    2015-11-01

    Hydrostatic pressure plays a significant role in the distribution of life in the biosphere. The diversity of deep-sea piezotolerant and (hyper)piezophilic bacteria and archaea has been well documented, along with their specific adaptations to cope with high hydrostatic pressure (HHP). Recent investigations of deep-sea microbial community compositions have shown unexpected micro-eukaryotic communities, mainly dominated by fungi. Molecular methods such as next-generation sequencing have been used for SSU rRNA gene sequencing to reveal fungal taxa. Currently, a difficult but fascinating challenge for marine mycologists is to create deep-sea marine fungus culture collections and assess their ability to cope with pressure. Indeed, although there is no universal genetic marker for piezoresistance, physiological analyses provide concrete relevant data for estimating their adaptations and understanding the role of fungal communities in the abyss. The present study investigated morphological and physiological responses of fungi to HHP using a collection of deep-sea yeasts as a model. The aim was to determine whether deep-sea yeasts were able to tolerate different HHP and if they were metabolically active. Here we report an unexpected taxonomy-based dichotomous response to pressure, with piezosensitive ascomycetes and piezotolerant basidiomycetes, and distinct morphological switches triggered by pressure for certain strains. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  10. Microfluidic droplet enrichment for targeted sequencing

    PubMed Central

    Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

    2015-01-01

    Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629
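
    The abstract above reports an average 31.6-fold enrichment without stating how the figure was computed. One common definition, used here purely as an assumption, divides the observed on-target read fraction by the fraction expected if reads were spread uniformly over the genome. The function name, read counts, and genome size below are hypothetical; only the 250 kb target size comes from the abstract.

```python
def fold_enrichment(on_target_reads, total_reads, target_bp, genome_bp):
    """Observed on-target read fraction divided by the fraction expected
    if reads were distributed uniformly over the genome."""
    if total_reads <= 0 or target_bp <= 0 or genome_bp <= 0:
        raise ValueError("read totals and sizes must be positive")
    observed = on_target_reads / total_reads
    expected = target_bp / genome_bp
    return observed / expected


# Hypothetical read counts; the 3.1 Gb genome size is an approximate
# human value used for illustration only.
example = fold_enrichment(
    on_target_reads=2_500,
    total_reads=1_000_000,
    target_bp=250_000,
    genome_bp=3_100_000_000,
)
print(f"fold enrichment: {example:.1f}")
```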

  11. High-Resolution Analysis of the Efficiency, Heritability, and Editing Outcomes of CRISPR/Cas9-Induced Modifications of NCED4 in Lettuce (Lactuca sativa).

    PubMed

    Bertier, Lien D; Ron, Mily; Huo, Heqiang; Bradford, Kent J; Britt, Anne B; Michelmore, Richard W

    2018-05-04

    CRISPR/Cas9 is a transformative tool for making targeted genetic alterations. In plants, high mutation efficiencies have been reported in primary transformants. However, many of the mutations analyzed were somatic and therefore not heritable. To provide more insights into the efficiency of creating stable homozygous mutants using CRISPR/Cas9, we targeted LsNCED4 (9-cis-EPOXYCAROTENOID DIOXYGENASE4), a gene conditioning thermoinhibition of seed germination in lettuce. Three constructs, each capable of expressing Cas9 and a single gRNA targeting different sites in LsNCED4, were stably transformed into lettuce (Lactuca sativa) cvs. Salinas and Cobham Green. Analysis of 47 primary transformants (T1) and 368 T2 plants by deep amplicon sequencing revealed that 57% of T1 plants contained events at the target site: 28% of plants had germline mutations in one allele indicative of an early editing event (mono-allelic), 8% of plants had germline mutations in both alleles indicative of two early editing events (bi-allelic), and the remaining 21% of plants had multiple low frequency mutations indicative of late events (chimeric plants). Editing efficiency was similar in both genotypes, while the different gRNAs varied in efficiency. Amplicon sequencing of 20 T1 and more than 100 T2 plants for each of the three gRNAs showed that repair outcomes were not random, but reproducible and characteristic for each gRNA. Knockouts of NCED4 resulted in large increases in the maximum temperature for seed germination, with seeds of both cultivars capable of germinating >70% at 37°. Knockouts of NCED4 provide a whole-plant selectable phenotype that has minimal pleiotropic consequences. Targeting NCED4 in a co-editing strategy could therefore be used to enrich for germline-edited events simply by germinating seeds at high temperature. Copyright © 2018 Bertier et al.

  12. High-Resolution Analysis of the Efficiency, Heritability, and Editing Outcomes of CRISPR/Cas9-Induced Modifications of NCED4 in Lettuce (Lactuca sativa)

    PubMed Central

    Bertier, Lien D.; Ron, Mily; Huo, Heqiang; Bradford, Kent J.; Britt, Anne B.; Michelmore, Richard W.

    2018-01-01

    CRISPR/Cas9 is a transformative tool for making targeted genetic alterations. In plants, high mutation efficiencies have been reported in primary transformants. However, many of the mutations analyzed were somatic and therefore not heritable. To provide more insights into the efficiency of creating stable homozygous mutants using CRISPR/Cas9, we targeted LsNCED4 (9-cis-EPOXYCAROTENOID DIOXYGENASE4), a gene conditioning thermoinhibition of seed germination in lettuce. Three constructs, each capable of expressing Cas9 and a single gRNA targeting different sites in LsNCED4, were stably transformed into lettuce (Lactuca sativa) cvs. Salinas and Cobham Green. Analysis of 47 primary transformants (T1) and 368 T2 plants by deep amplicon sequencing revealed that 57% of T1 plants contained events at the target site: 28% of plants had germline mutations in one allele indicative of an early editing event (mono-allelic), 8% of plants had germline mutations in both alleles indicative of two early editing events (bi-allelic), and the remaining 21% of plants had multiple low frequency mutations indicative of late events (chimeric plants). Editing efficiency was similar in both genotypes, while the different gRNAs varied in efficiency. Amplicon sequencing of 20 T1 and more than 100 T2 plants for each of the three gRNAs showed that repair outcomes were not random, but reproducible and characteristic for each gRNA. Knockouts of NCED4 resulted in large increases in the maximum temperature for seed germination, with seeds of both cultivars capable of germinating >70% at 37°. Knockouts of NCED4 provide a whole-plant selectable phenotype that has minimal pleiotropic consequences. Targeting NCED4 in a co-editing strategy could therefore be used to enrich for germline-edited events simply by germinating seeds at high temperature. PMID:29511025

  13. Population structure of microbial communities associated with two deep, anaerobic, alkaline aquifers.

    PubMed Central

    Fry, N K; Fredrickson, J K; Fishbain, S; Wagner, M; Stahl, D A

    1997-01-01

    Microbial communities of two deep (1,270 and 316 m) alkaline (pH 9.94 and 8.05), anaerobic (Eh, -137 and -27 mV) aquifers were characterized by rRNA-based analyses. Both aquifers, the Grande Ronde (GR) and Priest Rapids (PR) formations, are located within the Columbia River Basalt Group in south-central Washington, and sulfidogenesis and methanogenesis characterize the GR and PR formations, respectively. RNA was extracted from microorganisms collected from groundwater by ultrafiltration through hollow-fiber membranes and hybridized to taxon-specific oligonucleotide probes. Of the three domains, Bacteria dominated both communities, making up to 92.0 and 64.4% of the total rRNA from the GR and PR formations, respectively. Eucarya comprised 5.7 and 14.4%, and Archaea comprised 1.8% and 2.5%, respectively. The gram-positive target group was found in both aquifers, 11.7% in GR and 7.6% in PR. Two probes were used to target sulfate- and/or metal-reducing bacteria within the delta subclass of Proteobacteria. The Desulfobacter group was present (0.3%) only in the high-sulfate groundwater (GR). However, comparable hybridization to a probe selective for the desulfovibrios and some metal-reducing bacteria was found in both aquifers, 2.5 and 2.9% from the GR and PR formations, respectively. Selective PCR amplification and sequencing of the desulfovibrio/metal-reducing group revealed a predominance of desulfovibrios in both systems (17 of 20 clones), suggesting that their environmental distribution is not restricted by sulfate availability. PMID:9097447

  14. Population structure of microbial communities associated with two deep, anaerobic, alkaline aquifers.

    PubMed

    Fry, N K; Fredrickson, J K; Fishbain, S; Wagner, M; Stahl, D A

    1997-04-01

    Microbial communities of two deep (1,270 and 316 m) alkaline (pH 9.94 and 8.05), anaerobic (Eh, -137 and -27 mV) aquifers were characterized by rRNA-based analyses. Both aquifers, the Grande Ronde (GR) and Priest Rapids (PR) formations, are located within the Columbia River Basalt Group in south-central Washington, and sulfidogenesis and methanogenesis characterize the GR and PR formations, respectively. RNA was extracted from microorganisms collected from groundwater by ultrafiltration through hollow-fiber membranes and hybridized to taxon-specific oligonucleotide probes. Of the three domains, Bacteria dominated both communities, making up to 92.0 and 64.4% of the total rRNA from the GR and PR formations, respectively. Eucarya comprised 5.7 and 14.4%, and Archaea comprised 1.8% and 2.5%, respectively. The gram-positive target group was found in both aquifers, 11.7% in GR and 7.6% in PR. Two probes were used to target sulfate- and/or metal-reducing bacteria within the delta subclass of Proteobacteria. The Desulfobacter group was present (0.3%) only in the high-sulfate groundwater (GR). However, comparable hybridization to a probe selective for the desulfovibrios and some metal-reducing bacteria was found in both aquifers, 2.5 and 2.9% from the GR and PR formations, respectively. Selective PCR amplification and sequencing of the desulfovibrio/metal-reducing group revealed a predominance of desulfovibrios in both systems (17 of 20 clones), suggesting that their environmental distribution is not restricted by sulfate availability.

  15. When less is more: 'slicing' sequencing data improves read decoding accuracy and de novo assembly quality.

    PubMed

    Lonardi, Stefano; Mirebrahim, Hamid; Wanamaker, Steve; Alpert, Matthew; Ciardo, Gianfranco; Duma, Denisa; Close, Timothy J

    2015-09-15

    Since the invention of DNA sequencing in the 1970s, computational biologists have had to deal with the problem of de novo genome assembly with limited (or insufficient) depth of sequencing. In this work, we investigate the opposite problem, that is, the challenge of dealing with excessive depth of sequencing. We explore the effect of ultra-deep sequencing data in two domains: (i) the problem of decoding reads to bacterial artificial chromosome (BAC) clones (in the context of the combinatorial pooling design we have recently proposed), and (ii) the problem of de novo assembly of BAC clones. Using real ultra-deep sequencing data, we show that when the depth of sequencing increases over a certain threshold, sequencing errors make these two problems harder and harder (instead of easier, as one would expect with error-free data), and as a consequence the quality of the solution degrades with more and more data. For the first problem, we propose an effective solution based on 'divide and conquer': we 'slice' a large dataset into smaller samples of optimal size, decode each slice independently, and then merge the results. Experimental results on over 15 000 barley BACs and over 4000 cowpea BACs demonstrate a significant improvement in the quality of the decoding and the final assembly. For the second problem, we show for the first time that modern de novo assemblers cannot take advantage of ultra-deep sequencing data. Python scripts to process slices and resolve decoding conflicts are available from http://goo.gl/YXgdHT; software Hashfilter can be downloaded from http://goo.gl/MIyZHs. Contact: stelo@cs.ucr.edu or timothy.close@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
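
    The 'slice, decode independently, merge' idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the observations are toy (key, label) pairs, decode_slice is a hypothetical stand-in for the real read decoder, and the majority vote used to resolve conflicts is an assumed rule, not the conflict resolution implemented in the authors' scripts.

```python
from collections import Counter, defaultdict


def slice_dataset(items, slice_size):
    """Split a list into consecutive slices of at most slice_size items."""
    return [items[i:i + slice_size] for i in range(0, len(items), slice_size)]


def decode_slice(chunk):
    """Hypothetical decoder: assign each key its most frequent label in this slice."""
    counts = defaultdict(Counter)
    for key, label in chunk:
        counts[key][label] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def merge_by_majority(decoded_slices):
    """Merge per-slice assignments; conflicting labels are settled by majority vote."""
    votes = defaultdict(Counter)
    for assignment in decoded_slices:
        for key, label in assignment.items():
            votes[key][label] += 1
    return {key: c.most_common(1)[0][0] for key, c in votes.items()}


# Toy observations: one erroneous label ("BAC9") for kmerA.
observations = [
    ("kmerA", "BAC1"), ("kmerA", "BAC1"),
    ("kmerB", "BAC2"), ("kmerA", "BAC9"),
    ("kmerB", "BAC2"), ("kmerA", "BAC1"),
]
slices = slice_dataset(observations, slice_size=2)
merged = merge_by_majority([decode_slice(s) for s in slices])
print(merged)
```

    Because each slice is decoded on its own, an error confined to one slice is outvoted at the merge step instead of contaminating the whole decoding.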

  16. Identification, visualization, and sorting of translationally active microbial consortia from deep-sea methane seeps

    NASA Astrophysics Data System (ADS)

    Hatzenpichler, R.; Connon, S. A.; Goudeau, D.; Malmstrom, R.; Woyke, T.; Orphan, V. J.

    2015-12-01

    Within the past few years, great progress has been made in tapping the genomes of individual cells separated from environmental samples. Unfortunately, however, most often these efforts have been target blind, as they did not pre-select for taxa of interest or focus on metabolically active cells that could be considered key species of the system at the time. This problem is particularly pronounced in low-turnover systems such as deep sea sediments. In an effort to tap the genetic potential hidden within functionally active cells, we have recently developed an approach for the in situ fluorescent tracking of protein synthesis in uncultured cells via bioorthogonal non-canonical amino acid-tagging (BONCAT). This technique depends on the incorporation of synthetic amino acids that carry chemically modifiable tags into newly made proteins, which later can be visualized via click chemistry-mediated fluorescence-labeling. BONCAT is thus able to specifically target proteins that have been expressed in reaction to an experimental condition. We are particularly interested in using BONCAT to understand the functional potential of slow-growing syntrophic consortia of anaerobic methanotrophic archaea and sulfate-reducing bacteria which together catalyze the anaerobic oxidation of methane (AOM) in marine methane seeps. In order to specifically target consortia that are active under varying environmental regimes, we are studying different subpopulations of these inter-domain consortia via a combination of BONCAT with rRNA-targeted FISH. We then couple the BONCAT-enabled staining of active consortia with their separation from inactive members of the community via fluorescence-activated cell-sorting (FACS) and metagenomic sequencing of individual consortia. Using this approach, we were able to identify previously unrecognized AOM-partnerships. By comparing the mini-metagenomes obtained from individual consortia with each other, we are starting to gain a more holistic understanding of the genetic similarities and niche-determining characteristics of a range of functional and taxonomic clades of AOM-consortia.

  17. Unique microbial community in drilling fluids from Chinese continental scientific drilling

    USGS Publications Warehouse

    Zhang, Gengxin; Dong, Hailiang; Jiang, Hongchen; Xu, Zhiqin; Eberl, Dennis D.

    2006-01-01

    Circulating drilling fluid is often regarded as a contamination source in investigations of subsurface microbiology. However, it also provides an opportunity to sample geological fluids at depth and to study contained microbial communities. During our study of deep subsurface microbiology of the Chinese Continental Scientific Deep drilling project, we collected 6 drilling fluid samples from a borehole from 2290 to 3350 m below the land surface. Microbial communities in these samples were characterized with cultivation-dependent and -independent techniques. Characterization of 16S rRNA genes indicated that the bacterial clone sequences related to Firmicutes became progressively dominant with increasing depth. Most sequences were related to anaerobic, thermophilic, halophilic or alkaliphilic bacteria. These habitats were consistent with the measured geochemical characteristics of the drilling fluids that have incorporated geological fluids and partly reflected the in-situ conditions. Several clone types were closely related to Thermoanaerobacter ethanolicus, Caldicellulosiruptor lactoaceticus, and Anaerobranca gottschalkii, an anaerobic metal-reducer, an extreme thermophile, and an anaerobic chemoorganotroph, respectively, with an optimal growth temperature of 50–68°C. Seven anaerobic, thermophilic Fe(III)-reducing bacterial isolates were obtained and they were capable of reducing iron oxide and clay minerals to produce siderite, vivianite, and illite. The archaeal diversity was low. Most archaeal sequences were not related to any known cultivated species, but rather to environmental clone sequences recovered from subsurface environments. We infer that the detected microbes were derived from geological fluids at depth and their growth habitats reflected the deep subsurface conditions. These findings have important implications for microbial survival and their ecological functions in the deep subsurface.

  18. Integrative analysis of long non-coding RNA acting as ceRNAs involved in chilling injury in tomato fruit.

    PubMed

    Wang, Yunxiang; Gao, Lipu; Zhu, Benzhong; Zhu, Hongliang; Luo, Yunbo; Wang, Qing; Zuo, Jinhua

    2018-08-15

    Long non-coding RNA (lncRNA) is a kind of non-coding endogenous RNA that plays essential roles in diverse biological processes and various stress responses. To identify and elucidate the intricate regulatory roles of lncRNAs in chilling injury in tomato fruit, deep sequencing and bioinformatics methods were applied. After strict screening, a total of 1411 lncRNAs were identified, 239 of which were significantly differentially expressed. A large number of target genes were identified, and many of them were found to encode chilling stress-related proteins, including redox reaction-related enzymes, important enzymes involved in cell wall degradation, membrane lipid peroxidation-related enzymes, heat and cold shock proteins, energy metabolism-related enzymes, and salicylic acid and abscisic acid metabolism-related genes. Interestingly, 41 lncRNAs were found to be the precursors of 33 miRNAs, and 186 lncRNAs were targets of 45 miRNAs. These lncRNAs targeted by miRNAs might be potential ceRNAs. In particular, a sophisticated regulatory model including miRNAs, lncRNAs and their targets was set up. This model revealed that some miRNAs and lncRNAs may be involved in chilling injury, which provides a new perspective on the role of lncRNAs. Copyright © 2018 Elsevier B.V. All rights reserved.

  19. Full genome virus detection in fecal samples using sensitive nucleic acid preparation, deep sequencing, and a novel iterative sequence classification algorithm.

    PubMed

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm, SLIM, for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated, and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults but may be involved in the pathogenesis of HIV-1 infection, and it will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis.

  20. Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

    PubMed Central

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm, SLIM, for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated, and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults but may be involved in the pathogenesis of HIV-1 infection, and it will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106
