Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier
2008-01-01
Background "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results In order to perform a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152
Li, Qike; Schissler, A Grant; Gardeux, Vincent; Achour, Ikbel; Kenost, Colleen; Berghout, Joanne; Li, Haiquan; Zhang, Hao Helen; Lussier, Yves A
2017-05-24
Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, as precision medicine pursues more accurate and individualized treatment decisions, these methods are not designed to address single-patient transcriptome analyses. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways with up- and down-regulated genes (bidirectional dysregulation) that are ubiquitous in biological systems. We developed N-of-1-pathways MixEnrich, a mixture model followed by a gene set enrichment test, to uncover bidirectional and concordantly dysregulated pathways one patient at a time. We assess its accuracy in a comprehensive simulation study and in an RNA-Seq data analysis of head and neck squamous cell carcinomas (HNSCCs). In the presence of bidirectionally dysregulated genes in the pathway or in the presence of high background noise, MixEnrich substantially outperforms previous single-subject transcriptome analysis methods, both in the simulation study and the HNSCCs data analysis (ROC Curves; higher true positive rates; lower false positive rates). Bidirectional and concordant dysregulated pathways uncovered by MixEnrich in each patient largely overlapped with the quasi-gold standard compared to other single-subject and cohort-based transcriptome analyses. The greater performance of MixEnrich presents an advantage over previous methods to meet the promise of providing accurate personal transcriptome analysis to support precision medicine at point of care.
2010-01-01
Background Recent developments in high-throughput methods of analyzing transcriptomic profiles are promising for many areas of biology, including ecophysiology. However, although commercial microarrays are available for most common laboratory models, transcriptome analysis in non-traditional model species still remains a challenge. Indeed, the signal resulting from heterologous hybridization is low and difficult to interpret because of the weak complementarity between probe and target sequences, especially when no microarray dedicated to a genetically close species is available. Results We show here that transcriptome analysis in a species genetically distant from laboratory models is made possible by using MAXRS, a new method of analyzing heterologous hybridization on microarrays. This method takes advantage of the design of several commercial microarrays, with different probes targeting the same transcript. To illustrate and test this method, we analyzed the transcriptome of king penguin pectoralis muscle hybridized to Affymetrix chicken microarrays, two organisms separated by an evolutionary distance of approximately 100 million years. The differential gene expression observed between different physiological situations computed by MAXRS was confirmed by real-time PCR on 10 genes out of 11 tested. Conclusions MAXRS appears to be an appropriate method for gene expression analysis under heterologous hybridization conditions. PMID:20509979
Transcriptome assembly and digital gene expression atlas of the rainbow trout
USDA-ARS's Scientific Manuscript database
Background: Transcriptome analysis is a preferred method for gene discovery, marker development and gene expression profiling in non-model organisms. Previously, we sequenced a transcriptome reference using Sanger-based and 454-pyrosequencing, however, a transcriptome assembly is still incomplete an...
Liu, Na; Liu, Lin; Pan, Xinghua
2014-07-01
Cellular heterogeneity within a cell population is a common phenomenon in multicellular organisms, tissues, cultured cells, and even FACS-sorted subpopulations. Important information may be masked if the cells are studied as a mass. Transcriptome profiling is a parameter that has been intensively studied, and relatively easier to address than protein composition. To understand the basis and importance of heterogeneity and stochastic aspects of the cell function and its mechanisms, it is essential to examine transcriptomes of a panel of single cells. High-throughput technologies, starting from microarrays and now RNA-seq, provide a full view of the expression of transcriptomes but are limited by the amount of RNA for analysis. Recently, several new approaches for amplification and sequencing the transcriptome of single cells or a limited low number of cells have been developed and applied. In this review, we summarize these major strategies, such as PCR-based methods, IVT-based methods, phi29-DNA polymerase-based methods, and several other methods, including their principles, characteristics, advantages, and limitations, with representative applications in cancer stem cells, early development, and embryonic stem cells. The prospects for development of future technology and application of transcriptome analysis in a single cell are also discussed.
Use of prior knowledge for the analysis of high-throughput transcriptomics and metabolomics data
2014-01-01
Background High-throughput omics technologies have enabled the measurement of many genes or metabolites simultaneously. The resulting high dimensional experimental data poses significant challenges to transcriptomics and metabolomics data analysis methods, which may lead to spurious instead of biologically relevant results. One strategy to improve the results is the incorporation of prior biological knowledge in the analysis. This strategy is used to reduce the solution space and/or to focus the analysis on biological meaningful regions. In this article, we review a selection of these methods used in transcriptomics and metabolomics. We combine the reviewed methods in three groups based on the underlying mathematical model: exploratory methods, supervised methods and estimation of the covariance matrix. We discuss which prior knowledge has been used, how it is incorporated and how it modifies the mathematical properties of the underlying methods. PMID:25033193
Brown, Roger B; Madrid, Nathaniel J; Suzuki, Hideaki; Ness, Scott A
2017-01-01
RNA-sequencing (RNA-seq) has become the standard method for unbiased analysis of gene expression but also provides access to more complex transcriptome features, including alternative RNA splicing, RNA editing, and even detection of fusion transcripts formed through chromosomal translocations. However, differences in library methods can adversely affect the ability to recover these different types of transcriptome data. For example, some methods have bias for one end of transcripts or rely on low-efficiency steps that limit the complexity of the resulting library, making detection of rare transcripts less likely. We tested several commonly used methods of RNA-seq library preparation and found vast differences in the detection of advanced transcriptome features, such as alternatively spliced isoforms and RNA editing sites. By comparing several different protocols available for the Ion Proton sequencer and by utilizing detailed bioinformatics analysis tools, we were able to develop an optimized random primer based RNA-seq technique that is reliable at uncovering rare transcript isoforms and RNA editing features, as well as fusion reads from oncogenic chromosome rearrangements. The combination of optimized libraries and rapid Ion Proton sequencing provides a powerful platform for the transcriptome analysis of research and clinical samples.
Lovatt, Ditte; Ruble, Brittani K.; Lee, Jaehee; Dueck, Hannah; Kim, Tae Kyung; Fisher, Stephen; Francis, Chantal; Spaethling, Jennifer M.; Wolf, John A.; Grady, M. Sean; Ulyanova, Alexandra V.; Yeldell, Sean B.; Griepenburg, Julianne C.; Buckley, Peter T.; Kim, Junhyong; Sul, Jai-Yoon; Dmochowski, Ivan J.; Eberwine, James
2014-01-01
Transcriptome profiling is an indispensable tool in advancing the understanding of single cell biology, but depends upon methods capable of isolating mRNA at the spatial resolution of a single cell. Current capture methods lack sufficient spatial resolution to isolate mRNA from individual in vivo resident cells without damaging adjacent tissue. Because of this limitation, it has been difficult to assess the influence of the microenvironment on the transcriptome of individual neurons. Here, we engineered a Transcriptome In Vivo Analysis (TIVA)-tag, which upon photoactivation enables mRNA capture from single cells in live tissue. Using the TIVA-tag in combination with RNA-seq to analyze transcriptome variance among single dispersed cells and in vivo resident mouse and human neurons, we show that the tissue microenvironment shapes the transcriptomic landscape of individual cells. The TIVA methodology provides the first noninvasive approach for capturing mRNA from single cells in their natural microenvironment. PMID:24412976
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome
USDA-ARS's Scientific Manuscript database
Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Transcriptome Analysis at the Single-Cell Level Using SMART Technology.
Fish, Rachel N; Bostick, Magnolia; Lehman, Alisa; Farmer, Andrew
2016-10-10
RNA sequencing (RNA-seq) is a powerful method for analyzing cell state, with minimal bias, and has broad applications within the biological sciences. However, transcriptome analysis of seemingly homogenous cell populations may in fact overlook significant heterogeneity that can be uncovered at the single-cell level. The ultra-low amount of RNA contained in a single cell requires extraordinarily sensitive and reproducible transcriptome analysis methods. As next-generation sequencing (NGS) technologies mature, transcriptome profiling by RNA-seq is increasingly being used to decipher the molecular signature of individual cells. This unit describes an ultra-sensitive and reproducible protocol to generate cDNA and sequencing libraries directly from single cells or RNA inputs ranging from 10 pg to 10 ng. Important considerations for working with minute RNA inputs are given. © 2016 by John Wiley & Sons, Inc.
BLIND ordering of large-scale transcriptomic developmental timecourses.
Anavy, Leon; Levin, Michal; Khair, Sally; Nakanishi, Nagayasu; Fernandez-Valverde, Selene L; Degnan, Bernard M; Yanai, Itai
2014-03-01
RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
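The ordering idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, PCA is done with a plain SVD, and an exhaustive search over permutations stands in for a real traveling salesman solver, so it is only practical for a handful of samples. Shannon entropy of each sample's expression profile is assumed here as the entropy measure used to orient the path.

```python
from itertools import permutations

import numpy as np


def pca_scores(expr, n_components=2):
    """Project samples (rows of expr) onto the leading principal components."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def shortest_path_order(points):
    """Exhaustively find the shortest open path through all points.

    Stand-in for a traveling salesman solver; feasible for small n only.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    best_order, best_len = None, np.inf
    for perm in permutations(range(n)):
        if perm[0] > perm[-1]:  # a path and its reverse have the same length
            continue
        length = sum(dist[perm[i], perm[i + 1]] for i in range(n - 1))
        if length < best_len:
            best_order, best_len = perm, length
    return list(best_order)


def shannon_entropy(profile):
    """Shannon entropy (bits) of one sample's expression profile."""
    p = profile / profile.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def blind_like_order(expr):
    """Order samples along the PCA path, oriented so entropy increases."""
    order = shortest_path_order(pca_scores(expr))
    if shannon_entropy(expr[order[0]]) > shannon_entropy(expr[order[-1]]):
        order.reverse()
    return order
```

Calling `blind_like_order(expr)` on a samples-by-genes matrix returns row indices in the inferred temporal order. For datasets of realistic size, the exhaustive search would need to be replaced by a heuristic or exact TSP solver.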
Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments
Maza, Elie; Frasse, Pierre; Senin, Pavel; Bouzayen, Mondher; Zouine, Mohamed
2013-01-01
In recent years, RNA-Seq technologies became a powerful tool for transcriptome studies. However, computational methods dedicated to the analysis of high-throughput sequencing data are yet to be standardized. In particular, it is known that the choice of a normalization procedure leads to a great variability in results of differential gene expression analysis. The present study compares the most widespread normalization procedures and proposes a novel one aiming at removing an inherent bias of studied transcriptomes related to their relative size. Comparisons of the normalization procedures are performed on real and simulated data sets. Real RNA-Seq data sets analyses, performed with all the different normalization methods, show that only 50% of significantly differentially expressed genes are common. This result highlights the influence of the normalization step on the differential expression analysis. Real and simulated data sets analyses give similar results showing 3 different groups of procedures having the same behavior. The group including the novel method named “Median Ratio Normalization” (MRN) gives the lower number of false discoveries. Within this group the MRN method is less sensitive to the modification of parameters related to the relative size of transcriptomes such as the number of down- and upregulated genes and the gene expression levels. The newly proposed MRN method efficiently deals with intrinsic bias resulting from relative size of studied transcriptomes. Validation with real and simulated data sets confirmed that MRN is more consistent and robust than existing methods. PMID:26442135
Houshyani, Benyamin; van der Krol, Alexander R; Bino, Raoul J; Bouwmeester, Harro J
2014-06-19
Molecular characterization is an essential step of risk/safety assessment of genetically modified (GM) crops. Holistic approaches for molecular characterization using omics platforms can be used to confirm the intended impact of the genetic engineering, but can also reveal the unintended changes at the omics level as a first assessment of potential risks. The potential of omics platforms for risk assessment of GM crops has rarely been used for this purpose because of the lack of a consensus reference and statistical methods to judge the significance or importance of the pleiotropic changes in GM plants. Here we propose a meta data analysis approach to the analysis of GM plants, by measuring the transcriptome distance to untransformed wild-types. In the statistical analysis of the transcriptome distance between GM and wild-type plants, values are compared with naturally occurring transcriptome distances in non-GM counterparts obtained from a database. Using this approach we show that the pleiotropic effect of genes involved in indirect insect defence traits is substantially equivalent to the variation in gene expression occurring naturally in Arabidopsis. Transcriptome distance is a useful screening method to obtain insight in the pleiotropic effects of genetic modification.
Comparison of ribosomal RNA removal methods for transcriptome sequencing workflows in teleost fish
USDA-ARS's Scientific Manuscript database
RNA sequencing (RNA-Seq) is becoming the standard for transcriptome analysis. Removal of contaminating ribosomal RNA (rRNA) is a priority in the preparation of libraries suitable for sequencing. rRNAs are commonly removed from total RNA via either mRNA selection or rRNA depletion. These methods have...
Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich
2015-12-16
Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variations. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
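Two of the performance measures named in this abstract have simple closed forms, sketched below in Python. The function names `mcc` and `rmsd` are hypothetical, and the abstract does not say how the authors defined true and false calls or which values were compared, so the inputs here are purely illustrative.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention assumed here: return 0 when a margin of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def rmsd(xs, ys):
    """Root-mean-square deviation between two equal-length value lists."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))
```

In a platform comparison, `xs` and `ys` could be, for example, log2 fold changes for the same genes measured by two methods; a lower RMSD and an MCC closer to 1 would both indicate closer agreement.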
Baldrian, Petr; López-Mondéjar, Rubén
2014-02-01
Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.
Nam, Seungyoon
2017-04-01
Cancer transcriptome analysis is one of the leading areas of Big Data science, biomarker, and pharmaceutical discovery, not to forget personalized medicine. Yet, cancer transcriptomics and postgenomic medicine require innovation in bioinformatics as well as comparison of the performance of available algorithms. In this data analytics context, the value of network generation and algorithms has been widely underscored for addressing the salient questions in cancer pathogenesis. Analysis of the cancer transcriptome often results in complicated networks where identification of network modularity remains critical, for example, in delineating the "druggable" molecular targets. Network clustering is useful, but depends on the network topology in and of itself. Notably, the performance of different network-generating tools for network cluster (NC) identification has been little investigated to date. Hence, using gastric cancer (GC) transcriptomic datasets, we compared two algorithms for generating pathway versus gene regulatory network-based NCs, showing that the pathway-based approach better agrees with a reference set of cancer-functional contexts. Finally, by applying pathway-based NC identification to GC transcriptome datasets, we describe cancer NCs that associate with candidate therapeutic targets and biomarkers in GC. These observations collectively inform future research on cancer transcriptomics, drug discovery, and rational development of new analysis tools for optimal harnessing of omics data.
Transcriptome analysis of Pseudomonas syringae identifies new genes, ncRNAs, and antisense activity
USDA-ARS's Scientific Manuscript database
To fully understand how bacteria respond to their environment, it is essential to assess genome-wide transcriptional activity. New high throughput sequencing technologies make it possible to query the transcriptome of an organism in an efficient unbiased manner. We applied a strand-specific method t...
Performance of Arma chinensis reared on an artificial diet formulated using transcriptomic methods
USDA-ARS's Scientific Manuscript database
An artificial diet formulated for continuous rearing of the predator Arma chinensis was inferior to natural prey when evaluated using life history parameters. A transcriptome analysis identified differentially expressed genes in diet-fed and prey-fed A. chinensis that were suggestive of molecular me...
Microarray-Based Gene Expression Analysis for Veterinary Pathologists: A Review.
Raddatz, Barbara B; Spitzbarth, Ingo; Matheis, Katja A; Kalkuhl, Arno; Deschl, Ulrich; Baumgärtner, Wolfgang; Ulrich, Reiner
2017-09-01
High-throughput, genome-wide transcriptome analysis is now commonly used in all fields of life science research and is on the cusp of medical and veterinary diagnostic application. Transcriptomic methods such as microarrays and next-generation sequencing generate enormous amounts of data. The pathogenetic expertise acquired from understanding of general pathology provides veterinary pathologists with a profound background, which is essential in translating transcriptomic data into meaningful biological knowledge, thereby leading to a better understanding of underlying disease mechanisms. The scientific literature concerning high-throughput data-mining techniques usually addresses mathematicians or computer scientists as the target audience. In contrast, the present review provides the reader with a clear and systematic basis from a veterinary pathologist's perspective. Therefore, the aims are (1) to introduce the reader to the necessary methodological background; (2) to introduce the sequential steps commonly performed in a microarray analysis including quality control, annotation, normalization, selection of differentially expressed genes, clustering, gene ontology and pathway analysis, analysis of manually selected genes, and biomarker discovery; and (3) to provide references to publicly available and user-friendly software suites. In summary, the data analysis methods presented within this review will enable veterinary pathologists to analyze high-throughput transcriptome data obtained from their own experiments, supplemental data that accompany scientific publications, or public repositories in order to obtain a more in-depth insight into underlying disease mechanisms.
Multiplexed transcriptome analysis to detect ALK, ROS1 and RET rearrangements in lung cancer
Rogers, Toni-Maree; Arnau, Gisela Mir; Ryland, Georgina L.; Huang, Stephen; Lira, Maruja E.; Emmanuel, Yvette; Perez, Omar D.; Irwin, Darryl; Fellowes, Andrew P.; Wong, Stephen Q.; Fox, Stephen B.
2017-01-01
ALK, ROS1 and RET gene fusions are important predictive biomarkers for tyrosine kinase inhibitors in lung cancer. Currently, the gold standard method for gene fusion detection is Fluorescence In Situ Hybridization (FISH), and while highly sensitive and specific, it is also labour intensive, subjective in analysis, and unable to screen large numbers of gene fusions. Recent developments in high-throughput transcriptome-based methods may provide a suitable alternative to FISH as they are compatible with multiplexing and diagnostic workflows. However, the concordance between these different methods compared with FISH has not been evaluated. In this study we compared the results from three transcriptome-based platforms (Nanostring Elements, Agena LungFusion panel and ThermoFisher NGS fusion panel) to those obtained from ALK, ROS1 and RET FISH on 51 clinical specimens. Overall agreement of results ranged from 86–96% depending on the platform used. While all platforms were highly sensitive, both the Agena panel and ThermoFisher NGS fusion panel reported minor fusions that were not detectable by FISH. Our proof-of-principle study illustrates that transcriptome-based analyses are sensitive and robust methods for detecting actionable gene fusions in lung cancer and could provide a robust alternative to FISH testing in the diagnostic setting. PMID:28181564
Necklace: combining reference and assembled transcriptomes for more comprehensive RNA-Seq analysis.
Davidson, Nadia M; Oshlack, Alicia
2018-05-01
RNA sequencing (RNA-seq) analyses can benefit from performing a genome-guided and de novo assembly, in particular for species where the reference genome or the annotation is incomplete. However, tools for integrating an assembled transcriptome with reference annotation are lacking. Necklace is a software pipeline that runs genome-guided and de novo assembly and combines the resulting transcriptomes with reference genome annotations. Necklace constructs a compact but comprehensive superTranscriptome out of the assembled and reference data. Reads are subsequently aligned and counted in preparation for differential expression testing. Necklace allows a comprehensive transcriptome to be built from a combination of assembled and annotated transcripts, which results in a more comprehensive transcriptome for the majority of organisms. In addition RNA-seq data are mapped back to this newly created superTranscript reference to enable differential expression testing with standard methods.
PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms.
Gan, Ruei-Chi; Chen, Ting-Wen; Wu, Timothy H; Huang, Po-Jung; Lee, Chi-Ching; Yeh, Yuan-Ming; Chiu, Cheng-Hsun; Huang, Hsien-Da; Tang, Petrus
2016-12-22
Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared. Here, we propose a new analysis strategy and quantification methods for expression levels, which not only generate a virtual reference from sequencing data but also provide comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For a better understanding of the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.
In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
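The abstract does not define how PARRoT's "estimated" variants (eRPKM, eTPM) are derived, so the Python sketch below shows only the standard RPKM and TPM definitions on which such values are presumably based. The function names and the example numbers are illustrative assumptions, not part of PARRoT.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts)
    return [c * 1e9 / (length * total_reads)
            for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]


# Hypothetical read counts and lengths for three virtual transcripts.
example_counts = [100, 200, 700]
example_lengths = [1000, 2000, 1000]
example_rpkm = rpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

Because TPM values always sum to one million within a sample, they are often easier to compare across datasets than RPKM, which may be one reason both measures are reported.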
Brooks, Matthew J.; Rajasimha, Harsha K.; Roger, Jerome E.
2011-01-01
Purpose Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT–PCR) methods and to evaluate protocols for optimal high-throughput data analysis. Methods Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl−/−) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT–PCR validation was performed using TaqMan and SYBR Green assays. Results Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl−/− mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT–PCR. RNA-seq data had a linear relationship with qRT–PCR for more than four orders of magnitude and a goodness of fit (R2) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl−/− retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT–PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling. 
Conclusions Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and more accurate quantitative and qualitative evaluation of mRNA content within a cell or tissue. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions. PMID:22162623
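The differential-expression criterion quoted in the abstract above (fold change ≥1.5 and p value <0.05) can be illustrated with a minimal sketch. The function name, its arguments, and the choice to treat the fold change symmetrically (up or down) are assumptions made for illustration; the abstract does not say how the authors computed either quantity.

```python
def is_differential(mean_wt, mean_ko, p_value, min_fold=1.5, alpha=0.05):
    """Return True when a transcript meets both thresholds.

    mean_wt and mean_ko are hypothetical mean expression values for the
    two genotypes; p_value comes from whatever test was run upstream.
    """
    if mean_wt <= 0 or mean_ko <= 0:
        # Fold change is undefined for non-positive values; this sketch
        # simply reports "not differential" in that case (an assumption).
        return False
    # Symmetric fold change: the larger value divided by the smaller one.
    fold = max(mean_wt / mean_ko, mean_ko / mean_wt)
    return fold >= min_fold and p_value < alpha
```

A transcript with hypothetical means 10 and 20 and p = 0.01 would pass, while the same means with p = 0.2 would not.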
Elucidating and mining the Tulipa and Lilium transcriptomes.
Moreno-Pachon, Natalia M; Leeggangers, Hendrika A C F; Nijveen, Harm; Severing, Edouard; Hilhorst, Henk; Immink, Richard G H
2016-10-01
Genome sequencing remains a challenge for species with large and complex genomes containing extensive repetitive sequences, of which the bulbous and monocotyledonous plants tulip and lily are examples. In such a case, sequencing of only the active part of the genome, represented by the transcriptome, is a good alternative to obtain information about gene content. In this study we aimed to generate a high-quality transcriptome of tulip and lily and to make these data available as an open-access resource via a user-friendly web-based interface. The Illumina HiSeq 2000 platform was applied and the transcribed RNA was sequenced from a collection of different lily and tulip tissues. In order to obtain good transcriptome coverage and to facilitate effective data mining, assembly was done using different filtering parameters for clearing out contamination and noise of the RNAseq datasets. This analysis revealed limitations of commonly applied methods and parameter settings used in de novo transcriptome assembly. The final transcriptomes are publicly available via a user-friendly Transcriptome browser ( http://www.bioinformatics.nl/bulbs/db/species/index ). The usefulness of this resource has been exemplified by a search for all potential transcription factors in lily and tulip, with special focus on the TCP transcription factor family. This analysis and other quality parameters point out the quality of the transcriptomes, which can serve as a basis for further genomics studies in lily, tulip, and bulbous plants in general.
2010-01-01
Background Identification of genes with invariant levels of gene expression is a prerequisite for validating transcriptomic changes accompanying development. Ideally expression of these genes should be independent of the morphogenetic process or environmental condition tested as well as the methods used for RNA purification and analysis. Results In an effort to identify endogenous genes meeting these criteria nine reference genes (RG) were tested in two Petunia lines (Mitchell and V30). Growth conditions differed in Mitchell and V30, and different methods were used for RNA isolation and analysis. Four different software tools were employed to analyze the data. We merged the four outputs by means of a non-weighted unsupervised rank aggregation method. The genes identified as optimal for transcriptomic analysis of Mitchell and V30 were EF1α in Mitchell and CYP in V30, whereas the least suitable gene was GAPDH in both lines. Conclusions The least adequate gene turned out to be GAPDH indicating that it should be rejected as reference gene in Petunia. The absence of correspondence of the best-suited genes suggests that assessing reference gene stability is needed when performing normalization of data from transcriptomic analysis of flower and leaf development. PMID:20056000
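The abstract above merges four tool outputs with a non-weighted rank aggregation but does not name the algorithm. The sketch below shows one simple equal-weight scheme, ordering items by their mean position across the input rankings. This is an assumption chosen for illustration and may differ from the method the authors used; the function name and the placeholder labels are hypothetical.

```python
from collections import defaultdict


def aggregate_ranks(rankings):
    """Merge several best-to-worst rankings, giving each equal weight.

    Assumes every ranking lists the same items. Items are ordered by mean
    position; ties are broken alphabetically so the output is deterministic.
    """
    totals = defaultdict(float)
    for ranking in rankings:
        for position, item in enumerate(ranking, start=1):
            totals[item] += position
    n = len(rankings)
    return sorted(totals, key=lambda item: (totals[item] / n, item))


# Placeholder labels only; these are not results from the study.
example = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "B", "C"],
    ["A", "C", "B"],
]
consensus = aggregate_ranks(example)
```

Here the position sums are 5 for A, 8 for B and 11 for C, so the consensus order is A, B, C.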
RNA-Skim: a rapid method for RNA-Seq quantification at transcript level
Zhang, Zhaojun; Wang, Wei
2014-01-01
Motivation: RNA-Seq technique has been demonstrated as a revolutionary means for exploring transcriptome because it provides deep coverage and base pair-level resolution. RNA-Seq quantification is proven to be an efficient alternative to Microarray technique in gene expression study, and it is a critical component in RNA-Seq differential expression analysis. Most existing RNA-Seq quantification tools require the alignments of fragments to either a genome or a transcriptome, entailing a time-consuming and intricate alignment step. To improve the performance of RNA-Seq quantification, an alignment-free method, Sailfish, has been recently proposed to quantify transcript abundances using all k-mers in the transcriptome, demonstrating the feasibility of designing an efficient alignment-free method for transcriptome quantification. Even though Sailfish is substantially faster than alternative alignment-dependent methods such as Cufflinks, using all k-mers in the transcriptome quantification impedes the scalability of the method. Results: We propose a novel RNA-Seq quantification method, RNA-Skim, which partitions the transcriptome into disjoint transcript clusters based on sequence similarity, and introduces the notion of sig-mers, which are a special type of k-mers uniquely associated with each cluster. We demonstrate that the sig-mer counts within a cluster are sufficient for estimating transcript abundances with accuracy comparable with any state-of-the-art method. This enables RNA-Skim to perform transcript quantification on each cluster independently, reducing a complex optimization problem into smaller optimization tasks that can be run in parallel. As a result, RNA-Skim uses <4% of the k-mers and <10% of the CPU time required by Sailfish. 
It is able to finish transcriptome quantification in <10 min per sample by using just a single thread on a commodity computer, which represents a >100-fold speedup over the state-of-the-art alignment-based methods, while delivering comparable or higher accuracy. Availability and implementation: The software is available at http://www.csbio.unc.edu/rs. Contact: weiwang@cs.ucla.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24931995
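The notion of sig-mers described above, k-mers uniquely associated with one transcript cluster, can be illustrated with a small sketch. It only finds k-mers that occur in exactly one cluster. It does not reproduce how RNA-Skim selects or counts sig-mers, and it ignores reverse complements. All names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_unique_kmers(clusters, k):
    """Map each cluster id to the k-mers found in that cluster and no other.

    clusters: dict of cluster id -> list of transcript sequences.
    """
    owners = defaultdict(set)  # k-mer -> ids of clusters containing it
    for cluster_id, sequences in clusters.items():
        for seq in sequences:
            for kmer in kmers(seq, k):
                owners[kmer].add(cluster_id)
    unique = {cluster_id: set() for cluster_id in clusters}
    for kmer, ids in owners.items():
        if len(ids) == 1:
            unique[next(iter(ids))].add(kmer)
    return unique


# Toy sequences, not real transcripts.
toy = {"c1": ["ACGTAC"], "c2": ["GTACGG"]}
result = cluster_unique_kmers(toy, 3)
```

With k = 3 the two toy clusters share ACG, GTA and TAC, leaving CGT unique to c1 and CGG unique to c2.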
SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells.
Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang
2018-01-01
Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. © 2018 Han et al.; Published by Cold Spring Harbor Laboratory Press.
PMID:29208629
TCW: Transcriptome Computational Workbench
Soderlund, Carol; Nelson, William; Willer, Mark; Gang, David R.
2013-01-01
Background The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. Methodology The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis for both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more single TCW databases, executes the ESTscan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the orthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. Conclusion It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome. 
TCW is freely available from www.agcol.arizona.edu/software/tcw. PMID:23874959
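One of the multiTCW clustering options named above is computing the transitive closure. A minimal sketch of that idea follows: sequences linked by pairwise hits, directly or through intermediates, end up in the same cluster. It uses a union-find structure. The function name and input format are assumptions and do not describe TCW's own implementation.

```python
def transitive_closure_clusters(items, pairs):
    """Group items so that any two connected by a chain of pairs share a cluster.

    items: iterable of sequence identifiers.
    pairs: iterable of (a, b) links, for example pairwise similarity hits.
    Returns a sorted list of sorted clusters.
    """
    items = list(items)
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in items:
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers: t1-t2 and t2-t3 are linked, t4 is unlinked.
clusters = transitive_closure_clusters(
    ["t1", "t2", "t3", "t4"], [("t1", "t2"), ("t2", "t3")]
)
```

In this example t1 and t3 share no direct link but fall into one cluster through t2, while t4 remains a singleton.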
RNA-Seq Technology and Its Application in Fish Transcriptomics
Ba, Yi; Zhuang, Qianfeng
2014-01-01
Abstract High-throughput sequencing technologies, also known as next-generation sequencing (NGS) technologies, have revolutionized the way that genomic research is advancing. In addition to the static genome, these state-of-the-art technologies have been recently exploited to analyze the dynamic transcriptome, and the resulting technology is termed RNA sequencing (RNA-seq). RNA-seq is free from many limitations of other transcriptomic approaches, such as microarray and tag-based sequencing methods. Although RNA-seq has only been available for a short time, studies using this method have completely changed our perspective of the breadth and depth of eukaryotic transcriptomes. In terms of the transcriptomics of teleost fishes, both model and non-model species have benefited from the RNA-seq approach and have undergone tremendous advances in the past several years. RNA-seq has helped not only in mapping and annotating fish transcriptomes but also in our understanding of many biological processes in fish, such as development, adaptive evolution, host immune response, and stress response. In this review, we first provide an overview of each step of RNA-seq from library construction to the bioinformatic analysis of the data. We then summarize and discuss the recent biological insights obtained from the RNA-seq studies in a variety of fish species. PMID:24380445
Salivary biomarker development using genomic, proteomic and metabolomic approaches
2012-01-01
The use of saliva as a diagnostic sample provides a non-invasive, cost-efficient method of sample collection for disease screening without the need for highly trained professionals. Saliva collection is far more practical and safe compared with invasive methods of sample collection, because of the infection risk from contaminated needles during, for example, blood sampling. Furthermore, the use of saliva could increase the availability of accurate diagnostics for remote and impoverished regions. However, the development of salivary diagnostics has required technical innovation to allow stabilization and detection of analytes in the complex molecular mixture that is saliva. The recent development of cost-effective room temperature analyte stabilization methods, nucleic acid pre-amplification techniques and direct saliva transcriptomic analysis have allowed accurate detection and quantification of transcripts found in saliva. Novel protein stabilization methods have also facilitated improved proteomic analyses. Although candidate biomarkers have been discovered using epigenetic, transcriptomic, proteomic and metabolomic approaches, transcriptomic analyses have so far achieved the most progress in terms of sensitivity and specificity, and progress towards clinical implementation. Here, we review recent developments in salivary diagnostics that have been accomplished using genomic, transcriptomic, proteomic and metabolomic approaches. PMID:23114182
Fricano, Meagan M; Ditewig, Amy C; Jung, Paul M; Liguori, Michael J; Blomme, Eric A G; Yang, Yi
2011-01-01
Blood is an ideal tissue for the identification of novel genomic biomarkers for toxicity or efficacy. However, using blood for transcriptomic profiling presents significant technical challenges due to the transcriptomic changes induced by ex vivo handling and the interference of highly abundant globin mRNA. Most whole blood RNA stabilization and isolation methods also require significant volumes of blood, limiting their effective use in small animal species, such as rodents. To overcome these challenges, a QIAzol-based RNA stabilization and isolation method (QSI) was developed to isolate sufficient amounts of high quality total RNA from 25 to 500 μL of rat whole blood. The method was compared to the standard PAXgene Blood RNA System using blood collected from rats exposed to saline or lipopolysaccharide (LPS). The QSI method yielded an average of 54 ng total RNA per μL of rat whole blood with an average RNA Integrity Number (RIN) of 9, a performance comparable with the standard PAXgene method. Total RNA samples were further processed using the NuGEN Ovation Whole Blood Solution system and cDNA was hybridized to Affymetrix Rat Genome 230 2.0 Arrays. The microarray QC parameters using RNA isolated with the QSI method were within the acceptable range for microarray analysis. The transcriptomic profiles were highly correlated with those using RNA isolated with the PAXgene method and were consistent with expected LPS-induced inflammatory responses. The present study demonstrated that the QSI method coupled with NuGEN Ovation Whole Blood Solution system is cost-effective and particularly suitable for transcriptomic profiling of minimal volumes of whole blood, typical of those obtained with small animal species.
Using single nuclei for RNA-seq to capture the transcriptome of postmortem neurons
Krishnaswami, Suguna Rani; Grindberg, Rashel V; Novotny, Mark; Venepally, Pratap; Lacar, Benjamin; Bhutani, Kunal; Linker, Sara B; Pham, Son; Erwin, Jennifer A; Miller, Jeremy A; Hodge, Rebecca; McCarthy, James K; Kelder, Martin; McCorrison, Jamison; Aevermann, Brian D; Fuertes, Francisco Diez; Scheuermann, Richard H; Lee, Jun; Lein, Ed S; Schork, Nicholas; McConnell, Michael J; Gage, Fred H; Lasken, Roger S
2016-01-01
A protocol is described for sequencing the transcriptome of a cell nucleus. Nuclei are isolated from specimens and sorted by FACS, cDNA libraries are constructed and RNA-seq is performed, followed by data analysis. Some steps follow published methods (Smart-seq2 for cDNA synthesis and Nextera XT barcoded library preparation) and are not described in detail here. Previous single-cell approaches for RNA-seq from tissues include cell dissociation using protease treatment at 30 °C, which is known to alter the transcriptome. We isolate nuclei at 4 °C from tissue homogenates, which cause minimal damage. Nuclear transcriptomes can be obtained from postmortem human brain tissue stored at −80 °C, making brain archives accessible for RNA-seq from individual neurons. The method also allows investigation of biological features unique to nuclei, such as enrichment of certain transcripts and precursors of some noncoding RNAs. By following this procedure, it takes about 4 d to construct cDNA libraries that are ready for sequencing. PMID:26890679
Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex
2016-01-01
Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968
Kim, Min Kyung; Lane, Anatoliy; Kelley, James J; Lun, Desmond S
2016-01-01
Several methods have been developed to predict system-wide and condition-specific intracellular metabolic fluxes by integrating transcriptomic data with genome-scale metabolic models. While powerful in many settings, existing methods have several shortcomings, and it is unclear which method has the best accuracy in general because of limited validation against experimentally measured intracellular fluxes. We present a general optimization strategy for inferring intracellular metabolic flux distributions from transcriptomic data coupled with genome-scale metabolic reconstructions. It consists of two different template models called DC (determined carbon source model) and AC (all possible carbon sources model) and two different new methods called E-Flux2 (E-Flux method combined with minimization of l2 norm) and SPOT (Simplified Pearson cOrrelation with Transcriptomic data), which can be chosen and combined depending on the availability of knowledge on carbon source or objective function. This enables us to simulate a broad range of experimental conditions. We examined E. coli and S. cerevisiae as representative prokaryotic and eukaryotic microorganisms respectively. The predictive accuracy of our algorithm was validated by calculating the uncentered Pearson correlation between predicted fluxes and measured fluxes. To this end, we compiled 20 experimental conditions (11 in E. coli and 9 in S. cerevisiae), of transcriptome measurements coupled with corresponding central carbon metabolism intracellular flux measurements determined by 13C metabolic flux analysis (13C-MFA), which is the largest dataset assembled to date for the purpose of validating inference methods for predicting intracellular fluxes. In both organisms, our method achieves an average correlation coefficient ranging from 0.59 to 0.87, outperforming a representative sample of competing methods. 
Easy-to-use implementations of E-Flux2 and SPOT are available as part of the open-source package MOST (http://most.ccib.rutgers.edu/). Our method represents a significant advance over existing methods for inferring intracellular metabolic flux from transcriptomic data. It not only achieves higher accuracy, but it also combines into a single method a number of other desirable characteristics including applicability to a wide range of experimental conditions, production of a unique solution, fast running time, and the availability of a user-friendly implementation.
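The validation metric named in the abstract above, the uncentered Pearson correlation between predicted and measured fluxes, is the ordinary Pearson formula without mean subtraction, which is equivalent to cosine similarity. The sketch below is a plain-Python illustration. The function name and the error handling are assumptions and are not taken from the MOST package.

```python
import math


def uncentered_pearson(predicted, measured):
    """Return sum(p*m) / sqrt(sum(p^2) * sum(m^2)) for two equal-length vectors."""
    if len(predicted) != len(measured):
        raise ValueError("flux vectors must have the same length")
    dot = sum(p * m for p, m in zip(predicted, measured))
    norm = math.sqrt(
        sum(p * p for p in predicted) * sum(m * m for m in measured)
    )
    if norm == 0:
        raise ValueError("correlation is undefined for an all-zero vector")
    return dot / norm
```

Because the means are not removed, two flux vectors that differ only by a positive scale factor score 1, and orthogonal vectors score 0.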
Lloréns-Rico, Verónica; Serrano, Luis; Lluch-Senar, Maria
2014-07-29
RNA sequencing methods have already altered our view of the extent and complexity of bacterial and eukaryotic transcriptomes, revealing rare transcript isoforms (circular RNAs, RNA chimeras) that could play an important role in their biology. We performed an analysis of chimera formation by four different computational approaches, including a custom designed pipeline, to study the transcriptomes of M. pneumoniae and P. aeruginosa, as well as mixtures of both. We found that rare transcript isoforms detected by conventional pipelines of analysis could be artifacts of the experimental procedure used in the library preparation, and that they are protocol-dependent. By using a customized pipeline we show that optimal library preparation protocol and the pipeline to analyze the results are crucial to identify real chimeric RNAs.
Transcriptomics of cortical gray matter thickness decline during normal aging
Kochunov, P; Charlesworth, J; Winkler, A; Hong, LE; Nichols, T; Curran, JE; Sprooten, E; Jahanshad, N; Thompson, PM; Johnson, MP; Kent, JW; Landman, BA; Mitchell, B; Cole, SA; Dyer, TD; Moses, EK; Goring, HHH; Almasy, L; Duggirala, R; Olvera, RL; Glahn, DC; Blangero, J
2013-01-01
Introduction We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Methods Transcriptome and GMT data were available for 379 individuals (age range=28–85), community-dwelling members of large extended Mexican-American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800µm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analyses were used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Results Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^−6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Conclusion Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation.
PMID:23707588
Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
2014-11-19
Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.
Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin
2013-09-22
High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.
Cho, Byuri Angela; Yoo, Seong-Keun; Song, Young Shin; Kim, Su-jin; Lee, Kyu Eun; Shong, Minho
2018-01-01
Background: Elucidating aging-related transcriptomic changes in human organs is necessary to understand the physiology and mechanisms of aging, but little is known regarding the thyroid gland. We investigated aging-related transcriptomic alterations in the human thyroid gland and characterized the related molecular functions. Methods: Publicly available RNA sequencing data of 322 thyroid tissue samples from the Genotype-Tissue Expression project were analyzed. In addition, RNA sequencing data from 64 of our own normal thyroid tissue samples were used as a validation set. To comprehensively evaluate the associations between aging and transcriptomic changes, we performed a weighted gene coexpression network analysis and pathway enrichment analysis. The thyroid differentiation score was then used for further analysis, defining the correlations between thyroid differentiation and aging. Results: The most significant aging-related transcriptomic change in the thyroid was the downregulation of genes related to mitochondrial and proteasomal functions (p = 3 × 10⁻⁶). Moreover, genes that are associated with immune processes were significantly upregulated with age (p = 3 × 10⁻⁴), and all of them overlapped with the upregulated genes in thyroid glands affected by lymphocytic thyroiditis. Furthermore, these aging-related changes were not significantly different according to sex, but in terms of thyroid differentiation, females were more susceptible to aging-related changes (p for trend = 0.03). Conclusions: Aging-related transcriptomic changes in the thyroid gland were associated with mitochondrial and proteasomal dysfunction, loss of differentiation, and activation of autoimmune processes. Our results provide clues to better understanding the age-related decline in thyroid function and higher susceptibility to autoimmune thyroid disease. PMID:29652618
PIVOT: platform for interactive analysis and visualization of transcriptomics data.
Zhu, Qin; Fisher, Stephen A; Dueck, Hannah; Middleton, Sarah; Khaladkar, Mugdha; Kim, Junhyong
2018-01-05
Many R packages have been developed for transcriptome analysis but their use often requires familiarity with R and integrating results of different packages requires scripts to wrangle the datatypes. Furthermore, exploratory data analyses often generate multiple derived datasets such as data subsets or data transformations, which can be difficult to track. Here we present PIVOT, an R-based platform that wraps open source transcriptome analysis packages with a uniform user interface and graphical data management that allows non-programmers to interactively explore transcriptomics data. PIVOT supports more than 40 popular open source packages for transcriptome analysis and provides an extensive set of tools for statistical data manipulations. A graph-based visual interface is used to represent the links between derived datasets, allowing easy tracking of data versions. PIVOT further supports automatic report generation, publication-quality plots, and program/data state saving, such that all analysis can be saved, shared and reproduced. PIVOT will allow researchers with broad background to easily access sophisticated transcriptome analysis tools and interactively explore transcriptome datasets.
Trinity | Informatics Technology for Cancer Research (ITCR)
Trinity Cancer Transcriptome Analysis Toolkit (CTAT) including de novo transcriptome assembly with downstream support for expression analysis and focused analyses on cancer transcriptomes, incorporating mutation and fusion transcript discovery, and single cell analysis.
Microfluidic single-cell whole-transcriptome sequencing.
Streets, Aaron M; Zhang, Xiannian; Cao, Chen; Pang, Yuhong; Wu, Xinglong; Xiong, Liang; Yang, Lu; Fu, Yusi; Zhao, Liang; Tang, Fuchou; Huang, Yanyi
2014-05-13
Single-cell whole-transcriptome analysis is a powerful tool for quantifying gene expression heterogeneity in populations of cells. Many techniques have, thus, been recently developed to perform transcriptome sequencing (RNA-Seq) on individual cells. To probe subtle biological variation between samples with limiting amounts of RNA, more precise and sensitive methods are still required. We adapted a previously developed strategy for single-cell RNA-Seq that has shown promise for superior sensitivity and implemented the chemistry in a microfluidic platform for single-cell whole-transcriptome analysis. In this approach, single cells are captured and lysed in a microfluidic device, where mRNAs with poly(A) tails are reverse-transcribed into cDNA. Double-stranded cDNA is then collected and sequenced using a next generation sequencing platform. We prepared 94 libraries consisting of single mouse embryonic cells and technical replicates of extracted RNA and thoroughly characterized the performance of this technology. Microfluidic implementation increased mRNA detection sensitivity as well as improved measurement precision compared with tube-based protocols. With 0.2 M reads per cell, we were able to reconstruct a majority of the bulk transcriptome with 10 single cells. We also quantified variation between and within different types of mouse embryonic cells and found that enhanced measurement precision, detection sensitivity, and experimental throughput aided the distinction between biological variability and technical noise. With this work, we validated the advantages of an early approach to single-cell RNA-Seq and showed that the benefits of combining microfluidic technology with high-throughput sequencing will be valuable for large-scale efforts in single-cell transcriptome analysis.
Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng
2017-01-01
Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and the understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumptions, both of which are sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes this limitation and is more robust to heterogeneous data than other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes uncovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and the complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.
Safikhani, Zhaleh; Sadeghi, Mehdi; Pezeshk, Hamid; Eslahchi, Changiz
2013-01-01
Recent advances in the sequencing technologies have provided a handful of RNA-seq datasets for transcriptome analysis. However, reconstruction of full-length isoforms and estimation of the expression level of transcripts with a low cost are challenging tasks. We propose a novel de novo method named SSP that incorporates interval integer linear programming to resolve alternatively spliced isoforms and reconstruct the whole transcriptome from short reads. Experimental results show that SSP is fast and precise in determining different alternatively spliced isoforms along with the estimation of reconstructed transcript abundances. The SSP software package is available at http://www.bioinf.cs.ipm.ir/software/ssp. © 2013.
Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.
Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong
2015-06-09
Differentiation of metazoan cells requires execution of different gene expression programs, but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats, consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than being solely molecular noise.
Computational analysis of conserved RNA secondary structure in transcriptomes and genomes.
Eddy, Sean R
2014-01-01
Transcriptomics experiments and computational predictions both enable systematic discovery of new functional RNAs. However, many putative noncoding transcripts arise instead from artifacts and biological noise, and current computational prediction methods have high false positive rates. I discuss prospects for improving computational methods for analyzing and identifying functional RNAs, with a focus on detecting signatures of conserved RNA secondary structure. An interesting new front is the application of chemical and enzymatic experiments that probe RNA structure on a transcriptome-wide scale. I review several proposed approaches for incorporating structure probing data into the computational prediction of RNA secondary structure. Using probabilistic inference formalisms, I show how all these approaches can be unified in a well-principled framework, which in turn allows RNA probing data to be easily integrated into a wide range of analyses that depend on RNA secondary structure inference. Such analyses include homology search and genome-wide detection of new structural RNAs.
Kairov, Ulykbek; Cantini, Laura; Greco, Alessandro; Molkenov, Askhat; Czerwinska, Urszula; Barillot, Emmanuel; Zinovyev, Andrei
2017-09-11
Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a fundamental parameter: the number of components (factors) to compute. The optimal choice of this parameter, related to determining the effective data dimension, remains an open question in the application of blind source separation techniques to transcriptomic data. Here we address the question of optimizing the number of statistically independent components in the analysis of transcriptomic data for reproducibility of the components in multiple runs of ICA (within the same or within varying effective dimensions) and in multiple independent datasets. To this end, we introduce ranking of independent components based on their stability in multiple ICA computation runs and define a distinguished number of components (Most Stable Transcriptome Dimension, MSTD) corresponding to the point of the qualitative change of the stability profile. Based on a large body of data, we demonstrate that a sufficient number of dimensions is required for biological interpretability of the ICA decomposition and that the most stable components with ranks below MSTD have more chances to be reproduced in independent studies compared to the less stable ones. At the same time, we show that a transcriptomics dataset can be reduced to a relatively high number of dimensions without losing the interpretability of ICA, even though higher dimensions give rise to components driven by small gene sets. We suggest a protocol of ICA application to transcriptomics data with a possibility of prioritizing components with respect to their reproducibility that strengthens the biological interpretation. Computing too few components (much less than MSTD) is not optimal for interpretability of the results. The components ranked within MSTD range have more chances to be reproduced in independent studies.
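The idea of ranking independent components by their stability across repeated ICA runs can be illustrated with a minimal sketch. It assumes the components from each run are already available as plain lists of gene weights, and it scores a component by its best absolute Pearson correlation with any component of each other run, averaged over those runs. The function names, the toy data and this particular score are illustrative assumptions, not the stability index used by the authors.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def stability_ranking(reference_run, other_runs):
    """Rank components of reference_run by how well other runs reproduce them.

    Each run is a list of components; each component is a list of gene weights.
    The sign of an ICA component is arbitrary, so absolute correlation is used.
    Returns (component index, score) pairs, most stable first.
    """
    scores = []
    for idx, comp in enumerate(reference_run):
        # Best match of this component within each of the other runs.
        best = [max(abs(pearson(comp, other)) for other in run) for run in other_runs]
        scores.append((idx, sum(best) / len(best)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: three runs, two components each, five genes.
run_a = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, -1.0, 0.5, 3.0, -2.0]]
run_b = [[-1.1, -2.0, -2.9, -4.2, -5.0], [0.3, 1.0, -1.0, 0.2, 0.9]]
run_c = [[0.9, 2.1, 3.0, 3.9, 5.1], [-1.0, 2.0, 0.0, 1.0, -0.5]]

ranking = stability_ranking(run_a, [run_b, run_c])
```

In this toy case the first component of run_a is recovered (up to sign) in both other runs and ranks first, while the second is not reproduced and ranks last. A stability profile of this kind, computed for increasing numbers of components, is the sort of curve on which a change point such as the MSTD would be sought.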
A large-scale full-length cDNA analysis to explore the budding yeast transcriptome
Miura, Fumihito; Kawaguchi, Noriko; Sese, Jun; Toyoda, Atsushi; Hattori, Masahira; Morishita, Shinichi; Ito, Takashi
2006-01-01
We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from the cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including those affecting current ORF annotations and those spliced alternatively. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves to generate sense transcripts starting from inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought, and it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of the eukaryotic cells. The budding yeast will serve as a versatile model for the studies on these aspects of transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies. PMID:17101987
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haggard, Derik E.; Noyes, Pamela D.; Waters, Katrina M.
There is a need to develop novel, high-throughput screening and prioritization methods to identify chemicals with adverse estrogen, androgen, and thyroid activity in order to protect human health and the environment; such methods are of interest to the Endocrine Disruptor Screening Program. The current aim is to explore the utility of zebrafish as a testing paradigm to classify endocrine activity using phenotypically anchored transcriptome profiling. Transcriptome analysis was conducted on embryos exposed to 25 estrogen-, androgen-, or thyroid-active chemicals at a concentration that elicited adverse malformations or mortality at 120 hours post-fertilization in 80% of the animals exposed. Analysis of the top 1000 significant differentially expressed transcripts across all treatments identified a unique transcriptional and phenotypic profile for thyroid hormone receptor agonists, which can be used as a biomarker screen for potential thyroid hormone agonists.
Transcriptome analysis and related databases of Lactococcus lactis.
Kuipers, Oscar P; de Jong, Anne; Baerends, Richard J S; van Hijum, Sacha A F T; Zomer, Aldert L; Karsens, Harma A; den Hengst, Chris D; Kramer, Naomi E; Buist, Girbe; Kok, Jan
2002-08-01
Several complete genome sequences of Lactococcus lactis and their annotations will become available in the near future, next to the already published genome sequence of L. lactis ssp. lactis IL 1403. This will allow intraspecies comparative genomics studies as well as functional genomics studies aimed at a better understanding of physiological processes and regulatory networks operating in lactococci. This paper describes the initial set-up of a DNA-microarray facility in our group, to enable transcriptome analysis of various Gram-positive bacteria, including a ssp. lactis and a ssp. cremoris strain of Lactococcus lactis. Moreover a global description will be given of the hardware and software requirements for such a set-up, highlighting the crucial integration of relevant bioinformatics tools and methods. This includes the development of MolGenIS, an information system for transcriptome data storage and retrieval, and LactococCye, a metabolic pathway/genome database of Lactococcus lactis.
2017-01-01
Mapping gene expression as a quantitative trait using whole-genome sequencing and transcriptome analysis allows the discovery of the functional consequences of genetic variation. We developed a novel method and ultra-fast software, Findr, for highly accurate causal inference between gene expression traits using cis-regulatory DNA variations as causal anchors, which improves on current methods by taking into consideration hidden confounders and weak regulations. Findr outperformed existing methods on the DREAM5 Systems Genetics challenge and on the prediction of microRNA and transcription factor targets in human lymphoblastoid cells, while being nearly a million times faster. Findr is publicly available at https://github.com/lingfeiwang/findr. PMID:28821014
The technology and biology of single-cell RNA sequencing.
Kolodziejczyk, Aleksandra A; Kim, Jong Kyoung; Svensson, Valentine; Marioni, John C; Teichmann, Sarah A
2015-05-21
The differences between individual cells can have profound functional consequences, in both unicellular and multicellular organisms. Recently developed single-cell mRNA-sequencing methods enable unbiased, high-throughput, and high-resolution transcriptomic analysis of individual cells. This provides an additional dimension to transcriptomic information relative to traditional methods that profile bulk populations of cells. Already, single-cell RNA-sequencing methods have revealed new biology in terms of the composition of tissues, the dynamics of transcription, and the regulatory relationships between genes. Rapid technological developments at the level of cell capture, phenotyping, molecular biology, and bioinformatics promise an exciting future with numerous biological and medical applications. Copyright © 2015 Elsevier Inc. All rights reserved.
Blood transcriptomics and metabolomics for personalized medicine.
Li, Shuzhao; Todor, Andrei; Luo, Ruiyan
2016-01-01
Molecular analysis of blood samples is pivotal to clinical diagnosis and has been intensively investigated since the rise of systems biology. Recent developments have opened new opportunities to utilize transcriptomics and metabolomics for personalized and precision medicine. Efforts from human immunology have infused into this area exquisite characterizations of subpopulations of blood cells. It is now possible to infer from blood transcriptomics, with fine accuracy, the contribution of immune activation and of cell subpopulations. In parallel, high-resolution mass spectrometry has brought revolutionary analytical capability, detecting > 10,000 metabolites, together with environmental exposure, dietary intake, microbial activity, and pharmaceutical drugs. Thus, the re-examination of blood chemicals by metabolomics is in order. Transcriptomics and metabolomics can be integrated to provide a more comprehensive understanding of the human biological states. We will review these new data and methods and discuss how they can contribute to personalized medicine.
Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R
2016-05-27
Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena), for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway-driven disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome as an exemplar, and recover two widely used psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena outperforms the Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or at https://github.com/zhilongjia/cogena.
In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and repositioning, allowing the grouping and prioritisation of drug repositioning candidates on the basis of putative mode of action.
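Gene set enrichment of a co-expressed cluster is commonly assessed with a one-sided hypergeometric test, which can be sketched in a few lines. The sketch below assumes that test; the abstract does not state which test cogena uses, and the function name and toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, gene_set_size, cluster_size, overlap):
    """One-sided p-value P(X >= overlap) under the hypergeometric null.

    X counts how many genes of a random cluster of `cluster_size` genes,
    drawn without replacement from `universe_size` genes, fall in a gene
    set of `gene_set_size` genes.
    """
    upper = min(gene_set_size, cluster_size)
    total = comb(universe_size, cluster_size)
    tail = sum(
        comb(gene_set_size, k) * comb(universe_size - gene_set_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy example: 20 measured genes, a 5-gene pathway and a
# 5-gene co-expressed cluster that share 3 genes.
universe = {f"g{i}" for i in range(20)}
pathway = {"g0", "g1", "g2", "g3", "g4"}
cluster = {"g2", "g3", "g4", "g10", "g11"}

p_value = hypergeom_enrichment_p(
    len(universe), len(pathway), len(cluster), len(pathway & cluster)
)
```

In practice such a p-value would be computed for every cluster and gene set pair and then corrected for multiple testing before pathways or drug signatures are ranked.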
A practical examination of RNA isolation methods for European pear (Pyrus communis)
USDA-ARS's Scientific Manuscript database
With the goal of identifying fast, reliable and broadly applicable RNA isolation methods in European pear fruit for downstream transcriptome analysis, we evaluated several commercially available kit-based RNA isolation methods, plus our modified version of a published cetyl trimethyl ammonium bromi...
Kunnath-Velayudhan, Shajo; Goldberg, Michael F; Saini, Neeraj K; Johndrow, Christopher T; Ng, Tony W; Johnson, Alison J; Xu, Jiayong; Chan, John; Jacobs, William R; Porcelli, Steven A
2017-10-01
Analysis of Ag-specific CD4+ T cells in mycobacterial infections at the transcriptome level is informative but technically challenging. Although several methods exist for identifying Ag-specific T cells, including intracellular cytokine staining, cell surface cytokine-capture assays, and staining with peptide:MHC class II multimers, all of these have significant technical constraints that limit their usefulness. Measurement of activation-induced expression of CD154 has been reported to detect live Ag-specific CD4+ T cells, but this approach remains underexplored and, to our knowledge, has not previously been applied in mycobacteria-infected animals. In this article, we show that CD154 expression identifies adoptively transferred or endogenous Ag-specific CD4+ T cells induced by Mycobacterium bovis bacillus Calmette-Guérin vaccination. We confirmed that Ag-specific cytokine production was positively correlated with CD154 expression by CD4+ T cells from bacillus Calmette-Guérin-vaccinated mice and show that high-quality microarrays can be performed from RNA isolated from CD154+ cells purified by cell sorting. Analysis of microarray data demonstrated that the transcriptome of CD4+ CD154+ cells was distinct from that of CD154− cells and showed major enrichment of transcripts encoding multiple cytokines and pathways of cellular activation. One notable finding was the identification of a previously unrecognized subset of mycobacteria-specific CD4+ T cells that is characterized by the production of IL-3. Our results support the use of CD154 expression as a practical and reliable method to isolate live Ag-specific CD4+ T cells for transcriptomic analysis and potentially for a range of other studies in infected or previously immunized hosts. Copyright © 2017 by The American Association of Immunologists, Inc.
The aquatic animals' transcriptome resource for comparative functional analysis.
Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da
2018-05-09
Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis for more than twenty aquatic organisms without a draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and BLAST databases and phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Trivedi, G.; Sreedasyam, A.
2010-07-06
Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes.
The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
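The proportions reported for the 1501 gene models can be checked with a short piece of arithmetic. The variable names below are illustrative only; the counts are taken from the abstract.

```python
# Counts as reported in the abstract.
total_models = 1501
resolved = 1439      # models with all error flags resolved
unresolved = 62      # models that could not be corrected

# The two groups should partition the full set.
assert resolved + unresolved == total_models

# Percentages rounded to whole numbers, as in the abstract.
resolved_pct = round(100 * resolved / total_models)
unresolved_pct = round(100 * unresolved / total_models)
```

The rounded values agree with the 96% and 4% stated in the abstract.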
Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data.
Tomescu, Oana A; Mattanovich, Diethard; Thallinger, Gerhard G
2014-01-01
Technological improvements have shifted the focus from data generation to data analysis. The availability of large amounts of data from transcriptomics, proteomics and metabolomics experiments raises new questions concerning suitable integrative analysis methods. We compare three integrative analysis techniques (co-inertia analysis, generalized singular value decomposition and integrative biclustering) by applying them to gene and protein abundance data from the six life cycle stages of Plasmodium falciparum. Co-inertia analysis (CIA) is an analysis method used to visualize and explore gene and protein data. The generalized singular value decomposition (GSVD) has shown its potential in the analysis of two transcriptome data sets. Integrative biclustering (IBC) applies biclustering to gene and protein data. Using CIA, we visualize the six life cycle stages of Plasmodium falciparum, as well as GO terms in a 2D plane and interpret the spatial configuration. With GSVD, we decompose the transcriptomic and proteomic data sets into matrices with biologically meaningful interpretations and explore the processes captured by the data sets. IBC identifies groups of genes, proteins, GO terms and life cycle stages of Plasmodium falciparum. We show method-specific results as well as a network view of the life cycle stages based on the results common to all three methods. Additionally, by combining the results of the three methods, we create a three-fold validated network of life cycle stage specific GO terms: sporozoites are associated with transcription and transport; merozoites with entry into host cell as well as biosynthetic and metabolic processes; rings with oxidation-reduction processes; trophozoites with glycolysis and energy production; schizonts with antigenic variation and immune response; gametocytes with DNA packaging and mitochondrial transport. Furthermore, the network connectivity underlines the separation of the intraerythrocytic cycle from the gametocyte and sporozoite stages.
Using integrative analysis techniques, we can integrate knowledge from different levels and obtain a wider view of the system under study. The overlap between method-specific and common results is considerable, even if the basic mathematical assumptions are very different. The three-fold validated network of life cycle stage characteristics of Plasmodium falciparum could identify a large amount of the known associations from literature in only one study.
Integrative omics analysis. A study based on Plasmodium falciparum mRNA and protein data
2014-01-01
Background Technological improvements have shifted the focus from data generation to data analysis. The availability of large amounts of data from transcriptomics, proteomics and metabolomics experiments raises new questions concerning suitable integrative analysis methods. We compare three integrative analysis techniques (co-inertia analysis, generalized singular value decomposition and integrative biclustering) by applying them to gene and protein abundance data from the six life cycle stages of Plasmodium falciparum. Co-inertia analysis is an analysis method used to visualize and explore gene and protein data. The generalized singular value decomposition has shown its potential in the analysis of two transcriptome data sets. Integrative Biclustering applies biclustering to gene and protein data. Results Using CIA, we visualize the six life cycle stages of Plasmodium falciparum, as well as GO terms in a 2D plane and interpret the spatial configuration. With GSVD, we decompose the transcriptomic and proteomic data sets into matrices with biologically meaningful interpretations and explore the processes captured by the data sets. IBC identifies groups of genes, proteins, GO Terms and life cycle stages of Plasmodium falciparum. We show method-specific results as well as a network view of the life cycle stages based on the results common to all three methods. Additionally, by combining the results of the three methods, we create a three-fold validated network of life cycle stage specific GO terms: Sporozoites are associated with transcription and transport; merozoites with entry into host cell as well as biosynthetic and metabolic processes; rings with oxidation-reduction processes; trophozoites with glycolysis and energy production; schizonts with antigenic variation and immune response; gametocytes with DNA packaging and mitochondrial transport.
Furthermore, the network connectivity underlines the separation of the intraerythrocytic cycle from the gametocyte and sporozoite stages. Conclusion Using integrative analysis techniques, we can integrate knowledge from different levels and obtain a wider view of the system under study. The overlap between method-specific and common results is considerable, even if the basic mathematical assumptions are very different. The three-fold validated network of life cycle stage characteristics of Plasmodium falciparum could identify a large amount of the known associations from literature in only one study. PMID:25033389
2014-01-01
Background Clinically useful biomarkers for patient stratification and monitoring of disease progression and drug response are in high demand in drug development and for addressing potential safety concerns. Many diseases influence the frequency and phenotype of cells found in the peripheral blood and the transcriptome of blood cells. Changes in cell type composition influence whole blood gene expression analysis results and thus the discovery of true transcript level changes remains a challenge. We propose a robust and reproducible procedure, which includes whole transcriptome gene expression profiling of major subsets of immune cells directly sorted from whole blood. Methods Target cells were enriched using magnetic microbeads and an autoMACS® Pro Separator (Miltenyi Biotec). Flow cytometric analysis for purity was performed before and after magnetic cell sorting. Total RNA was hybridized on HGU133 Plus 2.0 expression microarrays (Affymetrix, USA). CEL file signal intensity values were condensed using RMA and a custom CDF file (EntrezGene-based). Results Positive selection by use of MACS® Technology coupled to transcriptomics was assessed for eight different peripheral blood cell types, CD14+ monocytes, CD3+, CD4+, or CD8+ T cells, CD15+ granulocytes, CD19+ B cells, CD56+ NK cells, and CD45+ pan leukocytes. RNA quality from enriched cells was above a RIN of eight. GeneChip analysis confirmed cell type specific transcriptome profiles. Storing whole blood collected in an EDTA Vacutainer® tube at 4°C followed by MACS does not activate sorted cells. Gene expression analysis supports cell enrichment measurements by MACS. Conclusions The proposed workflow generates reproducible cell-type specific transcriptome data which can be translated to clinical settings and used to identify clinically relevant gene expression biomarkers from whole blood samples.
This procedure enables the integration of transcriptomics of relevant immune cell subsets sorted directly from whole blood in clinical trial protocols. PMID:25984272
This week, we are excited to announce the launch of the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) Proteogenomics Computational DREAM Challenge. The aim of this Challenge is to encourage the generation of computational methods for extracting information from the cancer proteome and for linking those data to genomic and transcriptomic information. The specific goals are to predict proteomic and phosphoproteomic data from other multiple data types including transcriptomics and genetics.
Transcriptomic Analysis of the Salivary Glands of an Invasive Whitefly
Su, Yun-Lin; Li, Jun-Min; Li, Meng; Luan, Jun-Bo; Ye, Xiao-Dong; Wang, Xiao-Wei; Liu, Shu-Sheng
2012-01-01
Background Some species of the whitefly Bemisia tabaci complex cause tremendous losses to crops worldwide, directly through feeding and indirectly through virus transmission. The primary salivary glands of whiteflies are critical for their feeding and virus transmission. However, partly due to their tiny size, research on whitefly salivary glands is limited and our knowledge of these glands is scarce. Methodology/Principal Findings We sequenced the transcriptome of the primary salivary glands of the Mediterranean species of B. tabaci complex using an effective cDNA amplification method in combination with short read sequencing (Illumina). In a single run, we obtained 13,615 unigenes. The number of unigenes obtained from the salivary glands of the whitefly is at least four times the number of salivary gland genes from other plant-sucking insects. To reveal the functions of the primary glands, sequence similarity search and comparisons with the whole transcriptome of the whitefly were performed. The results demonstrated that the genes related to metabolism and transport were significantly enriched in the primary salivary glands. Furthermore, we found that a number of highly expressed genes in the salivary glands might be involved in secretory protein processing, secretion and virus transmission. To identify potential proteins of whitefly saliva, the translated unigenes were subjected to secretory protein prediction. Finally, 295 genes were predicted to encode secretory proteins and some of them might play important roles in whitefly feeding. Conclusions/Significance: The combined method of cDNA amplification, Illumina sequencing and de novo assembly is suitable for transcriptomic analysis of tiny organs in insects. Through analysis of the transcriptome, genomic features of the primary salivary glands were dissected and biologically important proteins, especially secreted proteins, were predicted.
Our findings provide substantial sequence information for the primary salivary glands of whiteflies and will be the basis for future studies on whitefly-plant interactions and virus transmission. PMID:22745728
Takahara, Hiroyuki; Dolf, Andreas; Endl, Elmar; O'Connell, Richard
2009-08-01
Generation of stage-specific cDNA libraries is a powerful approach to identify pathogen genes that are differentially expressed during plant infection. Biotrophic pathogens develop specialized infection structures inside living plant cells, but sampling the transcriptome of these structures is problematic due to the low ratio of fungal to plant RNA, and the lack of efficient methods to isolate them from infected plants. Here we established a method, based on fluorescence-activated cell sorting (FACS), to purify the intracellular biotrophic hyphae of Colletotrichum higginsianum from homogenates of infected Arabidopsis leaves. Specific selection of viable hyphae using a fluorescent vital marker provided intact RNA for cDNA library construction. Pilot-scale sequencing showed that the library was enriched with plant-induced and pathogenicity-related fungal genes, including some encoding small, soluble secreted proteins that represent candidate fungal effectors. The high purity of the hyphae (94%) prevented contamination of the library by sequences derived from host cells or other fungal cell types. RT-PCR confirmed that genes identified in the FACS-purified hyphae were also expressed in planta. The method has wide applicability for isolating the infection structures of other plant pathogens, and will facilitate cell-specific transcriptome analysis via deep sequencing and microarray hybridization, as well as proteomic analyses.
Revealing gene regulation and association through biological networks
USDA-ARS's Scientific Manuscript database
This review first summarizes traditional methods used by plant breeders for genetic improvement, such as QTL analysis and transcriptomic analysis. With accumulating data, we can draw a network that comprises all possible links between members of a community, including protein–protein interaction...
Expression Profiling Smackdown: Human Transcriptome Array HTA 2.0 vs. RNA-Seq
Palermo, Meghann; Driscoll, Heather; Tighe, Scott; Dragon, Julie; Bond, Jeff; Shukla, Arti; Vangala, Mahesh; Vincent, James; Hunter, Tim
2014-01-01
The advent of both microarray and massively parallel sequencing has revolutionized high-throughput analysis of the human transcriptome. Due to limitations in microarray technology, detecting and quantifying coding transcript isoforms, in addition to non-coding transcripts, has been challenging. As a result, RNA-Seq has been the preferred method for characterizing the full human transcriptome, until now. A new high-resolution array from Affymetrix, GeneChip Human Transcriptome Array 2.0 (HTA 2.0), has been designed to interrogate all transcript isoforms in the human transcriptome with >6 million probes targeting coding transcripts, exon-exon splice junctions, and non-coding transcripts. Here we compare expression results from GeneChip HTA 2.0 and RNA-Seq data using identical RNA extractions from three samples each of healthy human mesothelial cells in culture, LP9-C1, and healthy mesothelial cells treated with asbestos, LP9-A1. For GeneChip HTA 2.0 sample preparation, we chose to compare two target preparation methods, NuGEN Ovation Pico WTA V2 with the Encore Biotin Module versus Affymetrix's GeneChip WT PLUS with the WT Terminal Labeling Kit, on identical RNA extractions from both untreated and treated samples. These same RNA extractions were used for the RNA-Seq library preparation. All analyses were performed in Partek Genomics Suite 6.6. Expression profiles for control and asbestos-treated mesothelial cells prepared with NuGEN versus Affymetrix target preparation methods (GeneChip HTA 2.0) are compared to each other as well as to RNA-Seq results.
The bench scientist's guide to RNA-Seq analysis
USDA-ARS's Scientific Manuscript database
RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance. However, analyses of the large data sets obtained by sequencing the entire transcriptome of organisms have generally been performed by bioinformatic specialists. Here we outline a methods strategy desi...
Transcriptional profiling: a potential anti-doping strategy.
Rupert, J L
2009-12-01
Evolving challenges require evolving responses. The use of illicit performance enhancing drugs by athletes permeates the reality and the perception of elite sports. New drugs with ergogenic or masking potential are quickly adopted, driven by a desire to win and the necessity of avoiding detection. To counter this trend, anti-doping authorities are continually refining existing assays and developing new testing strategies. In the post-genome era, genetic- and molecular-based tests are being evaluated as potential approaches to detect new and sophisticated forms of doping. Transcriptome analysis, in which a tissue's complement of mRNA transcripts is characterized, is one such method. The quantity and composition of a tissue's transcriptome are highly reflective of milieu and metabolic activity. There is much interest in transcriptional profiling in medical diagnostics and, as transcriptional information can be obtained from a variety of easily accessed tissues, similar approaches could be used in doping control. This article briefly reviews current understanding of the transcriptome, common methods of global analysis of gene expression and non-invasive sample sources. While the focus of this article is on anti-doping, the principles and methodology described could be applied to any research in which non-invasive, yet biologically informative sampling is desired.
Spatial transcriptomic analysis of cryosectioned tissue samples with Geo-seq.
Chen, Jun; Suo, Shengbao; Tam, Patrick Pl; Han, Jing-Dong J; Peng, Guangdun; Jing, Naihe
2017-03-01
Conventional gene expression studies analyze multiple cells simultaneously or single cells, for which the exact in vivo or in situ position is unknown. Although cellular heterogeneity can be discerned when analyzing single cells, any spatially defined attributes that underpin the heterogeneous nature of the cells cannot be identified. Here, we describe how to use Geo-seq, a method that combines laser capture microdissection (LCM) and single-cell RNA-seq technology. The combination of these two methods enables the elucidation of cellular heterogeneity and spatial variance simultaneously. The Geo-seq protocol allows the profiling of transcriptome information from only a small number of cells and retains their native spatial information. This protocol has wide potential applications to address biological and pathological questions of cellular properties such as prospective cell fates, biological function and the gene regulatory network. Geo-seq has been applied to investigate the spatial transcriptome of mouse early embryo, mouse brain, and pathological liver and sperm tissues. The entire protocol from tissue collection and microdissection to sequencing requires ∼5 d; data analysis takes another 1 or 2 weeks, depending on the amount of data and the speed of the processor.
Safo, Sandra E; Li, Shuzhao; Long, Qi
2018-03-01
Integrative analysis of high dimensional omics data is becoming increasingly popular. At the same time, incorporating known functional relationships among variables in analysis of omics data has been shown to help elucidate underlying mechanisms for complex diseases. In this article, our goal is to assess association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study that includes healthy adults at a high risk of developing cardiovascular diseases. Adopting a strategy that is both data-driven and knowledge-based, we develop statistical methods for sparse canonical correlation analysis (CCA) with incorporation of known biological information. Our proposed methods use prior network structural information among genes and among metabolites to guide selection of relevant genes and metabolites in sparse CCA, providing insight on the molecular underpinning of cardiovascular disease. Our simulations demonstrate that the structured sparse CCA methods outperform several existing sparse CCA methods in selecting relevant genes and metabolites when structural information is informative and are robust to mis-specified structural information. Our analysis of the PHI study reveals that a number of gene and metabolic pathways including some known to be associated with cardiovascular diseases are enriched in the set of genes and metabolites selected by our proposed approach. © 2017, The International Biometric Society.
Melicher, Dacotah; Torson, Alex S; Dworkin, Ian; Bowsher, Julia H
2014-03-12
The Sepsidae family of flies is a model for investigating how sexual selection shapes courtship and sexual dimorphism in a comparative framework. However, like many non-model systems, there are few molecular resources available. Large-scale sequencing and assembly have not been performed in any sepsid, and the lack of a closely related genome makes investigation of gene expression challenging. Our goal was to develop an automated pipeline for de novo transcriptome assembly, and to use that pipeline to assemble and analyze the transcriptome of the sepsid Themira biloba. Our bioinformatics pipeline uses cloud computing services to assemble and analyze the transcriptome with off-site data management, processing, and backup. It uses a multiple k-mer length approach combined with a second meta-assembly to extend transcripts and recover more bases of transcript sequences than standard single k-mer assembly. We used 454 sequencing to generate 1.48 million reads from cDNA generated from embryo, larva, and pupae of T. biloba and assembled a transcriptome consisting of 24,495 contigs. Annotation identified 16,705 transcripts, including those involved in embryogenesis and limb patterning. We assembled transcriptomes from an additional three non-model organisms to demonstrate that our pipeline assembled a higher-quality transcriptome than single k-mer approaches across multiple species. The pipeline we have developed for assembly and analysis increases contig length, recovers unique transcripts, and assembles more base pairs than other methods through the use of a meta-assembly. The T. biloba transcriptome is a critical resource for performing large-scale RNA-Seq investigations of gene expression patterns, and is the first transcriptome sequenced in this Dipteran family.
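The multiple k-mer idea mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline: the function name `kmer_counts`, the toy reads and the k values are assumptions chosen only for illustration. It shows that the same reads decompose into different k-mer profiles at different k, which is the property a multi-k assembly followed by a meta-assembly tries to exploit.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every overlapping substring of length k across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Toy reads (hypothetical, not taken from the T. biloba data set).
reads = ["ACGTACGT", "CGTACGTA"]

# Hypothetical k values; real assemblers typically use much larger k.
profiles = {k: kmer_counts(reads, k) for k in (3, 5)}

for k, counts in profiles.items():
    # Report k, the number of distinct k-mers, and the total k-mer count.
    print(k, len(counts), sum(counts.values()))
```

Shorter k-mers tend to connect low-coverage transcripts, whereas longer k-mers resolve repeats better, so combining several values of k can recover sequence that a single k misses.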
Lamm, Ayelet T; Stadler, Michael R; Zhang, Huibin; Gent, Jonathan I; Fire, Andrew Z
2011-02-01
We have used a combination of three high-throughput RNA capture and sequencing methods to refine and augment the transcriptome map of a well-studied genetic model, Caenorhabditis elegans. The three methods include a standard (non-directional) library preparation protocol relying on cDNA priming and foldback that has been used in several previous studies for transcriptome characterization in this species, and two directional protocols, one involving direct capture of single-stranded RNA fragments and one involving circular-template PCR (CircLigase). We find that each RNA-seq approach shows specific limitations and biases, with the application of multiple methods providing a more complete map than was obtained from any single method. Of particular note in the analysis were substantial advantages of CircLigase-based and ssRNA-based capture for defining sequences and structures of the precise 5' ends (which were lost using the double-strand cDNA capture method). Of the three methods, ssRNA capture was most effective in defining sequences to the poly(A) junction. Using data sets from a spectrum of C. elegans strains and stages and the UCSC Genome Browser, we provide a series of tools, which facilitate rapid visualization and assignment of gene structures.
Bizama, Carolina; Benavente, Felipe; Salvatierra, Edgardo; Gutiérrez-Moraga, Ana; Espinoza, Jaime A; Fernández, Elmer A; Roa, Iván; Mazzolini, Guillermo; Sagredo, Eduardo A; Gidekel, Manuel; Podhajcer, Osvaldo L
2014-02-15
Studies on the low-abundance transcriptome are of paramount importance for identifying the intimate mechanisms of tumor progression that can lead to novel therapies. The aim of the present study was to identify novel markers and targetable genes and pathways in advanced human gastric cancer through analyses of the low-abundance transcriptome. The procedure involved an initial subtractive hybridization step, followed by global gene expression analysis using microarrays. We observed profound differences, both at the single gene and gene ontology levels, between the low-abundance transcriptome and the whole transcriptome. Analysis of the low-abundance transcriptome led to the identification and validation by tissue microarrays of novel biomarkers, such as LAMA3 and TTN; moreover, we identified cancer type-specific intracellular pathways and targetable genes, such as IRS2, IL17, IFNγ, VEGF-C, WISP1, FZD5 and CTBP1 that were not detectable by whole transcriptome analyses. We also demonstrated that knocking down the expression of CTBP1 sensitized gastric cancer cells to mainstay chemotherapeutic drugs. We conclude that the analysis of the low-abundance transcriptome provides useful insights into the molecular basis and treatment of cancer. © 2013 UICC.
Sun, Li Xue; Teng, Jian; Zhao, Yan; Li, Ning; Wang, Hui
2018-01-01
Background: Nowadays, the molecular mechanisms governing TSD (temperature-dependent sex determination) or GSD + TE (genotypic sex determination + temperature effects) remain a mystery in fish. Methods: We developed three all-female families of Nile tilapia (Oreochromis niloticus), and the family with the highest male ratio after high-temperature treatment was used for transcriptome analysis. Results: First, gonadal histology analysis indicated that the histological morphology of control females (CF) was not significantly different from that of high-temperature-treated females (TF) at various development stages. However, the high-temperature treatment caused a lag of spermatogenesis in high-temperature-induced neomales (IM). Next, we sequenced the transcriptome of CF, TF, and IM Nile tilapia. 79, 11,117, and 11,000 differentially expressed genes (DEGs) were detected in the CF–TF, CF–IM, and TF–IM comparisons, respectively, and 44 DEGs showed identical expression changes in the CF–TF and CF–IM comparisons. Principal component analysis (PCA) indicated that three individuals in CF and three individuals in TF formed a cluster, and three individuals in IM formed a distinct cluster, which confirmed that the gonad transcriptome profile of TF was similar to that of CF and different from that of IM. Finally, six sex-related genes were validated by qRT-PCR. Conclusions: This study identifies a number of genes that may be involved in GSD + TE, which will be useful for investigating the molecular mechanisms of TSD or GSD + TE in fish. PMID:29495590
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock.
Braga, D; Barcella, M; D'Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, M H; DeLano, F A; Baselli, G; Schmid-Schönbein, G W; Kistler, E B; Aletti, F; Barlassina, C
2017-08-01
Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wigger's shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients.
Giustacchini, Alice; Thongjuea, Supat; Barkas, Nikolaos; Woll, Petter S; Povinelli, Benjamin J; Booth, Christopher A G; Sopp, Paul; Norfo, Ruggiero; Rodriguez-Meira, Alba; Ashley, Neil; Jamieson, Lauren; Vyas, Paresh; Anderson, Kristina; Segerstolpe, Åsa; Qian, Hong; Olsson-Strömberg, Ulla; Mustjoki, Satu; Sandberg, Rickard; Jacobsen, Sten Eirik W; Mead, Adam J
2017-06-01
Recent advances in single-cell transcriptomics are ideally placed to unravel intratumoral heterogeneity and selective resistance of cancer stem cell (SC) subpopulations to molecularly targeted cancer therapies. However, current single-cell RNA-sequencing approaches lack the sensitivity required to reliably detect somatic mutations. We developed a method that combines high-sensitivity mutation detection with whole-transcriptome analysis of the same single cell. We applied this technique to analyze more than 2,000 SCs from patients with chronic myeloid leukemia (CML) throughout the disease course, revealing heterogeneity of CML-SCs, including the identification of a subgroup of CML-SCs with a distinct molecular signature that selectively persisted during prolonged therapy. Analysis of nonleukemic SCs from patients with CML also provided new insights into cell-extrinsic disruption of hematopoiesis in CML associated with clinical outcome. Furthermore, we used this single-cell approach to identify a blast-crisis-specific SC population, which was also present in a subclone of CML-SCs during the chronic phase in a patient who subsequently developed blast crisis. This approach, which might be broadly applied to any malignancy, illustrates how single-cell analysis can identify subpopulations of therapy-resistant SCs that are not apparent through cell-population analysis.
Wenger, Yvan; Galliot, Brigitte
2013-03-25
Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
2013-01-01
Background Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
Single-Cell RNA-Sequencing: Assessment of Differential Expression Analysis Methods.
Dal Molin, Alessandra; Baruzzo, Giacomo; Di Camillo, Barbara
2017-01-01
The sequencing of the transcriptomes of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differential expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data, but largely applied to single-cell data. We discuss results obtained on two real and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods' performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that it is difficult to identify a best performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.
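As a reminder of how precision and recall are computed when differential expression calls are scored against a known truth set (as is possible for a synthetic dataset), here is a minimal sketch. The gene identifiers, the two hypothetical tools and the helper name `precision_recall` are assumptions for illustration; they do not reproduce the study's data or its evaluation code.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for called genes against a truth set."""
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)
    # Guard against empty sets to avoid division by zero.
    precision = true_pos / len(called) if called else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical ground truth: genes that are truly differentially expressed.
truth = {"g1", "g2", "g3", "g4"}

# Hypothetical calls from a conservative tool and a permissive tool.
tool_a = {"g1", "g2", "g5"}
tool_b = {"g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"}

pa, ra = precision_recall(tool_a, truth)
pb, rb = precision_recall(tool_b, truth)
print(f"tool_a precision={pa:.2f} recall={ra:.2f}")
print(f"tool_b precision={pb:.2f} recall={rb:.2f}")
```

In this toy case the conservative tool has the higher precision and the permissive tool has the higher recall, which shows why a single best-performing tool can be hard to name.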
The cancer transcriptome is shaped by genetic changes, variation in gene transcription, mRNA processing, editing and stability, and the cancer microbiome. Deciphering this variation and understanding its implications on tumorigenesis requires sophisticated computational analyses. Most RNA-Seq analyses rely on methods that first map short reads to a reference genome, and then compare them to annotated transcripts or assemble them. However, this strategy can be limited when the cancer genome is substantially different than the reference or for detecting sequences from the cancer microbiome.
Wu, Jieying; Gao, Weimin; Zhang, Weiwen; Meldrum, Deirdre R
2011-01-01
Limitation in sample quality and quantity is one of the big obstacles for applying metatranscriptomic technologies to explore gene expression and functionality of microbial communities in natural environments. In this study, several amplification methods were evaluated for whole-transcriptome amplification of deep-sea microbial samples, which are of low cell density and high impurity. The best amplification method was identified and incorporated into a complete protocol to isolate and amplify deep-sea microbial samples. In the protocol, total RNA was first isolated by a modified method combining Trizol (Invitrogen, CA) and RNeasy (QIAGEN, CA) method, amplified with a WT-Ovation™ Pico RNA Amplification System (NuGEN, CA), and then converted to double-strand DNA from single-strand cDNA with a WT-Ovation™ Exon Module (NuGEN, CA). The products from the whole-transcriptome amplification of deep-sea microbial samples were assessed first through random clone library sequencing. The BLAST search results showed that marine-based sequences are dominant in the libraries, consistent with the ecological source of the samples. The products were then used for next-generation Roche GS FLX Titanium sequencing to obtain metatranscriptome data. Preliminary analysis of the metatranscriptomic data showed good sequencing quality. Although the protocol was designed and demonstrated to be effective for deep-sea microbial samples, it should be applicable to similar samples from other extreme environments in exploring community structure and functionality of microbial communities. Copyright © 2010 Elsevier B.V. All rights reserved.
Li, Jing-Woei; Lee, Heung-Man; Wang, Ying; Tong, Amy Hin-Yan; Yip, Kevin Y.; Tsui, Stephen Kwok-Wing; Lok, Si; Ozaki, Risa; Luk, Andrea O; Kong, Alice P. S.; So, Wing-Yee; Ma, Ronald C. W.; Chan, Juliana C. N.; Chan, Ting-Fung
2016-01-01
Protein interactions play significant roles in complex diseases. We analyzed the peripheral blood mononuclear cell (PBMC) transcriptome using a multi-method strategy. We constructed a tissue-specific interactome (T2Di) and identified 420 molecular signatures associated with T2D-related comorbidity and symptoms, mainly implicated in inflammation, adipogenesis, protein phosphorylation and hormonal secretion. Apart from explaining the residual associations within the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) study, the T2Di signatures were enriched in pathogenic cell type-specific regulatory elements related to fetal development, immunity and expression quantitative trait loci (eQTL). The T2Di revealed a novel locus near a well-established GWAS locus, AChE, in which SRRT interacts with JAZF1, a T2D-GWAS gene implicated in pancreatic function. The T2Di also included known anti-diabetic drug targets (e.g. PPARD, MAOB) and identified possible druggable targets (e.g. NCOR2, PDGFR). These T2Di signatures were validated by an independent computational method, and by expression data of pancreatic islet, muscle and liver, with some of the signatures (CEBPB, SREBF1, MLST8, SRF, SRRT and SLC12A9) confirmed in PBMC from an independent cohort of 66 T2D and 66 control subjects. By combining prior knowledge and transcriptome analysis, we have constructed an interactome to explain the multi-layered regulatory pathways in T2D. PMID:27752041
Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun
2013-01-01
Background Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocids species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. Methodology and Principal Findings In this study, we generated de novo assembly of L. bostrychophila transcriptome performed through the short read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila including egg, nymph and adult were annotated with non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. Besides, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and designing of new compounds for control of psocids. PMID:24278202
Walker, Joseph F; Yang, Ya; Feng, Tao; Timoneda, Alfonso; Mikenas, Jessica; Hutchison, Vera; Edwards, Caroline; Wang, Ning; Ahluwalia, Sonia; Olivieri, Julia; Walker-Hale, Nathanael; Majure, Lucas C; Puente, Raúl; Kadereit, Gudrun; Lauterbach, Maximilian; Eggli, Urs; Flores-Olvera, Hilda; Ochoterena, Helga; Brockington, Samuel F; Moore, Michael J; Smith, Stephen A
2018-03-01
The Caryophyllales contain ~12,500 species and are known for their cosmopolitan distribution, convergence of trait evolution, and extreme adaptations. Some relationships within the Caryophyllales, like those of many large plant clades, remain unclear, and phylogenetic studies often recover alternative hypotheses. We explore the utility of broad and dense transcriptome sampling across the order for resolving evolutionary relationships in Caryophyllales. We generated 84 transcriptomes and combined these with 224 publicly available transcriptomes to perform a phylogenomic analysis of Caryophyllales. To overcome the computational challenge of ortholog detection in such a large data set, we developed an approach for clustering gene families that allowed us to analyze >300 transcriptomes and genomes. We then inferred the species relationships using multiple methods and performed gene-tree conflict analyses. Our phylogenetic analyses resolved many clades with strong support, but also showed significant gene-tree discordance. This discordance is not only a common feature of phylogenomic studies, but also represents an opportunity to understand processes that have structured phylogenies. We also found taxon sampling influences species-tree inference, highlighting the importance of more focused studies with additional taxon sampling. Transcriptomes are useful both for species-tree inference and for uncovering evolutionary complexity within lineages. Through analyses of gene-tree conflict and multiple methods of species-tree inference, we demonstrate that phylogenomic data can provide unparalleled insight into the evolutionary history of Caryophyllales. We also discuss a method for overcoming computational challenges associated with homolog clustering in large data sets. © 2018 The Authors. American Journal of Botany is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America.
Castandet, Benoît; Hotto, Amber M.; Strickler, Susan R.; ...
2016-07-06
Although RNA-Seq has revolutionized transcript analysis, organellar transcriptomes are rarely assessed even when present in published datasets. Here, we describe the development and application of a rapid and convenient method, ChloroSeq, to delineate qualitative and quantitative features of chloroplast RNA metabolism from strand-specific RNA-Seq datasets, including processing, editing, splicing, and relative transcript abundance. The use of a single experiment to analyze systematically chloroplast transcript maturation and abundance is of particular interest due to frequent pleiotropic effects observed in mutants that affect chloroplast gene expression and/or photosynthesis. To illustrate its utility, ChloroSeq was applied to published RNA-Seq datasets derived from Arabidopsis thaliana grown under control and abiotic stress conditions, where the organellar transcriptome had not been examined. The most appreciable effects were found for heat stress, which induces a global reduction in splicing and editing efficiency, and leads to increased abundance of chloroplast transcripts, including genic, intergenic, and antisense transcripts. Moreover, by concomitantly analyzing nuclear transcripts that encode chloroplast gene expression regulators from the same libraries, we demonstrate the possibility of achieving a holistic understanding of the nucleus-organelle system. In conclusion, ChloroSeq thus represents a unique method for streamlining RNA-Seq data interpretation of the chloroplast transcriptome and its regulators.
Sinicropi, Dominick; Qu, Kunbin; Collin, Francois; Crager, Michael; Liu, Mei-Lan; Pelham, Robert J.; Pho, Mylan; Rossi, Andrew Dei; Jeong, Jennie; Scott, Aaron; Ambannavar, Ranjana; Zheng, Christina; Mena, Raul; Esteban, Jose; Stephans, James; Morlan, John; Baker, Joffre
2012-01-01
RNA biomarkers discovered by RT-PCR-based gene expression profiling of archival formalin-fixed paraffin-embedded (FFPE) tissue form the basis for widely used clinical diagnostic tests; however, RT-PCR is practically constrained in the number of transcripts that can be interrogated. We have developed and optimized RNA-Seq library chemistry as well as bioinformatics and biostatistical methods for whole transcriptome profiling from FFPE tissue. The chemistry accommodates low RNA inputs and sample multiplexing. These methods both enable rediscovery of RNA biomarkers for disease recurrence risk that were previously identified by RT-PCR analysis of a cohort of 136 patients, and also identify a high percentage of recurrence risk markers that were previously discovered using DNA microarrays in a separate cohort of patients, evidence that this RNA-Seq technology has sufficient precision and sensitivity for biomarker discovery. More than two thousand RNAs are strongly associated with breast cancer recurrence risk in the 136 patient cohort (FDR <10%). Many of these are intronic RNAs for which corresponding exons are not also associated with disease recurrence. A number of the RNAs associated with recurrence risk belong to novel RNA networks. It will be important to test the validity of these novel associations in whole transcriptome RNA-Seq screens of other breast cancer cohorts. PMID:22808097
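The "FDR <10%" threshold quoted above refers to control of the false discovery rate across many tested RNAs. The abstract does not say which procedure was used; the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment purely for illustration, with made-up p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used here only as a common example of FDR control;
    the source does not state the procedure applied in the study.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four RNAs; keep those with adjusted p < 0.10.
pvals = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(adj) if q < 0.10]
print(adj, significant)
```

With thousands of RNAs tested, an unadjusted threshold would admit many false positives, which is why an FDR cutoff rather than a raw p-value cutoff is reported.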
A biochemical landscape of A-to-I RNA editing in the human brain transcriptome
Sakurai, Masayuki; Ueda, Hiroki; Yano, Takanori; Okada, Shunpei; Terajima, Hideki; Mitsuyama, Toutai; Toyoda, Atsushi; Fujiyama, Asao; Kawabata, Hitomi; Suzuki, Tsutomu
2014-01-01
Inosine is an abundant RNA modification in the human transcriptome and is essential for many biological processes in modulating gene expression at the post-transcriptional level. Adenosine deaminases acting on RNA (ADARs) catalyze the hydrolytic deamination of adenosines to inosines (A-to-I editing) in double-stranded regions. We previously established a biochemical method called “inosine chemical erasing” (ICE) to directly identify inosines on RNA strands with high reliability. Here, we have applied the ICE method combined with deep sequencing (ICE-seq) to conduct an unbiased genome-wide screening of A-to-I editing sites in the transcriptome of human adult brain. Taken together with the sites identified by the conventional ICE method, we mapped 19,791 novel sites and newly found 1258 edited mRNAs, including 66 novel sites in coding regions, 41 of which cause altered amino acid assignment. ICE-seq detected novel editing sites in various repeat elements as well as in short hairpins. Gene ontology analysis revealed that these edited mRNAs are associated with transcription, energy metabolism, and neurological disorders, providing new insights into various aspects of human brain functions. PMID:24407955
Lessons from single-cell transcriptome analysis of oxygen-sensing cells.
Zhou, Ting; Matsunami, Hiroaki
2018-05-01
The advent of single-cell RNA-sequencing (RNA-Seq) technology has enabled transcriptome profiling of individual cells. Comprehensive gene expression analysis at the single-cell level has proven to be effective in characterizing the most fundamental aspects of cellular function and identity. This unbiased approach is revolutionary for identifying key molecules in small and/or heterogeneous tissues such as those containing oxygen-sensing cells. Here, we review the major methods of current single-cell RNA-Seq technology. We discuss how this technology has advanced the understanding of oxygen-sensing glomus cells in the carotid body and helped uncover novel oxygen-sensing cells and mechanisms in the mouse olfactory system. We conclude by providing our perspective on future single-cell RNA-Seq research directed at oxygen-sensing cells.
Influence of socioeconomic status on the whole blood transcriptome in African Americans.
Gaye, Amadou; Gibbons, Gary H; Barry, Charles; Quarells, Rakale; Davis, Sharon K
2017-01-01
The correlation between low socioeconomic status (SES) and poor health outcome or higher risk of disease has been consistently reported by many epidemiological studies across various race/ancestry groups. However, the biological mechanisms linking low SES to disease and/or disease risk factors are not well understood and remain relatively under-studied. The analysis of the blood transcriptome is a promising window for elucidating how social and environmental factors influence the molecular networks governing health and disease. To further define the mechanistic pathways between social determinants and health, this study examined the impact of SES on the blood transcriptome in a sample of African-Americans. An integrative approach leveraging three complementary methods (Weighted Gene Co-expression Network Analysis, Random Forest and Differential Expression) was adopted to identify the most predictive and robust transcriptome pathways associated with SES. We analyzed the expression of 15079 genes (RNA-seq) from whole blood across 36 samples. The results revealed a cluster of 141 co-expressed genes over-expressed in the low SES group. Three pro-inflammatory pathways (IL-8 Signaling, NF-κB Signaling and Dendritic Cell Maturation) are activated in this module and over-expressed in low SES. Random Forest analysis revealed 55 of the 141 genes that, collectively, predict SES with an area under the curve of 0.85. One third of the 141 genes are significantly over-expressed in the low SES group. Lower SES has consistently been linked to many social and environmental conditions acting as stressors and known to be correlated with vulnerability to chronic illnesses (e.g. asthma, diabetes) associated with a chronic inflammatory state. Our unbiased analysis of the blood transcriptome in African-Americans revealed evidence of a robust molecular signature of increased inflammation associated with low SES. The results provide a plausible link between the social factors and chronic inflammation.
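The "area under the curve of 0.85" reported above is most naturally read as the area under the ROC curve, although the abstract does not say so explicitly. Under that assumption, the sketch below shows what the number means: the probability that a randomly chosen low-SES sample receives a higher classifier score than a randomly chosen high-SES sample. The function name and the scores are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve from classifier scores.

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities of "low SES" for five samples.
low_ses_scores = [0.9, 0.8, 0.3]
high_ses_scores = [0.7, 0.2]
print(roc_auc(low_ses_scores, high_ses_scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so 0.85 indicates that the 55-gene set ranks most low-SES samples above most high-SES samples.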
Preliminary profiling of blood transcriptome in a rat model of hemorrhagic shock
Braga, D; Barcella, M; D’Avila, F; Lupoli, S; Tagliaferri, F; Santamaria, MH; DeLano, FA; Baselli, G; Schmid-Schönbein, GW; Kistler, EB; Aletti, F
2017-01-01
Hemorrhagic shock is a leading cause of morbidity and mortality worldwide. Significant blood loss may lead to decreased blood pressure and inadequate tissue perfusion with resultant organ failure and death, even after replacement of lost blood volume. One reason for this high acuity is that the fundamental mechanisms of shock are poorly understood. Proteomic and metabolomic approaches have been used to investigate the molecular events occurring in hemorrhagic shock but, to our knowledge, a systematic analysis of the transcriptomic profile is missing. Therefore, a pilot analysis using paired-end RNA sequencing was used to identify changes that occur in the blood transcriptome of rats subjected to hemorrhagic shock after blood reinfusion. Hemorrhagic shock was induced using a Wigger’s shock model. The transcriptome of whole blood from shocked animals shows modulation of genes related to inflammation and immune response (Tlr13, Il1b, Ccl6, Lgals3), antioxidant functions (Mt2A, Mt1), tissue injury and repair pathways (Gpnmb, Trim72) and lipid mediators (Alox5ap, Ltb4r, Ptger2) compared with control animals. These findings are congruent with results obtained in hemorrhagic shock analysis by other authors using metabolomics and proteomics. The analysis of blood transcriptome may be a valuable tool to understand the biological changes occurring in hemorrhagic shock and a promising approach for the identification of novel biomarkers and therapeutic targets. Impact statement This study provides the first pilot analysis of the changes occurring in transcriptome expression of whole blood in hemorrhagic shock (HS) rats. We showed that the analysis of blood transcriptome is a useful approach to investigate pathways and functional alterations in this disease condition. This pilot study encourages the possible application of transcriptome analysis in the clinical setting, for the molecular profiling of whole blood in HS patients. PMID:28661205
Rey, Benjamin; Dégletagne, Cyril; Duchamp, Claude
2016-12-01
In this article, we present differentially expressed gene profiles in the pectoralis muscle of wild juvenile king penguins that were either naturally acclimated to cold marine environment or experimentally immersed in cold water as compared with penguin juveniles that never experienced cold water immersion. Transcriptomic data were obtained by hybridizing penguin total cDNA on Affymetrix GeneChip Chicken Genome arrays and analyzed using the maxRS algorithm, "Transcriptome analysis in non-model species: a new method for the analysis of heterologous hybridization on microarrays" (Dégletagne et al., 2010) [1]. We focused on genes involved in multiple antioxidant pathways. For better clarity, these differentially expressed genes were clustered into six functional groups according to their role in controlling redox homeostasis. The data are related to a comprehensive research study on the ontogeny of antioxidant functions in king penguins, "Hormetic response triggers multifaceted anti-oxidant strategies in immature king penguins (Aptenodytes patagonicus)" (Rey et al., 2016) [2]. The raw microarray dataset supporting the present analyses has been deposited at the Gene Expression Omnibus (GEO) repository under accessions GEO: GSE17725 and GEO: GSE82344.
Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P
2012-03-15
Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. Especially database interfaces with transcriptome analysis modules going beyond mere read counts are missing. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a php-script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword- or BLAST-search. Additionally, T-ACE provides within and between transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments, which have only a limited amount of bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
NASA Astrophysics Data System (ADS)
Zhang, Hui; Zhai, Yuxiu; Yao, Lin; Jiang, Yanhua; Li, Fengling
2017-05-01
Chlamys farreri is an economically important mollusk that can accumulate excessive amounts of cadmium (Cd). Studying the molecular mechanism of Cd accumulation in bivalves is difficult because of the lack of genome background. Transcriptomic analysis based on high-throughput RNA sequencing has been shown to be an efficient and powerful method for the discovery of relevant genes in non-model and genome reference-free organisms. Here, we constructed two cDNA libraries (control and Cd exposure groups) from the digestive gland of C. farreri and compared the transcriptomic data between them. A total of 227 673 transcripts were assembled into 105 071 unigenes, most of which shared high similarity with sequences in the NCBI non-redundant protein database. For functional classification, 24 493 unigenes were assigned to Gene Ontology terms. Additionally, EuKaryotic Ortholog Groups and Kyoto Encyclopedia of Genes and Genomes analyses assigned 12 028 unigenes to 26 categories and 7 849 unigenes to five pathways, respectively. Comparative transcriptomics analysis identified 3 800 unigenes that were differentially expressed in the Cd-treated group compared with the control group. Among them, genes associated with heavy metal accumulation were screened, including metallothionein, divalent metal transporter, and metal tolerance protein. The functional genes and predicted pathways identified in our study will contribute to a better understanding of the metabolic and immune system in the digestive gland of C. farreri. In addition, the transcriptomic data will provide a comprehensive resource that may contribute to the understanding of molecular mechanisms that respond to marine pollutants in bivalves.
Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing.
Liu, Tiancheng; Yu, Lin; Liu, Lei; Li, Hong; Li, Yixue
2015-01-01
High throughput technology has prompted the progressive omics studies, including genomics and transcriptomics. We have reviewed the improvement of comparative omic studies, which are attributed to the high throughput measurement of next generation sequencing technology. Comparative genomics have been successfully applied to evolution analysis while comparative transcriptomics are adopted in comparison of expression profile from two subjects by differential expression or differential coexpression, which enables their application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting the morphogenesis of development and previous works have been conducted to illustrate the most conserved stages during embryonic development. Old measurements of these studies are based on the morphological similarity from macro view and new technology enables the micro detection of similarity in molecular mechanism. Evolutionary model of embryo development, which includes the "funnel-like" model and the "hourglass" model, has been evaluated by combination of these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has promoted the EVO-DEVO studies into a new era, technological and material limitation still exist and further investigations require more subtle study design and procedure.
Transcriptomic Dose-Response Analysis for Mode of Action ...
Microarray and RNA-seq technologies can play an important role in assessing the health risks associated with environmental exposures. The utility of gene expression data to predict hazard has been well documented. Early toxicogenomics studies used relatively high, single doses with minimal replication. Thus, they were not useful in understanding health risks at environmentally-relevant doses. Until the past decade, application of toxicogenomics in dose response assessment and determination of chemical mode of action has been limited. New transcriptomic biomarkers have evolved to detect chemical hazards in multiple tissues together with pathway methods to study biological effects across the full dose response range and critical time course. Comprehensive low dose datasets are now available and with the use of transcriptomic benchmark dose estimation techniques within a mode of action framework, the ability to incorporate informative genomic data into human health risk assessment has substantially improved. The key advantage to applying transcriptomic technology to risk assessment is both the sensitivity and comprehensive examination of direct and indirect molecular changes that lead to adverse outcomes. Book Chapter with topic on future application of toxicogenomics technologies for MoA and risk assessment
Toxicogenomics in Environmental Science.
Brinke, Alexandra; Buchinger, Sebastian
This chapter reviews the current knowledge and recent progress in the field of environmental, aquatic ecotoxicogenomics with a focus on transcriptomic methods. In ecotoxicogenomics the omics technologies are applied for the detection and assessment of adverse effects in the environment, and thus are to be distinguished from omics used in human toxicology [Snape et al., Aquat Toxicol 67:143-154, 2004]. Transcriptomic methods in ecotoxicology are applied to gain a mechanistic understanding of toxic effects on organisms or populations, and thus aim to bridge the gap between cause and effect. A worthwhile effect-based interpretation of stressor induced changes on the transcriptome is based on the principle of phenotypic-anchoring [Paules, Environ Health Perspect 111:A338-A339, 2003]. Thereby, changes on the transcriptomic level can only be identified as effects if they are clearly linked to a specific stressor-induced effect on the macroscopic level. By integrating those macroscopic and transcriptomic effects, conclusions on the effect-inducing type of the stressor can be drawn. Stressor-specific effects on the transcriptomic level can be identified as stressor-specific induced pathways, transcriptomic patterns, or stressors-specific genetic biomarkers. In this chapter, examples of the combined application of macroscopic and transcriptional effects for the identification of environmental stressors, such as aquatic pollutants, are given and discussed. By means of these examples, challenges on the way to a standardized application of transcriptomics in ecotoxicology are discussed. This is also done against the background of the application of transcriptomic methods in environmental regulation such as the EU regulation Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH).
Integrated Analysis of Transcriptomic and Proteomic Data
Haider, Saad; Pal, Ranadip
2013-01-01
Until recently, understanding the regulatory behavior of cells has been pursued through independent analysis of the transcriptome or the proteome. Based on the central dogma, it was generally assumed that there exists a direct correspondence between mRNA transcripts and generated protein expressions. However, recent studies have shown that the correlation between mRNA and protein expressions can be low due to various factors such as different half-lives and post-transcriptional machinery. Thus, a joint analysis of the transcriptomic and proteomic data can provide useful insights that may not be deciphered from individual analysis of mRNA or protein expressions. This article reviews the existing major approaches for joint analysis of transcriptomic and proteomic data. We categorize the different approaches into eight main categories based on the initial algorithm and final analysis goal. We further present analogies with other domains and discuss the existing research problems in this area. PMID:24082820
Li, Peng; Chen, Jianxin; Zhang, Wuxia; Fu, Bangze; Wang, Wei
2017-01-04
Herbal medicine is a concoction of numerous chemical ingredients, and it exhibits polypharmacological effects to act on multiple pharmacological targets, regulating different biological mechanisms and treating a variety of diseases. Thus, this complexity is impossible to deconvolute by the reductionist method of extracting one active ingredient acting on one biological target. To dissect the polypharmacological effects of herbal medicines, their underlying pharmacological targets and their corresponding active ingredients, we propose a systems-biology strategy that combines omics and bioinformatical methodologies for exploring the polypharmacology of herbal mixtures. The myocardial ischemia model was induced by Ameroid constriction of the left anterior descending coronary in Ba-Ma miniature pigs. RNA-seq analysis was utilized to find the differential genes induced by myocardial ischemia in pigs treated with formula QSKL. A transcriptome-based inference method was used to find the landmark drugs with similar mechanisms to QSKL. Gene-level analysis of RNA-seq data in QSKL-treated cases versus control animals yields 279 differential genes. Transcriptome-based inference methods identified 80 landmark drugs that covered nearly all drug classes. Then, based on the landmark drugs, 155 potential pharmacological targets and 57 indications were identified for QSKL. Our results demonstrate the power of a combined approach for exploring the pharmacological target and chemical space of herbal medicines. We hope that our method could enhance our understanding of the molecular mechanisms of herbal systems and further accelerate the exploration of the value of traditional herbal medicine systems. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang
2017-08-23
Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurring probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were revealed to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.
2009-01-01
Background Whole genome transcriptomic analysis is a powerful approach to elucidate the molecular mechanisms controlling the pathogenesis of obligate intracellular bacteria. However, the major hurdle resides in the low quantity of prokaryotic mRNAs extracted from host cells. Our model Ehrlichia ruminantium (ER), the causative agent of heartwater, is transmitted by tick Amblyomma variegatum. This bacterium affects wild and domestic ruminants and is present in Sub-Saharan Africa and the Caribbean islands. Because of its strictly intracellular location, which constitutes a limitation for its extensive study, the molecular mechanisms involved in its pathogenicity are still poorly understood. Results We successfully adapted the SCOTS method (Selective Capture of Transcribed Sequences) on the model Rickettsiales ER to capture mRNAs. Southern Blots and RT-PCR revealed an enrichment of ER's cDNAs and a diminution of ribosomal contaminants after three rounds of capture. qRT-PCR and whole-genome ER microarrays hybridizations demonstrated that SCOTS method introduced only a limited bias on gene expression. Indeed, we confirmed the differential gene expression between poorly and highly expressed genes before and after SCOTS captures. The comparative gene expression obtained from ER microarrays data, on samples before and after SCOTS at 96 hpi was significantly correlated (R2 = 0.7). Moreover, SCOTS method is crucial for microarrays analysis of ER, especially for early time points post-infection. There was low detection of transcripts for untreated samples whereas 24% and 70.7% were revealed for SCOTS samples at 24 and 96 hpi respectively. Conclusions We conclude that this SCOTS method has a key importance for the transcriptomic analysis of ER and can be potentially used for other Rickettsiales. 
This study constitutes the first step for further gene expression analyses that will lead to a better understanding of both ER pathogenicity and the adaptation of obligate intracellular bacteria to their environment. PMID:20034374
Strand-specific transcriptome profiling with directly labeled RNA on genomic tiling microarrays
2011-01-01
Background With lower manufacturing cost, high spot density, and flexible probe design, genomic tiling microarrays are ideal for comprehensive transcriptome studies. Typically, transcriptome profiling using microarrays involves reverse transcription, which converts RNA to cDNA. The cDNA is then labeled and hybridized to the probes on the arrays, thus the RNA signals are detected indirectly. Reverse transcription is known to generate artifactual cDNA, in particular the synthesis of second-strand cDNA, leading to false discovery of antisense RNA. To address this issue, we have developed an effective method using RNA that is directly labeled, thus by-passing the cDNA generation. This paper describes this method and its application to the mapping of transcriptome profiles. Results RNA extracted from laboratory cultures of Porphyromonas gingivalis was fluorescently labeled with an alkylation reagent and hybridized directly to probes on genomic tiling microarrays specifically designed for this periodontal pathogen. The generated transcriptome profile was strand-specific and produced signals close to background level in most antisense regions of the genome. In contrast, high levels of signal were detected in the antisense regions when the hybridization was done with cDNA. Five antisense areas were tested with independent strand-specific RT-PCR and none to negligible amplification was detected, indicating that the strong antisense cDNA signals were experimental artifacts. Conclusions An efficient method was developed for mapping transcriptome profiles specific to both coding strands of a bacterial genome. This method chemically labels and uses extracted RNA directly in microarray hybridization. The generated transcriptome profile was free of cDNA artifactual signals. 
In addition, this method requires fewer processing steps and is potentially more sensitive in detecting small amount of RNA compared to conventional end-labeling methods due to the incorporation of more fluorescent molecules per RNA fragment. PMID:21235785
Wysocki, William P; Ruiz-Sanchez, Eduardo; Yin, Yanbin; Duvall, Melvin R
2016-05-20
Next-generation sequencing now allows for total RNA extracts to be sequenced in non-model organisms such as bamboos, an economically and ecologically important group of grasses. Bamboos are divided into three lineages, two of which are woody perennials with bisexual flowers, which undergo gregarious monocarpy. The third lineage, which are herbaceous perennials, possesses unisexual flowers that undergo annual flowering events. Transcriptomes were assembled using both reference-based and de novo methods. These two methods were tested by characterizing transcriptome content using sequence alignment to previously characterized reference proteomes and by identifying Pfam domains. Because of the striking differences in floral morphology and phenology between the herbaceous and woody bamboo lineages, MADS-box genes, transcription factors that control floral development and timing, were characterized and analyzed in this study. Transcripts were identified using phylogenetic methods and categorized as A, B, C, D or E-class genes, which control floral development, or SOC or SVP-like genes, which control the timing of flowering events. Putative nuclear orthologues were also identified in bamboos to use as phylogenetic markers. Instances of gene copies exhibiting topological patterns that correspond to shared phenotypes were observed in several gene families including floral development and timing genes. Alignments and phylogenetic trees were generated for 3,878 genes and for all genes in a concatenated analysis. Both the concatenated analysis and those of 2,412 separate gene trees supported monophyly among the woody bamboos, which is incongruent with previous phylogenetic studies using plastid markers.
Plessy, Charles; Desbois, Linda; Fujii, Teruo; Carninci, Piero
2013-02-01
Tissues contain complex populations of cells. Like countries, which are comprised of mixed populations of people, tissues are not homogeneous. Gene expression studies that analyze entire populations of cells from tissues as a mixture are blind to this diversity. Thus, critical information is lost when studying samples rich in specialized but diverse cells such as tumors, iPS colonies, or brain tissue. High throughput methods are needed to address, model and understand the constitutive and stochastic differences between individual cells. Here, we describe microfluidics technologies that utilize a combination of molecular biology and miniaturized labs on chips to study gene expression at the single cell level. We discuss how the characterization of the transcriptome of each cell in a sample will open a new field in gene expression analysis, population transcriptomics, that will change the academic and biomedical analysis of complex samples by defining them as quantified populations of single cells. Copyright © 2013 WILEY Periodicals, Inc.
USDA-ARS's Scientific Manuscript database
Transcriptomic analysis of fecal samples is an emerging method for the diagnosis of gastrointestinal pathology because it is noninvasive and requires minute volumes of analyte; however, detection of mRNA in low copy numbers in human stool is challenging. Our objective was to develop a method for det...
Tn5Prime, a Tn5 based 5' capture method for single cell RNA-seq.
Cole, Charles; Byrne, Ashley; Beaudin, Anna E; Forsberg, E Camilla; Vollmers, Christopher
2018-06-01
RNA-sequencing (RNA-seq) is a powerful technique to investigate and quantify entire transcriptomes. Recent advances in the field have made it possible to explore the transcriptomes of single cells. However, most widely used RNA-seq protocols fail to provide crucial information regarding transcription start sites. Here we present a protocol, Tn5Prime, that takes advantage of the Tn5 transposase-based Smart-seq2 protocol to create RNA-seq libraries that capture the 5' end of transcripts. The Tn5Prime method dramatically streamlines the 5' capture process and is both cost effective and reliable. By applying Tn5Prime to bulk RNA and single cell samples, we were able to define transcription start sites as well as quantify transcriptomes at high accuracy and reproducibility. Additionally, similar to 3' end-based high-throughput methods like Drop-seq and 10× Genomics Chromium, the 5' capture Tn5Prime method allows the introduction of cellular identifiers during reverse transcription, simplifying the analysis of large numbers of single cells. In contrast to 3' end-based methods, Tn5Prime also enables the assembly of the variable 5' ends of the antibody sequences present in single B-cell data. Therefore, Tn5Prime presents a robust tool for both basic and applied research into the adaptive immune system and beyond.
Irla, Marta; Neshat, Armin; Brautaset, Trygve; Rückert, Christian; Kalinowski, Jörn; Wendisch, Volker F
2015-02-14
Bacillus methanolicus MGA3 is a thermophilic, facultative ribulose monophosphate (RuMP) cycle methylotroph. Together with its ability to produce high yields of amino acids, the relevance of this microorganism as a promising candidate for biotechnological applications is evident. The B. methanolicus MGA3 genome consists of a 3,337,035 nucleotides (nt) circular chromosome, the 19,174 nt plasmid pBM19 and the 68,999 nt plasmid pBM69. 3,218 protein-coding regions were annotated on the chromosome, 22 on pBM19 and 82 on pBM69. In the present study, the RNA-seq approach was used to comprehensively investigate the transcriptome of B. methanolicus MGA3 in order to improve the genome annotation, identify novel transcripts, analyze conserved sequence motifs involved in gene expression and reveal operon structures. For this aim, two different cDNA library preparation methods were applied: one which allows characterization of the whole transcriptome and another which includes enrichment of primary transcript 5'-ends. Analysis of the primary transcriptome data enabled the detection of 2,167 putative transcription start sites (TSSs) which were categorized into 1,642 TSSs located in the upstream region (5'-UTR) of known protein-coding genes and 525 TSSs of novel antisense, intragenic, or intergenic transcripts. Firstly, 14 wrongly annotated translation start sites (TLSs) were corrected based on primary transcriptome data. Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. Moreover, the exact TSSs positions were utilized to define conserved sequence motifs for translation start sites, ribosome binding sites and promoters in B. methanolicus MGA3. Based on the whole transcriptome data set, novel transcripts, operon structures and mRNA abundances were determined. 
The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts. The extensive insights into the transcriptional landscape of B. methanolicus MGA3, gained in this study, represent a valuable foundation for further comparative quantitative transcriptome analyses and possibly also for the development of molecular biology tools which at present are very limited for this organism.
RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”
Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu
2012-01-01
Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
RNA-seq analysis of broiler liver transcriptome reveals novel responses to high ambient temperature.
Coble, Derrick J; Fleming, Damarius; Persia, Michael E; Ashwell, Chris M; Rothschild, Max F; Schmidt, Carl J; Lamont, Susan J
2014-12-10
In broilers, high ambient temperature can result in reduced feed consumption, digestive inefficiency, impaired metabolism, and even death. The broiler sector of the U.S. poultry industry incurs approximately $52 million in heat-related losses annually. The objective of this study is to characterize the effects of cyclic high ambient temperature on the transcriptome of a metabolically active organ, the liver. This study provides novel insight into the effects of high ambient temperature on metabolism in broilers, because it is the first reported RNA-seq study to characterize the effect of heat on the transcriptome of a metabolic-related tissue. This information provides a platform for future investigations to further elucidate physiologic responses to high ambient temperature and seek methods to ameliorate the negative impacts of heat. Transcriptome sequencing of the livers of 8 broiler males using Illumina HiSeq 2000 technology resulted in 138 million, 100-base pair single end reads, yielding a total of 13.8 gigabases of sequence. Forty genes were differentially expressed at a significance level of P-value < 0.05 and a fold-change ≥ 2 in response to a week of cyclic high ambient temperature with 27 down-regulated and 13 up-regulated genes. Two gene networks were created from the function-based Ingenuity Pathway Analysis (IPA) of the differentially expressed genes: "Cell Signaling" and "Endocrine System Development and Function". The gene expression differences in the liver transcriptome of the heat-exposed broilers reflected physiological responses to decrease internal temperature, reduce hyperthermia-induced apoptosis, and promote tissue repair. Additionally, the differential gene expression revealed a physiological response to regulate the perturbed cellular calcium levels that can result from high ambient temperature exposure. 
Exposure to cyclic high ambient temperature results in changes at the metabolic, physiologic, and cellular level that can be characterized through RNA-seq analysis of the liver transcriptome of broilers. The findings highlight specific physiologic mechanisms by which broilers reduce the effects of exposure to high ambient temperature. This information provides a foundation for future investigations into the gene networks involved in the broiler stress response and for development of strategies to ameliorate the negative impacts of heat on animal production and welfare.
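The sequencing yield and the selection rule quoted in the abstract above (138 million 100-base single-end reads; P-value < 0.05 and fold-change ≥ 2) can be illustrated with a minimal Python sketch. The record type, the function names and the toy gene values are hypothetical and are not taken from the study; the sketch also assumes that the fold-change threshold applies symmetrically to up- and down-regulated genes, which the abstract implies but does not state.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str           # hypothetical gene label
    fold_change: float  # treated / control expression ratio, must be > 0
    p_value: float


def total_gigabases(n_reads: int, read_length_bp: int) -> float:
    """Total sequenced bases, expressed in gigabases (1e9 bases)."""
    return n_reads * read_length_bp / 1e9


def is_differentially_expressed(result: GeneResult,
                                p_cutoff: float = 0.05,
                                min_fold: float = 2.0) -> bool:
    """Apply the P-value and fold-change cutoffs quoted in the abstract.

    Assumption: a ratio of 0.5 counts as a 2-fold change (down-regulation).
    """
    magnitude = max(result.fold_change, 1.0 / result.fold_change)
    return result.p_value < p_cutoff and magnitude >= min_fold


# Toy, invented values for illustration only.
toy_results = [
    GeneResult("geneA", 2.5, 0.01),   # up-regulated, passes both cutoffs
    GeneResult("geneB", 0.4, 0.03),   # down-regulated, passes both cutoffs
    GeneResult("geneC", 1.3, 0.001),  # significant but small change
    GeneResult("geneD", 3.0, 0.20),   # large change but not significant
]
selected = [r.gene for r in toy_results if is_differentially_expressed(r)]
```

With the read count and read length from the abstract, `total_gigabases(138_000_000, 100)` reproduces the reported 13.8 gigabases.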
Analysis of the Citrullus colocynthis Transcriptome during Water Deficit Stress
Wang, Zhuoyu; Hu, Hongtao; Goertzen, Leslie R.; McElroy, J. Scott; Dane, Fenny
2014-01-01
Citrullus colocynthis is a very drought tolerant species, closely related to watermelon (C. lanatus var. lanatus), an economically important cucurbit crop. Drought is a threat to plant growth and development, and the discovery of drought inducible genes with various functions is of great importance. We used high throughput mRNA Illumina sequencing technology and bioinformatic strategies to analyze the C. colocynthis leaf transcriptome under drought treatment. Leaf samples at four different time points (0, 24, 36, or 48 hours of withholding water) were used for RNA extraction and Illumina sequencing. qRT-PCR of several drought responsive genes was performed to confirm the accuracy of RNA sequencing. Leaf transcriptome analysis provided the first glimpse of the drought responsive transcriptome of this unique cucurbit species. A total of 5038 full-length cDNAs were detected, with 2545 genes showing significant changes during drought stress. Principal component analysis indicated that drought was the major contributing factor regulating transcriptome changes. Upregulation of many transcription factors, stress signaling factors, detoxification genes, and genes involved in phytohormone signaling and citrulline metabolism occurred under the water deficit conditions. The C. colocynthis transcriptome data highlight the activation of a large set of drought related genes in this species, thus providing a valuable resource for future functional analysis of candidate genes in defense of drought stress. PMID:25118696
Morine, Melissa J; McMonagle, Jolene; Toomey, Sinead; Reynolds, Clare M; Moloney, Aidan P; Gormley, Isobel C; Gaora, Peadar O; Roche, Helen M
2010-10-07
Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p < 0.05), followed by muscle (601 genes) and adipose (16 genes). Results from modified GSEA showed that the high-CLA beef diet affected diverse biological processes across the three tissues, and that the majority of pathway changes reached significance only with the bi-directional test. Combining the liver tissue microarray results with plasma marker data revealed 110 CLA-sensitive genes showing strong canonical correlation with one or more plasma markers of metabolic health, and 9 significantly overrepresented pathways among this set; each of these pathways was also significantly changed by the high-CLA diet. 
Closer inspection of two of these pathways--selenoamino acid metabolism and steroid biosynthesis--illustrated clear diet-sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease.
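The abstract above uses Fisher's exact test to find pathways that are overrepresented among a selected gene set. As a rough illustration only, the following Python sketch computes a one-sided Fisher's exact (hypergeometric upper-tail) p-value with the standard library. The function name, the parameter names and the one-sided choice are assumptions made here; the abstract does not describe the authors' exact test configuration.

```python
from math import comb


def fisher_overrepresentation_p(n_universe: int, n_pathway: int,
                                n_selected: int, n_overlap: int) -> float:
    """One-sided Fisher's exact p-value for pathway overrepresentation.

    Probability of observing at least `n_overlap` pathway genes when
    `n_selected` genes are drawn without replacement from a universe of
    `n_universe` genes, of which `n_pathway` belong to the pathway.
    """
    total = comb(n_universe, n_selected)
    max_overlap = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers: 10 genes measured, 4 in the pathway, 3 selected,
# and all 3 selected genes fall in the pathway.
p_toy = fisher_overrepresentation_p(10, 4, 3, 3)
```

In the toy case the only qualifying outcome is choosing 3 of the 4 pathway genes, so the p-value is 4/120. Exact integer binomials keep the sketch free of external dependencies; a statistics library would normally be used for real data.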
2010-01-01
Background Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Results Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p < 0.05), followed by muscle (601 genes) and adipose (16 genes). Results from modified GSEA showed that the high-CLA beef diet affected diverse biological processes across the three tissues, and that the majority of pathway changes reached significance only with the bi-directional test. Combining the liver tissue microarray results with plasma marker data revealed 110 CLA-sensitive genes showing strong canonical correlation with one or more plasma markers of metabolic health, and 9 significantly overrepresented pathways among this set; each of these pathways was also significantly changed by the high-CLA diet. 
Closer inspection of two of these pathways - selenoamino acid metabolism and steroid biosynthesis - illustrated clear diet-sensitive changes in constituent genes, as well as strong correlations between gene expression and plasma markers of metabolic syndrome independent of the dietary effect. Conclusion Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of analysis has the potential to generate novel transcriptome-based biomarkers of disease. PMID:20929581
Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.
Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry
2016-11-01
Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-12-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola
2014-01-01
We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918
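As a rough illustration of the "fight-club hub" idea described above, the sketch below computes a gene's average Pearson correlation with its network neighbors; a strongly negative average is the kind of signal the abstract describes. The function names, the toy expression profiles, and the use of a plain mean are assumptions made for illustration, not the authors' published procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_neighbor_correlation(hub, neighbors, expression):
    """Average correlation between a hub gene and its network neighbors."""
    values = [pearson(expression[hub], expression[g]) for g in neighbors]
    return sum(values) / len(values)


# Hypothetical toy profiles (one value per sample); not real data.
expression = {
    "hub": [1.0, 2.0, 3.0, 4.0],
    "g1": [4.0, 3.0, 2.0, 1.0],
    "g2": [8.0, 6.0, 4.0, 2.0],
}
score = mean_neighbor_correlation("hub", ["g1", "g2"], expression)
```

In this toy case both neighbors fall as the hub rises, so the average correlation is strongly negative; real analyses would also need a significance threshold, which is not sketched here.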
DiffSplice: the genome-wide detection of differential splicing events with RNA-seq
Hu, Yin; Huang, Yan; Du, Ying; Orellana, Christian F.; Singh, Darshan; Johnson, Amy R.; Monroy, Anaïs; Kuan, Pei-Fen; Hammond, Scott M.; Makowski, Liza; Randell, Scott H.; Chiang, Derek Y.; Hayes, D. Neil; Jones, Corbin; Liu, Yufeng; Prins, Jan F.; Liu, Jinze
2013-01-01
The RNA transcriptome varies in response to cellular differentiation as well as environmental factors, and can be characterized by the diversity and abundance of transcript isoforms. Differential transcription analysis, the detection of differences between the transcriptomes of different cells, may improve understanding of cell differentiation and development and enable the identification of biomarkers that classify disease types. The availability of high-throughput short-read RNA sequencing technologies provides in-depth sampling of the transcriptome, making it possible to accurately detect the differences between transcriptomes. In this article, we present a new method for the detection and visualization of differential transcription. Our approach does not depend on transcript or gene annotations. It also circumvents the need for full transcript inference and quantification, which is a challenging problem because of short read lengths, as well as various sampling biases. Instead, our method takes a divide-and-conquer approach to localize the difference between transcriptomes in the form of alternative splicing modules (ASMs), where transcript isoforms diverge. Our approach starts with the identification of ASMs from the splice graph, constructed directly from the exons and introns predicted from RNA-seq read alignments. The abundance of alternative splicing isoforms residing in each ASM is estimated for each sample and is compared across sample groups. A non-parametric statistical test is applied to each ASM to detect significant differential transcription with a controlled false discovery rate. The sensitivity and specificity of the method have been assessed using simulated data sets and compared with other state-of-the-art approaches. 
Experimental validation using qRT-PCR confirmed a selected set of genes that are differentially expressed in a lung differentiation study and a breast cancer data set, demonstrating the utility of the approach applied on experimental biological data sets. The software of DiffSplice is available at http://www.netlab.uky.edu/p/bioinfo/DiffSplice. PMID:23155066
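The abstract above mentions testing each ASM with a controlled false discovery rate but does not say which procedure is used. As a hedged sketch, the Benjamini-Hochberg step-up procedure is one common way to do this; its use here, the function name, and the example p-values are assumptions and not taken from DiffSplice.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical per-ASM p-values; not real results.
asm_p_values = [0.001, 0.039, 0.035, 0.2, 0.008]
significant = benjamini_hochberg(asm_p_values, alpha=0.05)
```

Because the procedure is step-up, a p-value that misses its own rank threshold can still be rejected when a larger p-value meets a later threshold, as happens for 0.035 in this example.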
Quantitative RNA-seq analysis of the Campylobacter jejuni transcriptome
Chaudhuri, Roy R.; Yu, Lu; Kanji, Alpa; Perkins, Timothy T.; Gardner, Paul P.; Choudhary, Jyoti; Maskell, Duncan J.
2011-01-01
Campylobacter jejuni is the most common bacterial cause of foodborne disease in the developed world. Its general physiology and biochemistry, as well as the mechanisms enabling it to colonize and cause disease in various hosts, are not well understood, and new approaches are required to understand its basic biology. High-throughput sequencing technologies provide unprecedented opportunities for functional genomic research. Recent studies have shown that direct Illumina sequencing of cDNA (RNA-seq) is a useful technique for the quantitative and qualitative examination of transcriptomes. In this study we report RNA-seq analyses of the transcriptomes of C. jejuni (NCTC11168) and its rpoN mutant. This has allowed the identification of hitherto unknown transcriptional units, and further defines the regulon that is dependent on rpoN for expression. The analysis of the NCTC11168 transcriptome was supplemented by additional proteomic analysis using liquid chromatography-MS. The transcriptomic and proteomic datasets represent an important resource for the Campylobacter research community. PMID:21816880
Analysis of Transcriptomic Dose Response Data in the ...
Slide presentation at the HESI-HEALTH Canada-McGill Workshop on Transcriptomic Dose Response Data in the Context of Chemical Risk Assessment
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis
Jones, Beryl M.; Wcislo, William T.; Robinson, Gene E.
2015-01-01
Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell–cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. PMID:26276382
Developmental Transcriptome for a Facultatively Eusocial Bee, Megalopta genalis.
Jones, Beryl M; Wcislo, William T; Robinson, Gene E
2015-08-14
Transcriptomes provide excellent foundational resources for mechanistic and evolutionary analyses of complex traits. We present a developmental transcriptome for the facultatively eusocial bee Megalopta genalis, which represents a potential transition point in the evolution of eusociality. A de novo transcriptome assembly of Megalopta genalis was generated using paired-end Illumina sequencing and the Trinity assembler. Males and females of all life stages were aligned to this transcriptome for analysis of gene expression profiles throughout development. Gene Ontology analysis indicates that stage-specific genes are involved in ion transport, cell-cell signaling, and metabolism. A number of distinct biological processes are upregulated in each life stage, and transitions between life stages involve shifts in dominant functional processes, including shifts from transcriptional regulation in embryos to metabolism in larvae, and increased lipid metabolism in adults. We expect that this transcriptome will provide a useful resource for future analyses to better understand the molecular basis of the evolution of eusociality and, more generally, phenotypic plasticity. Copyright © 2015 Jones et al.
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; ...
2016-06-24
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. Lastly, these results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism.
A survey of the sorghum transcriptome using single-molecule long reads
Abdel-Ghany, Salah E.; Hamilton, Michael; Jacobi, Jennifer L.; Ngam, Peter; Devitt, Nicholas; Schilkey, Faye; Ben-Hur, Asa; Reddy, Anireddy S. N.
2016-01-01
Alternative splicing and alternative polyadenylation (APA) of pre-mRNAs greatly contribute to transcriptome diversity, coding capacity of a genome and gene regulatory mechanisms in eukaryotes. Second-generation sequencing technologies have been extensively used to analyse transcriptomes. However, a major limitation of short-read data is that it is difficult to accurately predict full-length splice isoforms. Here we sequenced the sorghum transcriptome using Pacific Biosciences single-molecule real-time long-read isoform sequencing and developed a pipeline called TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing) to identify full-length splice isoforms and APA sites. Our analysis reveals transcriptome-wide full-length isoforms at an unprecedented scale with over 11,000 novel splice isoforms. Additionally, we uncover APA of ∼11,000 expressed genes and more than 2,100 novel genes. These results greatly enhance sorghum gene annotations and aid in studying gene regulation in this important bioenergy crop. The TAPIS pipeline will serve as a useful tool to analyse Iso-Seq data from any organism. PMID:27339290
Genomic and transcriptomic predictors of triglyceride response to regular exercise
Sarzynski, Mark A; Davidsen, Peter K; Sung, Yun Ju; Hesselink, Matthijs K C; Schrauwen, Patrick; Rice, Treva K; Rao, D C; Falciani, Francesco; Bouchard, Claude
2015-01-01
Aim We performed genome-wide and transcriptome-wide profiling to identify genes and single nucleotide polymorphisms (SNPs) associated with the response of triglycerides (TG) to exercise training. Methods Plasma TG levels were measured before and after a 20-week endurance training programme in 478 white participants from the HERITAGE Family Study. Illumina HumanCNV370-Quad v3.0 BeadChips were genotyped using the Illumina BeadStation 500GX platform. Affymetrix HG-U133+2 arrays were used to quantitate gene expression levels from baseline muscle biopsies of a subset of participants (N=52). Genome-wide association study (GWAS) analysis was performed using MERLIN, while transcriptomic predictor models were developed using the R-package GALGO. Results The GWAS results showed that eight SNPs were associated with TG training-response (ΔTG) at p<9.9×10^−6, while another 31 SNPs showed p values <1×10^−4. In multivariate regression models, the top 10 SNPs explained 32.0% of the variance in ΔTG, while conditional heritability analysis showed that four SNPs statistically accounted for all of the heritability of ΔTG. A molecular signature based on the baseline expression of 11 genes predicted 27% of ΔTG in HERITAGE, which was validated in an independent study. A composite SNP score based on the top four SNPs, each from the genomic and transcriptomic analyses, was the strongest predictor of ΔTG (R^2=0.14, p=3.0×10^−68). Conclusions Our results indicate that skeletal muscle transcript abundance at 11 genes and SNPs at a number of loci contribute to TG response to exercise training. Combining data from genomics and transcriptomics analyses identified a SNP-based gene signature that should be further tested in independent samples. PMID:26491034
Single-Cell Sequencing for Precise Cancer Research: Progress and Prospects.
Zhang, Xiaoyan; Marjani, Sadie L; Hu, Zhaoyang; Weissman, Sherman M; Pan, Xinghua; Wu, Shixiu
2016-03-15
Advances in genomic technology have enabled the faithful detection and measurement of mutations and the gene expression profile of cancer cells at the single-cell level. Recently, several single-cell sequencing methods have been developed that permit the comprehensive and precise analysis of the cancer-cell genome, transcriptome, and epigenome. The use of these methods to analyze cancer cells has led to a series of unanticipated discoveries, such as the high heterogeneity and stochastic changes in cancer-cell populations, the new driver mutations and the complicated clonal evolution mechanisms, and the novel identification of biomarkers of variant tumors. These methods and the knowledge gained from their utilization could potentially improve the early detection and monitoring of rare cancer cells, such as circulating tumor cells and disseminated tumor cells, and promote the development of personalized and highly precise cancer therapy. Here, we discuss the current methods for single cancer-cell sequencing, with a strong focus on those practically used or potentially valuable in cancer research, including single-cell isolation, whole genome and transcriptome amplification, epigenome profiling, multi-dimensional sequencing, and next-generation sequencing and analysis. We also examine the current applications, challenges, and prospects of single cancer-cell sequencing. ©2016 American Association for Cancer Research.
Aumer, Denise; Mumoki, Fiona N; Pirk, Christian W W; Moritz, Robin F A
2018-03-20
Social insects are characterized by the division of labor. Queens usually dominate reproduction, whereas workers fulfill non-reproductive, age-dependent tasks to maintain the colony. Although workers are typically sterile, they can activate their ovaries to produce their own offspring. In the extreme, worker reproduction can turn into social parasitism, as in Apis mellifera capensis. These intraspecific parasites occupy a host colony, kill the resident queen, and take over the reproductive monopoly. Because they exhibit queenlike behavior and are also treated like queens by their fellow workers, they are called pseudoqueens. Here, we compare the development of parasitic pseudoqueens and social workers at different time points using fat body transcriptome data. Two complementary analysis methods, a principal component analysis and a time course analysis, led to the identification of a core set of genes involved in the transition from a social worker into a highly fecund parasitic pseudoqueen. Comparing our results on pseudoqueens with gene expression data of honeybee queens revealed many similarities. In addition, there was a set of specific transcriptomic changes in the parasitic pseudoqueens that differed from both queens and social workers, which may be typical for the development of social parasitism in A. m. capensis.
Puthiyedth, Nisha; Riveros, Carlos; Berretta, Regina; Moscato, Pablo
2015-01-01
Background The joint study of multiple datasets has become a common technique for increasing statistical power in detecting biomarkers obtained from smaller studies. The approach generally followed is based on the fact that as the total number of samples increases, we expect to have greater power to detect associations of interest. This methodology has been applied to genome-wide association and transcriptomic studies due to the availability of datasets in the public domain. While this approach is well established in biostatistics, the introduction of new combinatorial optimization models to address this issue has not been explored in depth. In this study, we introduce a new model for the integration of multiple datasets and we show its application in transcriptomics. Methods We propose a new combinatorial optimization problem that addresses the core issue of biomarker detection in integrated datasets. Optimal solutions for this model deliver a feature selection from a panel of prospective biomarkers. The model we propose is a generalised version of the (α,β)-k-Feature Set problem. We illustrate the performance of this new methodology via a challenging meta-analysis task involving six prostate cancer microarray datasets. The results are then compared to the popular RankProd meta-analysis tool and to what can be obtained by analysing the individual datasets by statistical and combinatorial methods alone. Results Application of the integrated method resulted in a more informative signature than the rank-based meta-analysis or individual dataset results, and overcomes problems arising from real world datasets. The set of genes identified is highly significant in the context of prostate cancer. The method used does not rely on homogenisation or transformation of values to a common scale, and at the same time is able to capture markers associated with subgroups of the disease. PMID:26106884
Insights into transcriptomes of Big and Low sagebrush
Mark D. Huynh; Justin T. Page; Bryce A. Richardson; Joshua A. Udall
2015-01-01
We report the sequencing and assembly of three transcriptomes from Big (Artemisia tridentata ssp. wyomingensis and A. tridentata ssp. tridentata) and Low (A. arbuscula ssp. arbuscula) sagebrush. The sequence reads are available in the Sequence Read Archive of NCBI. We demonstrate the utilities of these transcriptomes for gene discovery and phylogenomic analysis. An...
Busch, Hauke; Boerries, Melanie; Bao, Jie; Hanke, Sebastian T; Hiss, Manuel; Tiko, Theodhor; Rensing, Stefan A
2013-01-01
Transcription factors (TFs) often trigger developmental decisions, yet their transcripts are often only moderately regulated and thus not easily detected by conventional statistics on expression data. Here we present a method that allows such genes to be identified based on trajectory analysis of time-resolved transcriptome data. As a proof of principle, we have analysed apical stem cells of filamentous moss (P. patens) protonemata that develop from leaflets upon their detachment from the plant. Using our novel correlation analysis of the post-detachment transcriptome kinetics, we predict five out of 1,058 TFs to be involved in the signaling leading to the establishment of pluripotency. Among the predicted regulators is the basic helix-loop-helix TF PpRSL1, which we show to be involved in the establishment of apical stem cells in P. patens. Our methodology is expected to aid analysis of key players of developmental decisions in complex plant and animal systems.
Network Analysis of Rodent Transcriptomes in Spaceflight
NASA Technical Reports Server (NTRS)
Ramachandran, Maya; Fogle, Homer; Costes, Sylvain
2017-01-01
Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibialis anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.
Decoding genes with coexpression networks and metabolomics - 'majority report by precogs'.
Saito, Kazuki; Hirai, Masami Y; Yonekura-Sakakibara, Keiko
2008-01-01
Following the sequencing of whole genomes of model plants, high-throughput decoding of gene function is a major challenge in modern plant biology. In view of remarkable technical advances in transcriptomics and metabolomics, integrated analysis of these 'omics' by data-mining informatics is an excellent tool for prediction and identification of gene function, particularly for genes involved in complicated metabolic pathways. The availability of Arabidopsis public transcriptome datasets containing data of >1000 microarrays reinforces the potential for prediction of gene function by transcriptome coexpression analysis. Here, we review the strategy of combining transcriptome and metabolome as a powerful technology for studying the functional genomics of model plants and also crop and medicinal plants.
Luck, Ashley N; Slatko, Barton E; Foster, Jeremy M
2017-01-01
Efficient transcriptomic sequencing of microbial mRNA derived from host-microbe associations is often compromised by the much lower relative abundance of microbial RNA in the mixed total RNA sample. One solution to this problem is to perform extensive sequencing until an acceptable level of transcriptome coverage is obtained. More cost-effective methods include use of prokaryotic and/or eukaryotic rRNA depletion strategies, sometimes in conjunction with depletion of polyadenylated eukaryotic mRNA. Here, we report use of Cappable-seq™ to specifically enrich, in a single step, Wolbachia endobacterial mRNA transcripts from total RNA prepared from the parasitic filarial nematode, Brugia malayi. The obligate Wolbachia endosymbiont is a proven drug target for many human filarial infections, yet the precise nature of its symbiosis with the nematode host is poorly understood. Insightful analysis of the expression levels of Wolbachia genes predicted to underpin the mutualistic association and of known drug target genes at different life cycle stages or in response to drug treatments is typically challenged by low transcriptomic coverage. Cappable-seq resulted in up to a ~5-fold increase in the number of reads mapping to Wolbachia. On average, coverage of Wolbachia transcripts from B. malayi microfilariae was enriched ~40-fold by Cappable-seq. This method has the additional benefit of selectively removing abundant prokaryotic ribosomal RNAs. The deeper microbial transcriptome sequencing afforded by Cappable-seq facilitates more detailed characterization of gene expression levels of pathogens and symbionts present in animal tissues.
The study of transcriptome profiles in Holstein cows with miscarriage during peri-implantation.
Zhao, Guoli; Li, Yanyan; Kang, Xiaolong; Huang, Liang; Li, Peng; Zhou, Jinghang; Shi, Yuangang
2018-05-31
In this study, the transcriptome profile of cows who experienced miscarriage during peri-implantation was investigated. The transcriptome was profiled by RNA sequencing and then analyzed by bioinformatics methods. The results suggested that serum progesterone levels were significantly decreased in the cows who miscarried compared with the pregnant cows at 18 d, 21 d, 33 d, 39 d and 51 d after artificial insemination. The RNA sequencing results suggested that 32, 176, 5, 10 and 2 differentially expressed genes (DEGs) were identified between the pregnant cows and the cows who miscarried at 18, 21, 33, 39 and 51 d after artificial insemination. Furthermore, the DEGs were analysed with hierarchical clustering and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and 15, 101, 1, 2 and 2 DEGs were upregulated, and 17, 74, 4, 8 and 0 DEGs were downregulated in the cows in the pregnant and miscarriage groups, respectively, at 18, 21, 33, 39 and 51 d after artificial insemination. These DEGs were distributed to 13, 20, 3, 6 and 20 pathways. This analysis has identified genes and pathways crucial for pregnancy and miscarriage in cows.
Kim, Minsuk; Yi, Jeong Sang; Lakshmanan, Meiyappan; Lee, Dong-Yup; Kim, Byung-Gee
2016-03-01
In silico model-driven analysis using genome-scale models of metabolism (GEMs) has been recognized as a promising method for microbial strain improvement. However, most of the current GEM-based strain design algorithms based on flux balance analysis (FBA) rely heavily on the steady-state and optimality assumptions without considering any regulatory information. Thus, their practical usage is quite limited, especially in application to secondary metabolite overproduction. In this study, we developed a transcriptomics-based strain optimization tool (tSOT) in order to overcome such limitations by integrating transcriptomic data into GEM. Initially, we evaluated existing algorithms for integrating transcriptomic data into GEM using a Streptomyces coelicolor dataset, and identified the iMAT algorithm as the only and the best algorithm for characterizing the secondary metabolism of S. coelicolor. Subsequently, we developed the tSOT platform, in which iMAT is adopted to predict the reaction states, and successfully demonstrated its applicability to secondary metabolite overproduction by designing a strain of S. coelicolor that overproduces actinorhodin (ACT), a polyketide antibiotic. Mutants overexpressing tSOT targets such as ribulose 5-phosphate 3-epimerase and NADP-dependent malic enzyme showed 2- and 1.8-fold increases in ACT production, thereby validating the tSOT prediction. It is expected that tSOT can be used for solving other metabolic engineering problems that could not be addressed by current strain design algorithms, especially for secondary metabolite overproduction. © 2015 Wiley Periodicals, Inc.
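For readers unfamiliar with the steady-state and optimality assumptions mentioned above, the standard textbook form of flux balance analysis can be written as the linear program below. The symbols are conventional notation rather than anything taken from the tSOT paper: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective weights (for example, selecting a biomass reaction), and l and u the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& l \le v \le u
\end{aligned}
```

The constraint S v = 0 encodes the steady-state assumption (no net accumulation of any metabolite), and maximizing c^T v encodes the optimality assumption; neither term carries regulatory information, which is the limitation the abstract points to.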
Use of archival resources has been limited to date by inconsistent methods for genomic profiling of degraded RNA from formalin-fixed paraffin-embedded (FFPE) samples. RNA-sequencing offers a promising way to address this problem. Here we evaluated transcriptomic dose responses us...
USDA-ARS's Scientific Manuscript database
Technological developments in both the collection and analysis of molecular genetic data over the past few years have provided new opportunities for an improved understanding of the global response to pathogen exposure. Such developments are particularly dramatic for scientists studying the pig, whe...
Hou, Yu; Guo, Huahu; Cao, Chen; Li, Xianlong; Hu, Boqiang; Zhu, Ping; Wu, Xinglong; Wen, Lu; Tang, Fuchou; Huang, Yanyi; Peng, Jirun
2016-01-01
Single-cell genome, DNA methylome, and transcriptome sequencing methods have been separately developed. However, to accurately analyze the mechanism by which transcriptome, genome and DNA methylome regulate each other, these omic methods need to be performed in the same single cell. Here we demonstrate a single-cell triple omics sequencing technique, scTrio-seq, that can be used to simultaneously analyze the genomic copy-number variations (CNVs), DNA methylome, and transcriptome of an individual mammalian cell. We show that large-scale CNVs cause proportional changes in RNA expression of genes within the gained or lost genomic regions, whereas these CNVs generally do not affect DNA methylation in these regions. Furthermore, we applied scTrio-seq to 25 single cancer cells derived from a human hepatocellular carcinoma tissue sample. We identified two subpopulations within these cells based on CNVs, DNA methylome, or transcriptome of individual cells. Our work offers a new avenue of dissecting the complex contribution of genomic and epigenomic heterogeneities to the transcriptomic heterogeneity within a population of cells. PMID:26902283
FIT: statistical modeling tool for transcriptome dynamics under fluctuating field conditions
Iwayama, Koji; Aisaka, Yuri; Kutsuna, Natsumaro
2017-01-01
Abstract Motivation: Considerable attention has been given to the quantification of environmental effects on organisms. In natural conditions, environmental factors are continuously changing in a complex manner. To reveal the effects of such environmental variations on organisms, transcriptome data in field environments have been collected and analyzed. Nagano et al. proposed a model that describes the relationship between transcriptomic variation and environmental conditions and demonstrated the capability to predict transcriptome variation in rice plants. However, the computational cost of parameter optimization has prevented its wide application. Results: We propose a new statistical model and efficient parameter optimization based on the previous study. We developed and released FIT, an R package that offers functions for parameter optimization and transcriptome prediction. The proposed method achieves comparable or better prediction performance within a shorter computational time than the previous method. The package will facilitate the study of the environmental effects on transcriptomic variation in field conditions. Availability and Implementation: Freely available from CRAN (https://cran.r-project.org/web/packages/FIT/). Contact: anagano@agr.ryukoku.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online PMID:28158396
Salinas, Yasmmyn D.; Shi, YiJun; Greenwood, Michael; Hoe, See Ziau; Murphy, David; Gainer, Harold
2015-01-01
Magnocellular neurons (MCNs) in the hypothalamo-neurohypophysial system (HNS) are highly specialized to release large amounts of arginine vasopressin (Avp) or oxytocin (Oxt) into the blood stream and play critical roles in the regulation of body fluid homeostasis. The MCNs are osmosensory neurons and are excited by exposure to hypertonic solutions and inhibited by hypotonic solutions. The MCNs respond to systemic hypertonic and hypotonic stimulation with large changes in the expression of their Avp and Oxt genes, and microarray studies have shown that these osmotic perturbations also cause large changes in global gene expression in the HNS. In this paper, we examine gene expression in the rat supraoptic nucleus (SON) under normosmotic and chronic salt-loading (SL) conditions, for the first time using “new-generation” RNA sequencing (RNA-Seq) methods. We reliably detect 9,709 genes as present in the SON by RNA-Seq, and 552 of these genes changed in expression as a result of chronic SL. These genes reflect diverse functions, and 42 of these are involved in either transcriptional or translational processes. In addition, we compare the SON transcriptomes resolved by RNA-Seq methods with the SON transcriptomes determined by Affymetrix microarray methods in rats under the same osmotic conditions, and find that there are 6,466 genes present in the SON that are represented in both data sets, although 1,040 of the expressed genes were found only in the microarray data, and 2,762 of the expressed genes were found only in the RNA-Seq data. These data provide the research community with a comprehensive view of the transcriptome in the SON under normosmotic conditions and of the changes in specific gene expression evoked by salt loading. PMID:25897513
Shen, Di; Wang, Haiping; Wu, Qingjun; Lu, Peng; Qiu, Yang; Song, Jiangping; Zhang, Youjun; Li, Xixiang
2013-01-01
Background The diamondback moth (DBM, Plutella xylostella) is a crucifer-specific pest that causes significant crop losses worldwide. Barbarea vulgaris (Brassicaceae) can resist DBM and other herbivorous insects by producing feeding-deterrent triterpenoid saponins. Plant breeders have long aimed to transfer this insect resistance to other crops. However, a lack of knowledge on the biosynthetic pathways and regulatory networks of these insecticidal saponins has hindered their practical application. A pyrosequencing-based transcriptome analysis of B. vulgaris during DBM larval feeding was performed to identify genes and gene networks responsible for saponin biosynthesis and its regulation at the genome level. Principal Findings Approximately 1.22, 1.19, 1.16, 1.23, 1.16, 1.20, and 2.39 giga base pairs of clean nucleotides were generated from B. vulgaris transcriptomes sampled 1, 4, 8, 12, 24, and 48 h after onset of P. xylostella feeding and from non-inoculated controls, respectively. De novo assembly using all data of the seven transcriptomes generated 39,531 unigenes. A total of 37,780 (95.57%) unigenes were annotated, 14,399 of which were assigned to one or more gene ontology terms and 19,620 of which were assigned to 126 known pathways. Expression profiles revealed 2,016–4,685 up-regulated and 557–5,188 down-regulated transcripts. Secondary metabolic pathways, such as those of terpenoids, glucosinolates, and phenylpropanoids, and their related regulators were elevated. Candidate genes for the triterpene saponin pathway were found in the transcriptome. Orthological analysis of the transcriptome with four other crucifer transcriptomes identified 592 B. vulgaris-specific gene families with a P-value cutoff of 1e−5. Conclusion This study presents the first comprehensive transcriptome analysis of B. vulgaris subjected to a series of DBM feedings. 
The biosynthetic and regulatory pathways of triterpenoid saponins and other DBM deterrent metabolites in this plant were classified. The results of this study will provide useful data for future investigations on pest-resistance phytochemistry and plant breeding. PMID:23696897
Bu, Dengpan; Bionaz, Massimo; Wang, Mengzhi; Nan, Xuemei; Ma, Lu; Wang, Jiaqi
2017-01-01
Liver and mammary gland are among the most important organs during lactation in dairy cows. With the purpose of understanding both the different and the complementary roles and the crosstalk of those two organs during lactation, a transcriptome analysis was performed on liver and mammary tissues of 10 primiparous dairy cows in mid-lactation. The analysis was performed using a 4×44K Bovine Agilent microarray chip. The transcriptome difference between the two tissues was analyzed in SAS JMP Genomics by ANOVA with a false discovery rate (FDR) correction. The analysis uncovered >9,000 differentially expressed genes (DEG) between the two tissues with an FDR<0.001. The functional analysis of the DEG uncovered a larger metabolic (especially lipid-related) and inflammatory response capacity in liver compared with mammary tissue, while the mammary tissue had a larger protein synthesis and secretion, proliferation/differentiation, signaling, and innate immune system capacity compared with the liver. A plethora of endogenous compounds, cytokines, and transcription factors were estimated to control the DEG between the two tissues. Compared with mammary tissue, the liver transcriptome appeared to be under control of a large array of ligand-dependent nuclear receptors and, among endogenous chemicals, fatty acids and bacteria-derived compounds. Compared with liver, the transcriptome of the mammary tissue was potentially under control of a large number of growth factors and miRNA. The in silico crosstalk analysis between the two tissues revealed extensive communication with a reciprocal control of lipid metabolism, innate immune system adaptation, and proliferation/differentiation. 
In summary, the transcriptome analysis confirmed previously known differences between liver and mammary tissue, especially the indication of a larger metabolic activity in liver compared with the mammary tissue and the larger protein synthesis, communication, and proliferative capacity in mammary tissue compared with the liver. Relatively novel is the indication by the data that the transcriptome of the liver is highly regulated by dietary and bacteria-related compounds, while the mammary transcriptome is more under control of hormones, growth factors, and miRNA. Extensive crosstalk between the two tissues, with a reciprocal control of metabolism and innate immune adaptation, was indicated by the network analysis, which uncovered previously unknown crosstalk between liver and mammary tissue for several signaling molecules. PMID:28291785
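The abstract above reports DEG at an FDR threshold but does not say which FDR procedure was used. As an illustration only, the following minimal Python sketch shows the widely used Benjamini-Hochberg adjustment; the choice of this procedure, the function name `benjamini_hochberg`, and the example p-values are assumptions and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative only: the study above does not state which FDR
    procedure was applied.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that each adjusted value
    # is never larger than the adjusted value of the next larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene p-values from a two-tissue comparison.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
```

A gene would then be called differentially expressed when its adjusted value falls below the chosen threshold (0.001 in the abstract).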
USDA-ARS's Scientific Manuscript database
Using the Eimeria spp. population that infect chickens as a model for coccidian biology, we aimed to survey the transcriptome of E. maxima and contrast it to the two other Eimeria spp. for which transcriptome data are available, E. tenella and E. acervulina. Examining specifically the asexual intra...
Transcriptome profiling analysis of cultivar-specific apple fruit ripening and texture attributes
USDA-ARS's Scientific Manuscript database
Molecular events regulating cultivar-specific apple fruit ripening and sensory quality are largely unknown. Such knowledge is essential for genomic-assisted apple breeding and postharvest quality management. In this study, transcriptome profile analysis, scanning electron microscopic examination an...
Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi
2017-01-01
We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method, using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are suitable for inferring sparse networks based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, post-transcriptional modification, and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks have a typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, revealing the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Sun, Mei-Yu; Li, Jing-Yi; Li, Dong; Huang, Feng-Jie; Wang, Di; Li, Hui; Xing, Quan; Zhu, Hui-Bin; Shi, Lei
2018-04-12
Drynaria roosii (Nakaike) is a traditional Chinese medicinal fern, known as 'GuSuiBu'. Its effective components, naringin and neoeriocitrin, share highly similar chemical structures and medicinal functions. Our HPLC-MS/MS results showed that the accumulation of naringin/neoeriocitrin depended on specific tissues or ages. However, little was known about the expression patterns of naringin/neoeriocitrin-related genes involved in their regulatory pathways. Lacking basic genetic information, we applied a combination of SMRT sequencing and SGS to generate the complete and full-length transcriptome of D. roosii. According to the SGS data, the DEG-based heat map analysis revealed that naringin/neoeriocitrin-related gene expression exhibited obvious tissue- and time-specific transcriptomic differences. Using the systems biology method of modular organization analysis, we clustered 16,472 DEGs into 17 gene modules and studied the relationships between modules and tissue/time point samples, as well as between modules and naringin/neoeriocitrin contents. Among these, naringin/neoeriocitrin-related DEGs were distributed in nine distinct modules, and DEGs in these modules showed significantly different patterns of transcript abundance linked to specific tissues or ages. Moreover, WGCNA results further identified that PAL, 4CL, C4H and C3H, HCT acted as the major hub genes involved in naringin and neoeriocitrin synthesis, respectively, and exhibited high co-expression with MYB- and bHLH-regulated genes. In this work, modular organization and co-expression networks elucidated the tissue- and time-specificity of gene expression patterns, as well as hub genes associated with naringin/neoeriocitrin synthesis in D. roosii. In addition, the comprehensive transcriptome dataset provides important genetic information for further research on D. roosii.
Expression of interest: transcriptomics and the designation of conservation units.
Hansen, Michael M
2010-05-01
An important task within conservation genetics consists in defining intraspecific conservation units. Most conceptual frameworks involve two steps: (i) identifying demographically independent units, and (ii) evaluating their degree of adaptive divergence. Whereas a plethora of methods are available for delineating genetic population structure, assessment of functional genetic divergence remains a challenge. In this issue, Tymchuk et al. (2010) study Atlantic salmon (Salmo salar) populations using both microsatellite markers and analysis of global gene expression. They show that important gene expression differences exist that can be interpreted in the context of different ecological conditions experienced by the populations, along with the populations' histories. This demonstrates an important potential role of transcriptomics for designating conservation units.
Guedes, Rafael Lucas Muniz; Rodrigues, Carla Monadeli Filgueira; Coatnoan, Nicolas; Cosson, Alain; Cadioli, Fabiano Antonio; Garcia, Herakles Antonio; Gerber, Alexandra Lehmkuhl; Machado, Rosangela Zacarias; Minoprio, Paola Marcella Camargo; Teixeira, Marta Maria Geraldes; de Vasconcelos, Ana Tereza Ribeiro
2018-02-27
Trypanosoma vivax is a parasite widespread across Africa and South America. Immunological methods using recombinant antigens have been developed aiming at specific and sensitive detection of infections caused by T. vivax. Here, we sequenced for the first time the transcriptome of a virulent T. vivax strain (Lins), isolated from an outbreak of severe disease in South America (Brazil) and performed a computational integrated analysis of genome, transcriptome and in silico predictions to identify and characterize putative linear B-cell epitopes from African and South American T. vivax. A total of 2278, 3936 and 4062 linear B-cell epitopes were respectively characterized for the transcriptomes of T. vivax LIEM-176 (Venezuela), T. vivax IL1392 (Nigeria) and T. vivax Lins (Brazil) and 4684 for the genome of T. vivax Y486 (Nigeria). The results presented are a valuable theoretical source that may pave the way for highly sensitive and specific diagnostic tools. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Rojas, Valentina; Jiménez, Héctor; Palma-Millanao, Rubén; González-González, Angélica; Machuca, Juan; Godoy, Ricardo; Ceballos, Ricardo; Mutis, Ana; Venthur, Herbert
2018-04-30
The grapevine moth, Lobesia botrana, is considered a harmful pest for vineyards in Chile as well as in North America and Europe. Currently, monitoring and control methods for L. botrana are based on its main sex pheromone component and are effective at low population densities. In order to improve control methods, antennal olfactory proteins in moths, such as odorant-binding proteins (OBPs) and odorant receptors (ORs), have been studied as promising targets for the discovery of new potent semiochemicals; such proteins have not been reported for L. botrana. Therefore, the objective of this study was to identify the repertoire of proteins related to chemoreception in L. botrana from the antennal transcriptome and to analyze the relative expression of OBPs and chemosensory proteins (CSPs) in male and female antennae. Through next-generation sequencing of the antennal transcriptome on an Illumina HiSeq 2500 we identified a total of 118 chemoreceptors, of which 61, 42 and 15 transcripts are related to ORs, ionotropic receptors (IRs) and gustatory receptors (GRs), respectively. Furthermore, RNA-Seq data revealed 35 transcripts for OBPs and 18 for CSPs. Analysis by qRT-PCR showed 20 OBPs significantly more expressed in female antennae, while 5 were more expressed in males. Similarly, most of the CSPs were significantly more expressed in female than in male antennae. All the olfactory-related sequences were compared with homologs and their phylogenetic relationships elucidated. Finally, our findings in relation to the improvement of L. botrana management are discussed. Copyright © 2018 Elsevier Inc. All rights reserved.
Atia, Jolene; McCloskey, Conor; Shmygol, Anatoly S.; Rand, David A.; van den Berg, Hugo A.; Blanks, Andrew M.
2016-01-01
Uterine smooth muscle cells remain quiescent throughout most of gestation, only generating spontaneous action potentials immediately prior to, and during, labor. This study presents a method that combines transcriptomics with biophysical recordings to characterise the conductance repertoire of these cells, the ‘conductance repertoire’ being the total complement of ion channels and transporters expressed by an electrically active cell. Transcriptomic analysis provides a set of potential electrogenic entities, of which the conductance repertoire is a subset. Each entity within the conductance repertoire was modeled independently and its gating parameter values were fixed using the available biophysical data. The only remaining free parameters were the surface densities for each entity. We characterise the space of combinations of surface densities (density vectors) consistent with experimentally observed membrane potential and calcium waveforms. This yields insights on the functional redundancy of the system as well as its behavioral versatility. Our approach couples high-throughput transcriptomic data with physiological behaviors in health and disease, and provides a formal method to link genotype to phenotype in excitable systems. We accurately predict current densities and chart functional redundancy. For example, we find that to evoke the observed voltage waveform, the BK channel is functionally redundant whereas hERG is essential. Furthermore, our analysis suggests that activation of calcium-activated chloride conductances by intracellular calcium release is the key factor underlying spontaneous depolarisations. PMID:27105427
Comparative genomics reveals conservative evolution of the xylem transcriptome in vascular plants.
Li, Xinguo; Wu, Harry X; Southerton, Simon G
2010-06-21
Background Wood is a valuable natural resource and a major carbon sink. Wood formation is an important developmental process in vascular plants which played a crucial role in plant evolution. Although genes involved in xylem formation have been investigated, the molecular mechanisms of xylem evolution are not well understood. We use comparative genomics to examine evolution of the xylem transcriptome to gain insights into xylem evolution. Results The xylem transcriptome is highly conserved in conifers, but considerably divergent in angiosperms. The functional domains of genes in the xylem transcriptome are moderately to highly conserved in vascular plants, suggesting the existence of a common ancestral xylem transcriptome. Compared to the total transcriptome derived from a range of tissues, the xylem transcriptome is relatively conserved in vascular plants. Of the xylem transcriptome, cell wall genes, ancestral xylem genes, known proteins and transcription factors are relatively more conserved in vascular plants. A total of 527 putative xylem orthologs were identified, which are unevenly distributed across the Arabidopsis chromosomes with eight hot spots observed. Phylogenetic analysis revealed that evolution of the xylem transcriptome has paralleled plant evolution. We also identified 274 conifer-specific xylem unigenes, all of which are of unknown function. These xylem orthologs and conifer-specific unigenes are likely to have played a crucial role in xylem evolution. Conclusions Conifers have highly conserved xylem transcriptomes, while angiosperm xylem transcriptomes are relatively diversified. Vascular plants share a common ancestral xylem transcriptome. The xylem transcriptomes of vascular plants are more conserved than the total transcriptomes. Evolution of the xylem transcriptome has largely followed the trend of plant evolution. PMID:20565927
Schoenfeld, Jonathan; Lessan, Khashayar; Johnson, Nicola A; Charnock-Jones, D Stephen; Evans, Amanda; Vourvouhaki, Ekaterini; Scott, Laurie; Stephens, Richard; Freeman, Tom C; Saidi, Samir A; Tom, Brian; Weston, Gareth C; Rogers, Peter; Smith, Stephen K; Print, Cristin G
2004-01-01
We recently published a review in this journal describing the design, hybridisation and basic data processing required to use gene arrays to investigate vascular biology (Evans et al. Angiogenesis 2003; 6: 93-104). Here, we build on this review by describing a set of powerful and robust methods for the analysis and interpretation of gene array data derived from primary vascular cell cultures. First, we describe the evaluation of transcriptome heterogeneity between primary cultures derived from different individuals, and estimation of the false discovery rate introduced by this heterogeneity and by experimental noise. Then, we discuss the appropriate use of Bayesian t-tests, clustering and independent component analysis to mine the data. We illustrate these principles by analysis of a previously unpublished set of gene array data in which human umbilical vein endothelial cells (HUVEC) cultured in either rich or low-serum media were exposed to vascular endothelial growth factor (VEGF)-A165 or placental growth factor (PlGF)-1(131). We have used Affymetrix U95A gene arrays to map the effects of these factors on the HUVEC transcriptome. These experiments followed a paired design and were biologically replicated three times. In addition, one experiment was repeated using serial analysis of gene expression (SAGE). In contrast to some previous studies, we found that VEGF-A and PlGF consistently regulated only small, non-overlapping and culture media-dependent sets of HUVEC transcripts, despite causing significant cell biological changes.
Comparative transcriptomics of early dipteran development
2013-01-01
Background Modern sequencing technologies have massively increased the amount of data available for comparative genomics. Whole-transcriptome shotgun sequencing (RNA-seq) provides a powerful basis for comparative studies. In particular, this approach holds great promise for emerging model species in fields such as evolutionary developmental biology (evo-devo). Results We have sequenced early embryonic transcriptomes of two non-drosophilid dipteran species: the moth midge Clogmia albipunctata, and the scuttle fly Megaselia abdita. Our analysis includes a third, published, transcriptome for the hoverfly Episyrphus balteatus. These emerging models for comparative developmental studies close an important phylogenetic gap between Drosophila melanogaster and other insect model systems. In this paper, we provide a comparative analysis of early embryonic transcriptomes across species, and use our data for a phylogenomic re-evaluation of dipteran phylogenetic relationships. Conclusions We show how comparative transcriptomics can be used to create useful resources for evo-devo, and to investigate phylogenetic relationships. Our results demonstrate that de novo assembly of short (Illumina) reads yields high-quality, high-coverage transcriptomic data sets. We use these data to investigate deep dipteran phylogenetic relationships. Our results, based on a concatenation of 160 orthologous genes, provide support for the traditional view of Clogmia being the sister group of Brachycera (Megaselia, Episyrphus, Drosophila), rather than that of Culicomorpha (which includes mosquitoes and blackflies). PMID:23432914
USDA-ARS's Scientific Manuscript database
In order to investigate the mechanisms of persistent foot-and-mouth disease virus (FMDV) infection in cattle, transcriptome alterations associated with the FMDV carrier state were characterized using a bovine whole-transcriptome microarray. Eighteen cattle (8 vaccinated with a recombinant FMDV A vac...
USDA-ARS's Scientific Manuscript database
Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yie...
Optimized Probe Masking for Comparative Transcriptomics of Closely Related Species
Poeschl, Yvonne; Delker, Carolin; Trenner, Jana; Ullrich, Kristian Karsten; Quint, Marcel; Grosse, Ivo
2013-01-01
Microarrays are commonly applied to study the transcriptome of specific species. However, many available microarrays are restricted to model organisms, and the design of custom microarrays for other species is often not feasible. Hence, transcriptomics approaches for non-model organisms, as well as comparative transcriptomics studies among two or more species, often rely on cost-intensive RNAseq studies or, alternatively, on hybridizing transcripts of a query species to a microarray of a closely related species. When analyzing these cross-species microarray expression data, differences in the transcriptome of the query species can cause problems, such as the following: (i) lower hybridization accuracy of probes due to mismatches or deletions, (ii) probes binding multiple transcripts of different genes, and (iii) probes binding transcripts of non-orthologous genes. So far, methods for (i) exist, but these neglect (ii) and (iii). Here, we propose an approach for comparative transcriptomics addressing problems (i) to (iii), which retains only transcript-specific probes binding transcripts of orthologous genes. We apply this approach to an Arabidopsis lyrata expression data set measured on a microarray designed for Arabidopsis thaliana, and compare it to two alternative approaches, a sequence-based approach and a genomic DNA hybridization-based approach. We investigate the number of retained probe sets, and we validate the resulting expression responses by qRT-PCR. We find that the proposed approach combines the benefits of sequence-based stringency and accuracy while allowing the expression analysis of many more genes than the alternative sequence-based approach. As an added benefit, the proposed approach requires probes to detect transcripts of orthologous genes only, which provides a superior basis for biological interpretation of the measured expression responses. PMID:24260119
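The filtering idea in the abstract above (keep only probes that bind a single gene's transcripts, and only when that gene is the ortholog of the gene the probe was designed for) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the function name `retain_probes`, the three input mappings, and the toy identifiers are all hypothetical, and the alignment step that would produce `probe_hits` is not shown.

```python
def retain_probes(probe_target, probe_hits, orthologs):
    """Return probes that are transcript-specific and ortholog-matching.

    probe_target: probe id -> reference-species gene the probe was designed for
    probe_hits:   probe id -> set of query-species genes the probe matches
    orthologs:    query-species gene -> orthologous reference-species gene
    All three mappings are hypothetical inputs for this sketch.
    """
    kept = []
    for probe, target in probe_target.items():
        hits = probe_hits.get(probe, set())
        # Drop probes with no acceptable match (problem i) and probes
        # matching several different genes (problem ii).
        if len(hits) != 1:
            continue
        (gene,) = hits
        # Drop probes whose single match is not the ortholog of the
        # gene the probe was designed for (problem iii).
        if orthologs.get(gene) == target:
            kept.append(probe)
    return kept


if __name__ == "__main__":
    # Toy identifiers, invented for illustration.
    probe_target = {"p1": "AT1", "p2": "AT2", "p3": "AT3", "p4": "AT4"}
    probe_hits = {"p1": {"AL1"}, "p2": {"AL2", "AL9"}, "p3": {"AL7"}, "p4": set()}
    orthologs = {"AL1": "AT1", "AL2": "AT2", "AL7": "AT5"}
    print(retain_probes(probe_target, probe_hits, orthologs))
```

In this toy input only `p1` survives: `p2` matches two genes, `p3` matches a non-orthologous gene, and `p4` has no match.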
Chávez-Mardones, Jacqueline; Gallardo-Escárate, Cristian
2015-12-01
Sea lice are one of the main parasites affecting the salmon aquaculture industry, causing significant economic losses worldwide. Increased resistance to traditional chemical treatments has created the need to find alternative control methods. Therefore, the objective of this study was to identify the transcriptome response of the salmon louse Caligus rogercresseyi to the delousing drug deltamethrin (AlphaMax™). Through bioassays with different concentrations of deltamethrin, adult salmon lice transcriptomes were sequenced from cDNA libraries in the MiSeq Illumina platform. A total of 78 million reads for females and males were assembled in 30,212 and 38,536 contigs, respectively. De novo assembly yielded 86,878 high-quality contigs and, based on published data, it was possible to annotate and identify relevant genes involved in several biological processes. RNA-seq analysis in conjunction with heatmap hierarchical clustering evidenced that pyrethroids modify the ectoparasitic transcriptome in adults, affecting molecular processes associated with the nervous system, cuticle formation, oxidative stress, reproduction, and metabolism, among others. Furthermore, sex-related transcriptome differences were evidenced. Specifically, 534 and 1033 exclusive transcripts were identified for males and females, respectively, and 154 were shared between sexes. For males, estradiol 17-beta-dehydrogenase, sphingolipid delta4-desaturase DES1, ketosamine-3-kinase, and arylsulfatase A, among others, were discovered, while for females, vitellogenin 1, glycoprotein G, transaldolase, and nitric oxide synthase were among those identified. The shared transcripts included annotations for tropomyosin, γ-crystallin A, glutamate receptor-metabotropic, glutathione S-transferase, and carboxypeptidase B. The present study reveals that deltamethrin generates a complex transcriptome response in C. rogercresseyi, thus providing valuable genomic information for developing new delousing drugs.
Duan, Xinle; Wang, Kang; Su, Sha; Tian, Ruizheng; Li, Yuting; Chen, Maohua
2017-01-01
The bird cherry-oat aphid, Rhopalosiphum padi (L.), is one of the most abundant aphid pests of cereals and has a global distribution. Next-generation sequencing (NGS) is a rapid and efficient method for developing molecular markers. However, transcriptomic and genomic resources of R. padi have not been investigated. In this study, we used transcriptome information obtained by RNA-Seq to develop polymorphic microsatellites for investigating population genetics in this species. The transcriptome of R. padi was sequenced on an Illumina HiSeq 2000 platform. A total of 114.4 million raw reads with a GC content of 40.03% was generated. The raw reads were cleaned and assembled into 29,467 unigenes with an N50 length of 1,580 bp. Using several public databases, 82.47% of these unigenes were annotated. Of the annotated unigenes, 8,022 were assigned to COG pathways, 9,895 were assigned to GO pathways, and 14,586 were mapped to 257 KEGG pathways. A total of 7,936 potential microsatellites were identified in 5,564 unigenes, 60 of which were selected randomly and amplified using specific primer pairs. Fourteen loci were found to be polymorphic in the four R. padi populations. The transcriptomic data presented herein will facilitate gene discovery, gene analyses, and development of molecular markers for future studies of R. padi and other closely related aphid species.
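The abstract above reports an N50 length for the assembled unigenes. As a minimal sketch of how that statistic is usually defined (the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly size), the function below may help. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/unigene lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the abstract)
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it says nothing about whether the contigs are correct; it is a contiguity summary rather than a quality guarantee.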
Xie, Feng-Yun; Feng, Yu-Long; Wang, Hong-Hui; Ma, Yun-Feng; Yang, Yang; Wang, Yin-Chao; Shen, Wei; Pan, Qing-Jie; Yin, Shen; Sun, Yu-Jiang; Ma, Jun-Yu
2015-01-01
Prior to the mechanization of agriculture and labor-intensive tasks, humans used donkeys (Equus africanus asinus) for farm work and packing. However, as mechanization increased, donkeys have been increasingly raised for meat, milk, and fur in China. To maintain the development of the donkey industry, breeding programs should focus on traits related to these new uses. Compared to conventional marker-assisted breeding plans, genome- and transcriptome-based selection methods are more efficient and effective. To analyze the coding genes of the donkey genome, we assembled the transcriptome of donkey white blood cells de novo. Using transcriptomic deep-sequencing data, we identified 264,714 distinct donkey unigenes and predicted 38,949 protein fragments. We annotated the donkey unigenes by BLAST searches against the non-redundant (NR) protein database. We also compared the donkey protein sequences with those of the horse (E. caballus) and wild horse (E. przewalskii), and linked the donkey protein fragments with mammalian phenotypes. As the outer ear sizes of donkeys and horses are obviously different, we compared the outer ear size-associated proteins in donkeys and horses. We identified three ear size-associated proteins, HIC1, PRKRA, and KMT2A, with sequence differences among the donkey, horse, and wild horse loci. Since the donkey genome sequence has not been released, the de novo assembled donkey transcriptome is helpful for preliminary investigations of donkey cultivars and for genetic improvement. PMID:26208029
RNA-Seq Atlas of Glycine max: a guide to the soybean transcriptome
USDA-ARS's Scientific Manuscript database
A first analysis of the Glycine max (L.) Merr. (soybean) transcriptome using next generation sequencing technology and RNA-Sequencing (RNA-Seq) is presented. This analysis will provide an important resource for understanding transcription and gene co-regulatory networks in soybean, the most economic...
Tools for Genomic and Transcriptomic Analysis of Microbes at Single-Cell Level
Chen, Zixi; Chen, Lei; Zhang, Weiwen
2017-01-01
Microbiologists traditionally study populations rather than individual cells, as it is generally assumed that the status of individual cells will be similar to that observed in the population. However, recent studies have shown that the behavior of individual cells can differ considerably from that of the whole population, suggesting the importance of extending traditional microbiology studies to the single-cell level. With recent technological advances, such as flow cytometry, next-generation sequencing (NGS), and microspectroscopy, single-cell microbiology has greatly enhanced the understanding of individuality and heterogeneity of microbes in many biological systems. Notably, the application of multiple ‘omics’ in single-cell analysis has shed light on how individual cells perceive, respond, and adapt to the environment, how heterogeneity arises under external stress and finally determines the fate of the whole population, and how microbes survive under natural conditions. As single-cell analysis requires no axenic cultivation of the target microorganism, it has also proven to be a valuable tool for dissecting the microbial ‘dark matter.’ In this review, current state-of-the-art tools and methods for genomic and transcriptomic analysis of microbes at the single-cell level are critically summarized, including single-cell isolation methods and experimental strategies of single-cell analysis with NGS. In addition, perspectives on future trends of technology development in the field of single-cell analysis are also presented. PMID:28979258
Fasoli, Marianna; Dal Santo, Silvia; Zenoni, Sara; Tornielli, Giovanni Battista; Farina, Lorenzo; Zamboni, Anita; Porceddu, Andrea; Venturini, Luca; Bicego, Manuele; Murino, Vittorio; Ferrarini, Alberto; Delledonne, Massimo; Pezzotti, Mario
2012-09-01
We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ∼91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.
Díaz, Noelia; Ribas, Laia; Piferrer, Francesc
2014-01-01
Background Food supply is a major factor influencing growth rates in animals. This has important implications for both natural and farmed fish populations, since food restriction may hinder reproduction. However, the effects of food supply on the development of juvenile gonads have never been described at the transcriptional level in fish. Methods and Findings This study investigated the consequences of growth on the gonadal transcriptome of European sea bass in: 1) 4-month-old sexually undifferentiated fish, comparing the gonads of fish with the highest vs. the lowest growth, to explore a possible link between transcriptome and future sex, and 2) testis from 11-month-old juveniles where growth had been manipulated through changes in food supply. The four groups used were: i) sustained fast growth, ii) sustained slow growth, iii) accelerated growth, iv) decelerated growth. The transcriptome of undifferentiated gonads was not drastically affected by initial natural differences in growth. Further, changes in the expression of genes associated with protein turnover were seen, favoring catabolism in slow-growing fish and anabolism in fast-growing fish. Moreover, while fast-growing fish took energy from glucose, as deduced from the pathways affected and the analysis of protein-protein interactions, in slow-growing fish lipid metabolism and gluconeogenesis were favored. Interestingly, the highest transcriptomic differences were found when forcing initially fast-growing fish to decelerate their growth, while accelerating growth of initially slow-growing fish resulted in full transcriptomic convergence with sustained fast-growing fish. Conclusions Food availability during sex differentiation shapes the juvenile testis transcriptome, as evidenced by adaptations to different energy balances. Remarkably, this occurs in the absence of major histological changes in the testis. Thus, fish testes can recover transcriptionally if the fish are provided with enough food during sex differentiation; however, initial fast growth does not represent any advantage in terms of transcriptional fitness if food later becomes scarce. PMID:25340342
Validation of two ribosomal RNA removal methods for microbial metatranscriptomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Shaomei; Wurtzel, Omri; Singh, Kanwar
2010-10-01
The predominance of rRNAs in the transcriptome is a major technical challenge in sequence-based analysis of cDNAs from microbial isolates and communities. Several approaches have been applied to deplete rRNAs from (meta)transcriptomes, but no systematic investigation of potential biases introduced by any of these approaches has been reported. Here we validated the effectiveness and fidelity of the two most commonly used approaches, subtractive hybridization and exonuclease digestion, as well as combinations of these treatments, on two synthetic five-microorganism metatranscriptomes using massively parallel sequencing. We found that the effectiveness of rRNA removal was a function of community composition and RNA integrity for these treatments. Subtractive hybridization alone introduced the least bias in relative transcript abundance, whereas exonuclease and in particular combined treatments greatly compromised mRNA abundance fidelity. Illumina sequencing itself also can compromise quantitative data analysis by introducing a G+C bias between runs.
Single-cell isolation by a modular single-cell pipette for RNA-sequencing.
Zhang, Kai; Gao, Min; Chong, Zechen; Li, Ying; Han, Xin; Chen, Rui; Qin, Lidong
2016-11-29
Single-cell transcriptome sequencing requires a convenient and reliable method to rapidly isolate a live cell into a specific container such as a PCR tube. Here, we report a modular single-cell pipette (mSCP) consisting of three modular components, a SCP-Tip, an air-displacement pipette (ADP), and ADP-Tips, that can be easily assembled, disassembled, and reassembled. By assembling the SCP-Tip containing a hydrodynamic trap, the mSCP can isolate single cells from 5-10 cells per μL of cell suspension. The mSCP is compatible with microscopic identification of captured single cells to finally achieve 100% single-cell isolation efficiency. The isolated live single cells are in submicroliter volumes and well suited for single-cell PCR analysis and RNA-sequencing. The mSCP offers convenience, speed, and high efficiency, making it a powerful tool to isolate single cells for transcriptome analysis.
Li, Yong-Fang; Mahalingam, Ramamurthy; Sunkar, Ramanjulu
2017-01-01
Alteration of gene expression is an essential mechanism that allows plants to respond and adapt to adverse environmental conditions. Transcriptome and proteome analyses in plants exposed to abiotic stresses revealed that protein levels are not correlated with the changes in corresponding mRNAs, indicating that regulation at the translational level is another major layer of gene expression control. Analysis of the translatome, which refers to all mRNAs associated with ribosomes, thus has the potential to bridge the gap between transcriptome and proteome. Polysomal RNA profiling and the recently developed ribosome profiling (Ribo-seq) are the two main methods for translatome analysis at the global level. Here, we describe the classical procedure for polysomal RNA isolation by sucrose gradient ultracentrifugation followed by high-throughput RNA-seq to identify genes regulated at the translational level. Polysomal RNA can be further used for a variety of downstream applications including Northern blot analysis, qRT-PCR, RNase protection assay, and microarray-based gene expression profiling.
Single-cell transcriptomics for microbial eukaryotes.
Kolisko, Martin; Boscaro, Vittorio; Burki, Fabien; Lynn, Denis H; Keeling, Patrick J
2014-11-17
One of the greatest hindrances to a comprehensive understanding of microbial genomics, cell biology, ecology, and evolution is that most microbial life is not in culture. Solutions to this problem have mainly focused on whole-community surveys like metagenomics, but these analyses inevitably lose information and present particular challenges for eukaryotes, which are relatively rare and possess large, gene-sparse genomes. Single-cell analyses present an alternative solution that allows for specific species to be targeted, while retaining information on cellular identity, morphology, and partitioning of activities within microbial communities. Single-cell transcriptomics, pioneered in medical research, offers particular potential advantages for uncultivated eukaryotes, but the efficiency and biases have not been tested. Here we describe a simple and reproducible method for single-cell transcriptomics using manually isolated cells from five model ciliate species; we examine impacts of amplification bias and contamination, and compare the efficacy of gene discovery to traditional culture-based transcriptomics. Gene discovery using single-cell transcriptomes was found to be comparable to mass-culture methods, suggesting single-cell transcriptomics is an efficient entry point into genomic data from the vast majority of eukaryotic biodiversity. Copyright © 2014 Elsevier Ltd. All rights reserved.
Maternal Pre-Pregnancy Obesity Is Associated with Altered Placental Transcriptome.
Altmäe, Signe; Segura, Maria Teresa; Esteban, Francisco J; Bartel, Sabine; Brandi, Pilar; Irmler, Martin; Beckers, Johannes; Demmelmair, Hans; López-Sabater, Carmen; Koletzko, Berthold; Krauss-Etschmann, Susanne; Campoy, Cristina
2017-01-01
Maternal obesity has a major impact on pregnancy outcomes. There is growing evidence that maternal obesity has a negative influence on placental development and function, thereby adversely influencing offspring programming and health outcomes. However, the molecular mechanisms underlying these processes are poorly understood. We analysed the whole transcriptomes of ten term placentas from obese (n = 5) and normal weight women (n = 5), using the Affymetrix microarray platform. Analyses of expression data were carried out using non-parametric methods. Hierarchical clustering and principal component analysis showed a clear distinction in placental transcriptome between obese and normal weight women. We identified 72 differentially regulated genes, with most being down-regulated in obesity (n = 61). Functional analyses of the targets using DAVID and IPA confirm the dysregulation of previously identified processes and pathways in the placenta from obese women, including inflammation and immune responses, lipid metabolism, cancer pathways, and angiogenesis. In addition, we detected new molecular aspects of obesity-derived effects on the placenta, involving the glucocorticoid receptor signalling pathway and dysregulation of several genes including CCL2, FSTL3, IGFBP1, MMP12, PRG2, PRL, QSOX1, SERPINE2 and TAC3. Our global gene expression profiling approach demonstrates that maternal obesity creates a unique in utero environment that impairs the placental transcriptome.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Meta-analysis of pathway enrichment: combining independent and dependent omics data sets.
Kaever, Alexander; Landesfeind, Manuel; Feussner, Kirstin; Morgenstern, Burkhard; Feussner, Ivo; Meinicke, Peter
2014-01-01
A major challenge in current systems biology is the combination and integrative analysis of large data sets obtained from different high-throughput omics platforms, such as mass spectrometry based Metabolomics and Proteomics or DNA microarray or RNA-seq-based Transcriptomics. Especially in the case of non-targeted Metabolomics experiments, where it is often impossible to unambiguously map ion features from mass spectrometry analysis to metabolites, the integration of more reliable omics technologies is highly desirable. A popular method for the knowledge-based interpretation of single data sets is the (Gene) Set Enrichment Analysis. In order to combine the results from different analyses, we introduce a methodical framework for the meta-analysis of p-values obtained from Pathway Enrichment Analysis (Set Enrichment Analysis based on pathways) of multiple dependent or independent data sets from different omics platforms. For dependent data sets, e.g. obtained from the same biological samples, the framework utilizes a covariance estimation procedure based on the nonsignificant pathways in single data set enrichment analysis. The framework is evaluated and applied in the joint analysis of Metabolomics mass spectrometry and Transcriptomics DNA microarray data in the context of plant wounding. In extensive studies of simulated data set dependence, the introduced correlation could be fully reconstructed by means of the covariance estimation based on pathway enrichment. By restricting the range of p-values of pathways considered in the estimation, the overestimation of correlation, which is introduced by the significant pathways, could be reduced. When applying the proposed methods to the real data sets, the meta-analysis was shown not only to be a powerful tool to investigate the correlation between different data sets and summarize the results of multiple analyses but also to distinguish experiment-specific key pathways.
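To make the idea of combining pathway p-values from dependent data sets more concrete, here is a minimal sketch. It assumes Stouffer's Z-score method with a correlation correction, which is one common way to combine dependent one-sided p-values; the abstract does not state which combination rule the framework uses, and it estimates the covariance from nonsignificant pathways, a step not reproduced here. The function name `combine_pvalues` and the `correlation` argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def combine_pvalues(p_values, correlation=None):
    """Combine one-sided p-values (each strictly between 0 and 1).

    Assumption: Stouffer's method. Each p-value is turned into a z-score,
    the z-scores are summed, and the sum is scaled by the standard
    deviation implied by the correlation matrix between data sets.
    With correlation=None the data sets are treated as independent.
    """
    k = len(p_values)
    if correlation is None:
        correlation = [[1.0 if i == j else 0.0 for j in range(k)]
                       for i in range(k)]
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Variance of a sum of unit-variance z-scores = sum of all matrix entries
    variance = sum(sum(row) for row in correlation)
    z_combined = sum(z_scores) / sqrt(variance)
    return 1.0 - _STD_NORMAL.cdf(z_combined)


# Hypothetical pathway p-values from two omics data sets
independent = combine_pvalues([0.05, 0.05])
fully_dependent = combine_pvalues([0.05, 0.05], [[1.0, 1.0], [1.0, 1.0]])
print(independent, fully_dependent)
```

In this sketch, two fully correlated data sets add no evidence beyond a single one, so the combined p-value stays at the input value, whereas independent data sets yield a smaller combined p-value. Ignoring positive correlation would therefore overstate significance.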
USDA-ARS's Scientific Manuscript database
Drought tolerance is a complex trait that is governed by multiple genes. To identify the potential candidate genes, comparative analysis of drought stress-responsive transcriptome between drought-tolerant (Triticum aestivum Cv. C306) and drought-sensitive (Triticum aestivum Cv. WL711) genotypes was ...
USDA-ARS's Scientific Manuscript database
Identification of genes with differential transcript abundance (GDTA) in seedless mutants may enhance understanding of seedless citrus development. Transcriptome analysis was conducted at three time points during early fruit development (Phase 1) of three seedy citrus genotypes: Fallglo [Bower citru...
van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B
2015-01-01
Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.
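The abstract above combines per-study effect sizes with an inverse-variance random-effects model. The sketch below illustrates that kind of pooling for a single gene. It assumes the DerSimonian-Laird estimator for the between-study variance, which the abstract does not specify; the function name `random_effects_pool` and the numbers in the example are hypothetical.

```python
from math import sqrt


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with inverse-variance random-effects weights.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method. Returns (pooled_effect, standard_error, tau2).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]          # fixed-effect weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's own variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, sqrt(1.0 / sum_re), tau2


# Hypothetical log fold changes for one gene in two studies
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.01, 0.01])
print(pooled, se, tau2)
```

When the studies agree closely, tau^2 is estimated as zero and the result reduces to the ordinary fixed-effect inverse-variance average; when they disagree, the extra variance widens the standard error and evens out the weights.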
Anway, Matthew D.; Skinner, Michael K.
2018-01-01
PURPOSE The ability of an endocrine disruptor exposure during gonadal sex determination to promote a transgenerational prostate disease phenotype was investigated in the current study. METHODS Exposure of an F0 gestating female rat to the endocrine disruptor vinclozolin during F1 embryo gonadal sex determination promoted a transgenerational adult onset prostate disease phenotype. The prostate disease phenotype and physiological parameters were determined for males from F1 to F4 generations and the prostate transcriptome was assessed in the F3 generation. RESULTS Although the prostate in prepubertal animals develops normally, abnormalities involving epithelial cell atrophy, glandular dysgenesis, prostatitis, and hyperplasia of the ventral prostate develop in older animals. The ventral prostate phenotype was transmitted for four generations (F1–F4). Analysis of the ventral prostate transcriptome demonstrated 954 genes had significantly altered expression between control and vinclozolin F3 generation animals. Analysis of isolated ventral prostate epithelial cells identified 259 genes with significantly altered expression between control and vinclozolin F3 generation animals. Characterization of regulated genes demonstrated several cellular pathways were influenced, including calcium and WNT. A number of genes identified have been shown to be associated with prostate disease and cancer, including beta-microseminoprotein (Msp) and tumor necrosis factor receptor superfamily 6 (Fadd). CONCLUSIONS The ability of an endocrine disruptor to promote transgenerational prostate abnormalities appears to involve an epigenetic transgenerational alteration in the prostate transcriptome and male germ-line. Potential epigenetic transgenerational alteration of prostate gene expression by environmental compounds may be important to consider in the etiology of adult onset prostate disease. PMID:18220299
Gao, Yuan; He, Xiaoli; Wu, Bin; Long, Qiliang; Shao, Tianwei; Wang, Zi; Wei, Jianhe; Li, Yong; Ding, Wanlong
2016-01-01
Panax ginseng C. A. Meyer is a highly valued medicinal plant. Cylindrocarpon destructans is a destructive pathogen that causes root rot and significantly reduces the quality and yield of P. ginseng. However, an efficient method to control root rot remains unavailable because of insufficient understanding of the molecular mechanism underlying C. destructans-P. ginseng interaction. In this study, C. destructans-induced transcriptomes at different time points were investigated using RNA sequencing (RNA-Seq). De novo assembly produced 73,335 unigenes for the P. ginseng transcriptome after C. destructans infection, in which 3,839 unigenes were up-regulated. Notably, the abundance of the up-regulated unigenes sharply increased at 0.5 d postinoculation to provide effector-triggered immunity. In total, 24 of 26 randomly selected unigenes were validated using quantitative reverse transcription (qRT)-PCR. Gene ontology enrichment analysis of these unigenes showed that "defense response to fungus", "defense response" and "response to stress" were enriched. In addition, differentially expressed transcription factors involved in the hormone signaling pathways after C. destructans infection were identified. Finally, differentially expressed unigenes involved in reactive oxygen species and the ginsenoside biosynthetic pathway during C. destructans infection were identified. To our knowledge, this study is the first to report on the dynamic transcriptome triggered by C. destructans. These results improve our understanding of disease resistance in P. ginseng and provide a useful resource for quick detection of induced markers in P. ginseng before the comprehensive outbreak of this disease caused by C. destructans.
Molinaro, Alyssa M; Pearson, Bret J
2016-04-27
The planarian Schmidtea mediterranea is a master regenerator with a large adult stem cell compartment. The lack of transgenic labeling techniques in this animal has hindered the study of lineage progression and has made understanding the mechanisms of tissue regeneration a challenge. However, recent advances in single-cell transcriptomics and analysis methods allow for the discovery of novel cell lineages as differentiation progresses from stem cell to terminally differentiated cell. Here we apply pseudotime analysis and single-cell transcriptomics to identify adult stem cells belonging to specific cellular lineages and identify novel candidate genes for future in vivo lineage studies. We purify 168 single stem and progeny cells from the planarian head, which were subjected to single-cell RNA sequencing (scRNAseq). Pseudotime analysis with Waterfall and gene set enrichment analysis predicts a molecularly distinct neoblast sub-population with neural character (νNeoblasts) as well as a novel alternative lineage. Using the predicted νNeoblast markers, we demonstrate that a novel proliferative stem cell population exists adjacent to the brain. scRNAseq coupled with in silico lineage analysis offers a new approach for studying lineage progression in planarians. The lineages identified here are extracted from a highly heterogeneous dataset with minimal prior knowledge of planarian lineages, demonstrating that lineage purification by transgenic labeling is not a prerequisite for this approach. The identification of the νNeoblast lineage demonstrates the usefulness of the planarian system for computationally predicting cellular lineages in an adult context coupled with in vivo verification.
2010-01-01
Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2- ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. 
We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918
Brain transcriptome atlases: a computational perspective.
Mahfouz, Ahmed; Huisman, Sjoerd M H; Lelieveldt, Boudewijn P F; Reinders, Marcel J T
2017-05-01
The immense complexity of the mammalian brain is largely reflected in the underlying molecular signatures of its billions of cells. Brain transcriptome atlases provide valuable insights into gene expression patterns across different brain areas throughout the course of development. Such atlases allow researchers to probe the molecular mechanisms which define neuronal identities, neuroanatomy, and patterns of connectivity. Despite the immense effort put into generating such atlases, to answer fundamental questions in neuroscience, an even greater effort is needed to develop methods to probe the resulting high-dimensional multivariate data. We provide a comprehensive overview of the various computational methods used to analyze brain transcriptome atlases.
Single-cell transcriptome conservation in cryopreserved cells and tissues.
Guillaumet-Adkins, Amy; Rodríguez-Esteban, Gustavo; Mereu, Elisabetta; Mendez-Lago, Maria; Jaitin, Diego A; Villanueva, Alberto; Vidal, August; Martinez-Marti, Alex; Felip, Enriqueta; Vivancos, Ana; Keren-Shaul, Hadas; Heath, Simon; Gut, Marta; Amit, Ido; Gut, Ivo; Heyn, Holger
2017-03-01
A variety of single-cell RNA preparation procedures have been described. So far, protocols require fresh material, which hinders complex study designs. We describe a sample preservation method that maintains transcripts in viable single cells, allowing one to disconnect time and place of sampling from subsequent processing steps. We sequence single-cell transcriptomes from >1000 fresh and cryopreserved cells using 3'-end and full-length RNA preparation methods. Our results confirm that the conservation process did not alter transcriptional profiles. This substantially broadens the scope of applications in single-cell transcriptomics and could lead to a paradigm shift in future study designs.
2014-01-01
Background Arabidopsis thaliana, a member of the Brassicaceae family, is the dominant genetic model plant. However, while the flowers within the Brassicaceae members are rather uniform, mainly radially symmetrical, mostly white with fixed organ numbers, species within the Cleomaceae, the sister family to the Brassicaceae, show a more variable floral morphology. We were interested in understanding the molecular basis for these morphological differences. To this end, the floral transcriptome of a hybrid Tarenaya hassleriana, a Cleomaceae with monosymmetric, bright purple flowers, was sequenced, annotated and analyzed with respect to floral regulators. Results We obtained a comprehensive floral transcriptome with high depth and coverage close to saturation, as assessed by rarefaction analysis, a method well known in biodiversity studies. Gene expression was analyzed by calculating reads per kilobase gene model per million reads (RPKM), and for selected genes the in silico expression data were corroborated by qRT-PCR analysis. Candidate transcription factors were identified based on differences in expression pattern between A. thaliana and T. hassleriana, which are likely key regulators of the T. hassleriana specific floral characters such as coloration and male sterility in the hybrid plant used. Analysis of lineage specific genes was carried out with members of the fabids and malvids. Conclusions The floral transcriptome of T. hassleriana provides insights into key pathways involved in the regulation of late anthocyanin biosynthesis, male fertility, flowering time and organ growth regulation, which are unique traits compared with the model organism A. thaliana. Analysis of lineage specific genes carried out with members of the fabids and malvids suggests an extensive gene birth rate in the lineage leading to core Brassicales while only few genes were potentially lost during core Brassicales evolution, which possibly reflects the result of the At-β whole genome duplication.
Our analysis should facilitate further analyses into the molecular mechanisms of floral morphogenesis and pigmentation and the mechanisms underlying the rather diverse floral morphologies in the Cleomaceae. PMID:24548348
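The RPKM measure named in this abstract can be made concrete with a short Python sketch of the standard definition (read count divided by gene length in kilobases and by mapped reads in millions). The function name and the example numbers are hypothetical; they are not taken from the study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)

# Hypothetical gene: 500 reads on a 2,000 bp gene model,
# in a library of 10 million mapped reads.
example = rpkm(500, 2_000, 10_000_000)
```

Dividing by both gene length and library size is what lets expression values be compared between genes of different lengths and between libraries of different depths.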
Transcription Profiling Analysis of Mango–Fusarium mangiferae Interaction
Liu, Feng; Wu, Jing-bo; Zhan, Ru-lin; Ou, Xiong-chang
2016-01-01
Malformation caused by Fusarium mangiferae is one of the most destructive mango diseases affecting the canopy and floral development, leading to dramatic reduction in fruit yield. To further understand the mechanism of interaction between mango and F. mangiferae, we monitored the transcriptome profiles of buds from susceptible mango plants, which were challenged with F. mangiferae. More than 99 million reads were obtained by RNA-sequencing and were assembled into 121,267 unigenes. Based on the sequence similarity searches, 61,706 unigenes were identified, of which 21,273 and 50,410 were assigned to gene ontology categories and clusters of orthologous groups, respectively, and 33,243 were mapped to 119 KEGG pathways. Differentially expressed genes (DEGs) of mango were detected, with 15,830, 26,061, and 20,146 DEGs at 45, 75, and 120 days after infection, respectively. The analysis of the comparative transcriptome suggests that basic defense mechanisms play important roles in disease resistance. The data also show the transcriptional responses of interactions between mango and the pathogen and more drastic changes in the host transcriptome in response to the pathogen. These results could be used to develop new methods to broaden the resistance of mango to malformation, including the over-expression of key mango genes. PMID:27683574
Dong, Qiongye; Wei, Lei; Zhang, Michael Q; Wang, Xiaowo
2018-06-24
Dysregulation of mRNA splicing has been observed in certain cellular senescence processes. However, the common splicing alterations on the whole transcriptome shared by various types of senescence are poorly understood. In order to systematically identify senescence-associated transcriptomic changes on a genome-wide scale, we collected RNA sequencing datasets of different human cell types with a variety of senescence-inducing methods from public databases and performed a meta-analysis. First, we discovered that a group of RNA binding proteins were consistently down-regulated in diverse senescent samples and identified 406 senescence-associated common differential splicing events. Then, eight differentially expressed RNA binding proteins were predicted to regulate these senescence-associated splicing alterations through an enrichment analysis of their RNA binding information, including motif scanning and enhanced cross-linking immunoprecipitation data. In addition, we constructed the splicing regulatory modules that might contribute to senescence-associated biological processes. Finally, it was confirmed that knockdown of the predicted senescence-associated potential splicing regulators through shRNAs in the HepG2 cell line could result in senescence-like splicing changes. Taken together, our work demonstrated a broad range of common changes in mRNA splicing switches and detected their central regulatory RNA binding proteins during senescence. These findings would help to better understand the coordinating splicing alterations in cellular senescence.
Choi, Sun Young; Park, Byeonghyeok; Choi, In-Geol; Sim, Sang Jun; Lee, Sun-Mi; Um, Youngsoon; Woo, Han Min
2016-01-01
The development of high-throughput technology using RNA-seq has allowed understanding of cellular mechanisms and regulations of bacterial transcription. In addition, transcriptome analysis with RNA-seq has been used to accelerate strain improvement through systems metabolic engineering. Synechococcus elongatus PCC 7942, a photosynthetic bacterium, has remarkable potential for biochemical and biofuel production due to photoautotrophic cell growth and direct CO2 conversion. Here, we performed a transcriptome analysis of S. elongatus PCC 7942 using RNA-seq to understand the changes of cellular metabolism and regulation for nitrogen starvation responses. As a result, differentially expressed genes (DEGs) were identified and functionally categorized. With mapping onto metabolic pathways, we probed transcriptional perturbation and regulation of carbon and nitrogen metabolisms relating to nitrogen starvation responses. Experimental evidence such as chlorophyll a and phycobilisome content and the measurement of CO2 uptake rate validated the transcriptome analysis. The analysis suggests that S. elongatus PCC 7942 reacts to nitrogen starvation by not only rearranging the cellular transport capacity involved in carbon and nitrogen assimilation pathways but also by reducing protein synthesis and photosynthesis activities. PMID:27488818
Transcriptomics of cortical gray matter thickness decline during normal aging.
Kochunov, P; Charlesworth, J; Winkler, A; Hong, L E; Nichols, T E; Curran, J E; Sprooten, E; Jahanshad, N; Thompson, P M; Johnson, M P; Kent, J W; Landman, B A; Mitchell, B; Cole, S A; Dyer, T D; Moses, E K; Goring, H H H; Almasy, L; Duggirala, R; Olvera, R L; Glahn, D C; Blangero, J
2013-11-15
We performed a whole-transcriptome correlation analysis, followed by pathway enrichment analysis and testing of innate immune response pathways, to evaluate the hypothesis that transcriptional activity can predict cortical gray matter thickness (GMT) variability during normal cerebral aging. Transcriptome and GMT data were available for 379 individuals (age range=28-85), community-dwelling members of large extended Mexican American families. Collection of transcriptome data preceded that of neuroimaging data by 17 years. Genome-wide gene transcriptome data consisted of 20,413 heritable lymphocyte-based transcripts. GMT measurements were performed from high-resolution (isotropic 800 μm) T1-weighted MRI. Transcriptome-wide and pathway enrichment analysis was used to classify genes correlated with GMT. Transcripts for sixty genes from seven innate immune pathways were tested as specific predictors of GMT variability. Transcripts for eight genes (IGFBP3, LRRN3, CRIP2, SCD, IDS, TCF4, GATA3, and HN1) passed the transcriptome-wide significance threshold. Four orthogonal factors extracted from this set predicted 31.9% of the variability in the whole-brain and between 23.4 and 35% of regional GMT measurements. Pathway enrichment analysis identified six functional categories including cellular proliferation, aggregation, differentiation, viral infection, and metabolism. The integrin signaling pathway was significantly (p<10^-6) enriched with GMT. Finally, three innate immune pathways (complement signaling, toll-receptors and scavenger and immunoglobulins) were significantly associated with GMT. Expression activity for the genes that regulate cellular proliferation, adhesion, differentiation and inflammation can explain a significant proportion of individual variability in cortical GMT. Our findings suggest that normal cerebral aging is the product of a progressive decline in regenerative capacity and increased neuroinflammation. Copyright © 2013 Elsevier Inc. All rights reserved.
Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.
Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel
2015-08-07
The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. 
Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic adaptation in foxes. Fat metabolism seems to play a central role in the adaptation of Arctic foxes to the cold climate, as has been identified in the polar bear, another arctic specialist.
DOGMA: domain-based transcriptome and proteome quality assessment.
Dohmen, Elias; Kremer, Lukas P M; Bornberg-Bauer, Erich; Kemena, Carsten
2016-09-01
Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary a lot. Therefore, quality assessment of assemblies and annotations is a crucial aspect of genome analysis pipelines. We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA provides a very fast way to do quality assessment within seconds. DOGMA is implemented in Python and published under the GNU GPL v.3 license. The source code is available at https://ebbgit.uni-muenster.de/domainWorld/DOGMA/ Contact: e.dohmen@wwu.de or c.kemena@wwu.de. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
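To make the idea of domain-based completeness concrete, here is a minimal Python sketch that assumes completeness is scored as the percentage of an expected core set of conserved domains found in the data. This is an assumption made for illustration only; DOGMA's actual scoring may differ, and the function name and domain identifiers below are hypothetical placeholders.

```python
def domain_completeness(expected_domains, observed_domains):
    """Percentage of expected conserved domains that appear in the observed set."""
    expected = set(expected_domains)
    if not expected:
        raise ValueError("expected_domains must not be empty")
    found = expected & set(observed_domains)
    return 100.0 * len(found) / len(expected)

# Hypothetical core set and hypothetical annotation of a transcriptome.
core = ["domain_A", "domain_B", "domain_C", "domain_D"]
observed = ["domain_A", "domain_C", "domain_D", "domain_X"]

score = domain_completeness(core, observed)  # 3 of the 4 core domains are present
```

Domains outside the core set (here `domain_X`) do not affect the score, so the measure reflects missing conserved content and ignores extra annotations.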
Hsiang, Chien-Yun; Chen, Yueh-Sheng; Ho, Tin-Yun
2009-06-01
Establishment of a comprehensive platform for the assessment of host-biomaterial interaction in vivo is an important issue. Nuclear factor-kappaB (NF-kappaB) is an inducible transcription factor that is activated by numerous stimuli. Therefore, NF-kappaB-dependent luminescent signal in transgenic mice carrying the luciferase genes was used as the guide to monitor the biomaterials-affected organs, and transcriptomic analysis was further applied to evaluate the complex host responses in affected organs in this study. In vivo imaging showed that genipin-cross-linked gelatin conduit (GGC) implantation evoked strong NF-kappaB activity at 6 h in the implanted region, and transcriptomic analysis showed that the expressions of interleukin-6 (IL-6), IL-24, and IL-1 family were up-regulated. A strong luminescent signal was observed in the spleen on day 14, suggesting that GGC implantation might elicit biological events in the spleen. Transcriptomic analysis of spleen showed that 13 Kyoto Encyclopedia of Genes and Genomes pathways belonging to cell cycles, immune responses, and metabolism were significantly altered by GGC implants. Connectivity Map analysis suggested that the gene signatures of GGC were similar to those of compounds that affect lipid or glucose metabolism. GeneSetTest analysis further showed that host responses to GGC implants might be related to disease states, especially metabolic and cardiovascular diseases. In conclusion, our data provided a concept of molecular imaging-guided transcriptomic platform for the evaluation and the prediction of host-biomaterial interaction in vivo.
The Anopheles gambiae transcriptome - a turning point for malaria control.
Domingos, A; Pinheiro-Silva, R; Couto, J; do Rosário, V; de la Fuente, J
2017-04-01
Mosquitoes are important vectors of several pathogens and thereby contribute to the spread of diseases, with social, economic and public health impacts. Amongst the approximately 450 species of Anopheles, about 60 are recognized as vectors of human malaria, the most important parasitic disease. In Africa, Anopheles gambiae is the main malaria vector mosquito. Current malaria control strategies are largely focused on drugs and vector control measures such as insecticides and bed-nets. Improvement of current, and the development of new, mosquito-targeted malaria control methods rely on a better understanding of mosquito vector biology. An organism's transcriptome is a reflection of its physiological state, and transcriptomic analyses of different conditions that are relevant to mosquito vector competence can therefore yield important information. Transcriptomic analyses have contributed significant information on processes such as blood-feeding, parasite-vector interaction, insecticide resistance, and tissue- and stage-specific gene regulation, thereby facilitating the path towards the development of new malaria control methods. Here, we discuss the main applications of transcriptomic analyses in An. gambiae that have led to a better understanding of mosquito vector competence. © 2017 The Royal Entomological Society.
Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P
2012-01-01
The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961
High-throughput full-length single-cell mRNA-seq of rare cells.
Ooi, Chin Chun; Mantalas, Gary L; Koh, Winston; Neff, Norma F; Fuchigami, Teruaki; Wong, Dawson J; Wilson, Robert J; Park, Seung-Min; Gambhir, Sanjiv S; Quake, Stephen R; Wang, Shan X
2017-01-01
Single-cell characterization techniques, such as mRNA-seq, have been applied to a diverse range of applications in cancer biology, yielding great insight into mechanisms leading to therapy resistance and tumor clonality. While single-cell techniques can yield a wealth of information, a common bottleneck is the lack of throughput, with many current processing methods being limited to the analysis of small volumes of single cell suspensions with cell densities on the order of 10^7 per mL. In this work, we present a high-throughput full-length mRNA-seq protocol incorporating a magnetic sifter and magnetic nanoparticle-antibody conjugates for rare cell enrichment, and Smart-seq2 chemistry for sequencing. We evaluate the efficiency and quality of this protocol with a simulated circulating tumor cell system, whereby non-small-cell lung cancer cell lines (NCI-H1650 and NCI-H1975) are spiked into whole blood, before being enriched for single-cell mRNA-seq by EpCAM-functionalized magnetic nanoparticles and the magnetic sifter. We obtain high efficiency (> 90%) capture and release of these simulated rare cells via the magnetic sifter, with reproducible transcriptome data. In addition, while mRNA-seq data is typically only used for gene expression analysis of transcriptomic data, we demonstrate the use of full-length mRNA-seq chemistries like Smart-seq2 to facilitate variant analysis of expressed genes. This enables the use of mRNA-seq data for differentiating cells in a heterogeneous population by both their phenotypic and variant profile. In a simulated heterogeneous mixture of circulating tumor cells in whole blood, we utilize this high-throughput protocol to differentiate these heterogeneous cells by both their phenotype (lung cancer versus white blood cells), and mutational profile (H1650 versus H1975 cells), in a single sequencing run.
This high-throughput method can help facilitate single-cell analysis of rare cell populations, such as circulating tumor or endothelial cells, with demonstrably high-quality transcriptomic data.
Celedon, Jose M; Yuen, Macaire M S; Chiang, Angela; Henderson, Hannah; Reid, Karen E; Bohlmann, Jörg
2017-11-01
Plant defenses often involve specialized cells and tissues. In conifers, specialized cells of the bark are important for defense against insects and pathogens. Using laser microdissection, we characterized the transcriptomes of cortical resin duct cells, phenolic cells and phloem of white spruce (Picea glauca) bark under constitutive and methyl jasmonate (MeJa)-induced conditions, and we compared these transcriptomes with the transcriptome of the bark tissue complex. Overall, ~3700 bark transcripts were differentially expressed in response to MeJa. Approximately 25% of transcripts were expressed in only one cell type, revealing cell specialization at the transcriptome level. MeJa caused cell-type-specific transcriptome responses and changed the overall patterns of cell-type-specific transcript accumulation. Comparison of transcriptomes of the conifer bark tissue complex and specialized cells resolved a masking effect inherent to transcriptome analysis of complex tissues, and showed the actual cell-type-specific transcriptome signatures. Characterization of cell-type-specific transcriptomes is critical to reveal the dynamic patterns of spatial and temporal display of constitutive and induced defense systems in a complex plant tissue or organ. This was demonstrated with the improved resolution of spatially restricted expression of sets of genes of secondary metabolism in the specialized cell types. © 2017 The Authors The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
Cheng, Yunqing; Liu, Jianfeng; Zhang, Huidi; Wang, Ju; Zhao, Yixin; Geng, Wanting
2015-01-01
A high ratio of blank fruit in hazelnut (Corylus heterophylla Fisch) is a very common phenomenon that causes serious yield losses in northeast China. The development of blank fruit in the Corylus genus is known to be associated with embryo abortion. However, little is known about the molecular mechanisms responsible for embryo abortion during the nut development stage. Genomic information for C. heterophylla Fisch is not available; therefore, data related to transcriptome and gene expression profiling of developing and abortive ovules are needed. In this study, de novo transcriptome sequencing and RNA-seq analysis were conducted using short-read sequencing technology (Illumina HiSeq 2000). The results of the transcriptome assembly analysis revealed genetic information that was associated with the fruit development stage. Two digital gene expression libraries were constructed, one for a full (normally developing) ovule and one for an empty (abortive) ovule. Transcriptome sequencing and assembly results revealed 55,353 unigenes, including 18,751 clusters and 36,602 singletons. These results were annotated using the public databases NR, NT, Swiss-Prot, KEGG, COG, and GO. Using digital gene expression profiling, gene expression differences in developing and abortive ovules were identified. A total of 1,637 and 715 unigenes were significantly upregulated and downregulated, respectively, in abortive ovules, compared with developing ovules. Quantitative real-time polymerase chain reaction analysis was used in order to verify the differential expression of some genes. The transcriptome and digital gene expression profiling data of normally developing and abortive ovules in hazelnut provide exhaustive information that will improve our understanding of the molecular mechanisms of abortive ovule formation in hazelnut.
Impact of Transcriptomics on Our Understanding of Pulmonary Fibrosis
Vukmirovic, Milica; Kaminski, Naftali
2018-01-01
Idiopathic pulmonary fibrosis (IPF) is a lethal fibrotic lung disease characterized by aberrant remodeling of the lung parenchyma with extensive changes to the phenotypes of all lung resident cells. The introduction of transcriptomics, genome scale profiling of thousands of RNA transcripts, caused a significant inversion in IPF research. Instead of generating hypotheses based on animal models of disease, or biological plausibility, with limited validation in humans, investigators were able to generate hypotheses based on unbiased molecular analysis of human samples and then use animal models of disease to test their hypotheses. In this review, we describe the insights made from transcriptomic analysis of human IPF samples. We describe how transcriptomic studies led to identification of novel genes and pathways involved in the human IPF lung such as: matrix metalloproteinases, WNT pathway, epithelial genes, role of microRNAs among others, as well as conceptual insights such as the involvement of developmental pathways and deep shifts in epithelial and fibroblast phenotypes. The impact of lung and transcriptomic studies on disease classification, endotype discovery, and reproducible biomarkers is also described in detail. Despite these impressive achievements, the impact of transcriptomic studies has been limited because they analyzed bulk tissue and did not address the cellular and spatial heterogeneity of the IPF lung. We discuss new emerging technologies and applications, such as single-cell RNAseq and microenvironment analysis that may address cellular and spatial heterogeneity. We end by making the point that most current tissue collections and resources are not amenable to analysis using the novel technologies. To take advantage of the new opportunities, we need new efforts of sample collections, this time focused on access to all the microenvironments and cells in the IPF lung. PMID:29670881
APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data.
Ye, Congting; Long, Yuqi; Ji, Guoli; Li, Qingshun Quinn; Wu, Xiaohui
2018-06-01
Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such an unprecedented collection of RNA-seq data by new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites. We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usage between conditions. Extensive comparisons of APAtrap with two other recent methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organism with an annotated genome. APAtrap is freely available for download at https://apatrap.sourceforge.io. Contact: liqq@xmu.edu.cn or xhuister@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
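The general idea of locating a poly(A) site from RNA-seq coverage with a mean squared error criterion can be sketched as follows. This is a deliberately simplified single-change-point illustration and not APAtrap's actual model: it assumes per-base coverage along a 3' UTR and picks the split whose two constant segments fit the data with the lowest mean squared error. The function name and the coverage values are hypothetical.

```python
def best_split(coverage):
    """Return (index, mse) of the split minimising the two-segment squared error."""
    def sse(segment):
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two positions")
    best = None
    for i in range(1, n):  # split falls between positions i-1 and i
        mse = (sse(coverage[:i]) + sse(coverage[i:])) / n
        if best is None or mse < best[1]:
            best = (i, mse)
    return best

# Hypothetical per-base coverage along a 3' UTR: coverage drops after position 4,
# which would be consistent with a poly(A) site being used there.
coverage = [20, 20, 20, 20, 5, 5, 5, 5]
split, mse = best_split(coverage)
```

A method that handles more than two poly(A) sites, as the abstract describes, would need to search for several change points, not just one; that extension is outside this sketch.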
Epigenetic transgenerational inheritance of somatic transcriptomes and epigenetic control regions
2012-01-01
Background Environmentally induced epigenetic transgenerational inheritance of adult onset disease involves a variety of phenotypic changes, suggesting a general alteration in genome activity. Results Investigation of different tissue transcriptomes in male and female F3 generation vinclozolin versus control lineage rats demonstrated all tissues examined had transgenerational transcriptomes. The microarrays from 11 different tissues were compared with a gene bionetwork analysis. Although each tissue transgenerational transcriptome was unique, common cellular pathways and processes were identified between the tissues. A cluster analysis identified gene modules with coordinated gene expression and each had unique gene networks regulating tissue-specific gene expression and function. A large number of statistically significant over-represented clusters of genes were identified in the genome for both males and females. These gene clusters ranged from 2-5 megabases in size, and a number of them corresponded to the epimutations previously identified in sperm that transmit the epigenetic transgenerational inheritance of disease phenotypes. Conclusions Combined observations demonstrate that all tissues derived from the epigenetically altered germ line develop transgenerational transcriptomes unique to the tissue, but common epigenetic control regions in the genome may coordinately regulate these tissue-specific transcriptomes. This systems biology approach provides insight into the molecular mechanisms involved in the epigenetic transgenerational inheritance of a variety of adult onset disease phenotypes. PMID:23034163
Transcriptome Analysis of Lactococcus lactis in Coculture with Saccharomyces cerevisiae
Maligoy, Mathieu; Mercade, Myriam; Cocaign-Bousquet, Muriel; Loubiere, Pascal
2008-01-01
The study of microbial interactions in mixed cultures remains an important conceptual and methodological challenge for which transcriptome analysis could prove to be the essential method for improving our understanding. However, the use of whole-genome DNA chips is often restricted to the pure culture of the species for which the chips were designed. In this study, massive cross-hybridization was observed between the foreign cDNA and the specific Lactococcus lactis DNA chip. A very simple method is proposed to considerably decrease this nonspecific hybridization, consisting of adding the microbial partner's DNA. A correlation was established between the resulting cross-hybridization and the phylogenetic distance between the microbial partners. The response of L. lactis to the presence of Saccharomyces cerevisiae was analyzed during the exponential growth phase in fermentors under defined growth conditions. Although no differences between growth kinetics were observed for the pure and the mixed cultures of L. lactis, the mRNA levels of 158 genes were significantly modified. More particularly, a strong reorientation of pyrimidine metabolism was observed when L. lactis was grown in mixed cultures. These changes in transcript abundance were demonstrated to be regulated by the ethanol produced by the yeast and were confirmed by an independent method (quantitative reverse transcription-PCR). PMID:17993564
Gonzalez, Sergio; Clavijo, Bernardo; Rivarola, Máximo; Moreno, Patricio; Fernandez, Paula; Dopazo, Joaquín; Paniego, Norma
2017-02-22
In recent years, applications based on massively parallelized RNA sequencing (RNA-seq) have become valuable approaches for studying non-model species, e.g., those without a fully sequenced genome. RNA-seq is a useful tool for detecting novel transcripts and genetic variations and for evaluating differential gene expression by digital measurements. The large and complex datasets resulting from functional genomic experiments represent a challenge in data processing, management, and analysis. This problem is especially significant for small research groups working with non-model species. We developed a web-based application, called ATGC transcriptomics, with a flexible and adaptable interface that allows users to work with next-generation sequencing (NGS) transcriptomic analysis results using an ontology-driven database. This new application simplifies data exploration, visualization, and integration for a better comprehension of the results. ATGC transcriptomics gives non-expert computer users and small research groups access to a scalable storage option and simple data integration, including database administration and management. The software is freely available under the terms of the GNU public license at http://atgcinta.sourceforge.net.
Global Transcriptome Analysis of Staphylococcus aureus Response to Hydrogen Peroxide
Chang, Wook; Small, David A.; Toghrol, Freshteh; Bentley, William E.
2006-01-01
Staphylococcus aureus responds with protective strategies against phagocyte-derived reactive oxidants to infect humans. Herein, we report the transcriptome analysis of the cellular response of S. aureus to hydrogen peroxide-induced oxidative stress. The data indicate that the oxidative response includes the induction of genes involved in virulence, DNA repair, and notably, anaerobic metabolism. PMID:16452450
Nookaew, Intawat; Papini, Marta; Pornputtapong, Natapol; Scalcinati, Gionata; Fagerberg, Linn; Uhlén, Matthias; Nielsen, Jens
2012-01-01
RNA-seq has recently become an attractive method of choice in the study of transcriptomes, promising several advantages compared with microarrays. In this study, we sought to assess the contribution of the different analytical steps involved in the analysis of RNA-seq data generated with the Illumina platform, and to perform a cross-platform comparison based on the results obtained through Affymetrix microarray. As a case study for our work, we used the Saccharomyces cerevisiae strain CEN.PK 113-7D, grown under two different conditions (batch and chemostat). Here, we assess the influence of genetic variation on the estimation of gene expression level using three different aligners for read-mapping (Gsnap, Stampy and TopHat) on the S288c genome, the capabilities of five different statistical methods to detect differential gene expression (baySeq, Cuffdiff, DESeq, edgeR and NOISeq), and we explore the consistency between RNA-seq analysis using a reference genome and a de novo assembly approach. High reproducibility among biological replicates (correlation ≥0.99) and high consistency between the two platforms for analysis of gene expression levels (correlation ≥0.91) are reported. The results from differential gene expression identification derived from the different statistical methods, as well as their integrated analysis results based on gene ontology annotation, are in good agreement. Overall, our study provides a useful and comprehensive comparison between the two platforms (RNA-seq and microarrays) for gene expression analysis and addresses the contribution of the different steps involved in the analysis of RNA-seq data. PMID:22965124
Mykles, Donald L.; Burnett, Karen G.; Durica, David S.; Joyce, Blake L.; McCarthy, Fiona M.; Schmidt, Carl J.; Stillman, Jonathon H.
2016-01-01
High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the “Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology” symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. PMID:27639274
Ma, Chuang; Wang, Xiangfeng
2012-09-01
One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses.
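For readers unfamiliar with the Gini correlation mentioned above, the sketch below computes the textbook form GCC(X, Y) = cov(X, rank(Y)) / cov(X, rank(X)) in plain Python. This is a minimal illustration under the assumption that the standard rank-based definition applies. The helper names are hypothetical, the toy vectors are made up, and the code is not taken from the rsgcc package.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def gini_correlation(x, y):
    """Gini correlation of x with respect to y (not symmetric in x and y)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    denom = covariance(x, average_ranks(x))
    if denom == 0:
        raise ValueError("x is constant; Gini correlation is undefined")
    return covariance(x, average_ranks(y)) / denom


# Made-up expression vectors for a hypothetical regulator and target gene.
regulator = [1.0, 2.0, 3.0, 10.0]
target = [1.0, 3.0, 2.0, 4.0]
print(gini_correlation(regulator, target), gini_correlation(target, regulator))
```

Because the measure mixes the values of one variable with the ranks of the other, the two directions generally differ, so an analysis has to decide how to combine or choose between GCC(X, Y) and GCC(Y, X).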
Ma, Chuang; Wang, Xiangfeng
2012-01-01
One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Despite several correlation methods that are applied in biology to analyze microarray data, concerns regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology have been raised. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey’s biweight correlation. The Gini correlation method, with the other four evaluated methods in this study, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses. PMID:22797655
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm
Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.
2015-01-01
There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes. PMID:25806041
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm
Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; ...
2015-03-10
There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.
Single-feature polymorphism discovery in the barley transcriptome
Rostoks, Nils; Borevitz, Justin O; Hedley, Peter E; Russell, Joanne; Mudie, Sharon; Morris, Jenny; Cardle, Linda; Marshall, David F; Waugh, Robbie
2005-01-01
A probe-level model for analysis of GeneChip gene-expression data is presented which identified more than 10,000 single-feature polymorphisms (SFP) between two barley genotypes. The method has good sensitivity, as 67% of known single-nucleotide polymorphisms (SNP) were called as SFPs. This method is applicable to all oligonucleotide microarray data, accounts for SNP effects in gene-expression data and represents an efficient and versatile approach for highly parallel marker identification in large genomes. PMID:15960806
Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng
2014-01-01
Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used a ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154
Transcriptomic Analysis of Phenotypic Changes in Birch (Betula platyphylla) Autotetraploids
Mu, Huai-Zhi; Liu, Zi-Jia; Lin, Lin; Li, Hui-Yu; Jiang, Jing; Liu, Gui-Feng
2012-01-01
Plant breeders have focused much attention on polyploid trees because of their importance to forestry. To evaluate the impact of intraspecies genome duplication on the transcriptome, a series of Betula platyphylla autotetraploids and diploids were generated from four full-sib families. The phenotypes and transcriptomes of these autotetraploid individuals were compared with those of diploid trees. Autotetraploids were generally superior in breast-height diameter, volume, leaf, fruit and stoma and were generally inferior in height compared to diploids. Transcriptome data revealed numerous changes in gene expression attributable to autotetraploidization, which resulted in the upregulation of 7052 unigenes and the downregulation of 3658 unigenes. Pathway analysis revealed that the biosynthesis and signal transduction of indoleacetate (IAA) and ethylene were altered after genome duplication, which may have contributed to phenotypic changes. These results shed light on variations in birch autotetraploidization and help identify important genes for the genetic engineering of birch trees. PMID:23202935
De novo Assembly and Analysis of the Chilean Pencil Catfish Trichomycterus areolatus Transcriptome
Schulze, Thomas T.; Ali, Jonathan M.; Bartlett, Maggie L.; McFarland, Madalyn M.; Clement, Emalie J.; Won, Harim I.; Sanford, Austin G.; Monzingo, Elyssa B.; Martens, Matthew C.; Hemsley, Ryan M.; Kumar, Sidharta; Gouin, Nicolas; Kolok, Alan S.; Davis, Paul H.
2016-01-01
Trichomycterus areolatus is an endemic species of pencil catfish that inhabits the riffles and rapids of many freshwater ecosystems of Chile. Despite its unique adaptation to Chile's high gradient watersheds and therefore potential application in the investigation of ecosystem integrity and environmental contamination, relatively little is known regarding the molecular biology of this environmental sentinel. Here, we detail the assembly of the Trichomycterus areolatus transcriptome, a molecular resource for the study of this organism and its molecular response to the environment. RNA-Seq reads were obtained by next-generation sequencing with an Illumina® platform and processed using PRINSEQ. The transcriptome assembly was performed using the TRINITY assembler. Transcriptome validation was performed by functional characterization with KOG, KEGG, and GO analyses. Additionally, differential expression analysis highlights sex-specific expression patterns, and a list of endocrine and oxidative stress related transcripts is included. PMID:27672404
Cañas, Rafael A; Feito, Isabel; Fuente-Maqueda, José Francisco; Ávila, Concepción; Majada, Juan; Cánovas, Francisco M
2015-11-06
Maritime pine (Pinus pinaster Aiton) grows in a range of different climates in the southwestern Mediterranean region and the existence of a variety of latitudinal ecotypes or provenances is well established. In this study, we have conducted a deep analysis of the transcriptome in needles from two P. pinaster provenances, Leiria (Portugal) and Tamrabta (Morocco), which were grown in northern Spain under the same conditions. An oligonucleotide microarray (PINARRAY3) and RNA-Seq were used for whole-transcriptome analyses, and we found that 90.95% of the data were concordant between the two platforms. Furthermore, the two methods identified very similar percentages of differentially expressed genes with values of 5.5% for PINARRAY3 and 5.7% for RNA-Seq. In total, 6,023 transcripts were shared and 88 differentially expressed genes overlapped in the two platforms. Among the differentially expressed genes, all transport related genes except aquaporins were expressed at higher levels in Tamrabta than in Leiria. In contrast, genes involved in secondary metabolism were expressed at higher levels in Tamrabta, and photosynthesis-related genes were expressed more highly in Leiria. The genes involved in light sensing in plants were well represented in the differentially expressed groups of genes. In addition, increased levels of hormones such as abscisic acid, gibberellins, jasmonic and salicylic acid were observed in Leiria. Both transcriptome platforms have proven to be useful resources, showing complementary and reliable results. The results presented here highlight the different abilities of the two maritime pine populations to sense environmental conditions and reveal one type of regulation that can be ascribed to different genetic and epigenetic backgrounds.
Identifier mapping performance for integrating transcriptomics and proteomics experimental results
2011-01-01
Background Studies integrating transcriptomic data with proteomic data can illuminate the proteome more clearly than either separately. Integromic studies can deepen understanding of the dynamic complex regulatory relationship between the transcriptome and the proteome. Integrating these data dictates a reliable mapping between the identifier nomenclature resultant from the two high-throughput platforms. However, this kind of analysis is well known to be hampered by lack of standardization of identifier nomenclature among proteins, genes, and microarray probe sets. Therefore data integration may also play a role in critiquing the fallible gene identifications that both platforms emit. Results We compared three freely available internet-based identifier mapping resources for mapping UniProt accessions (ACCs) to Affymetrix probesets identifications (IDs): DAVID, EnVision, and NetAffx. Liquid chromatography-tandem mass spectrometry analyses of 91 endometrial cancer and 7 noncancer samples generated 11,879 distinct ACCs. For each ACC, we compared the retrieval sets of probeset IDs from each mapping resource. We confirmed a high level of discrepancy among the mapping resources. On the same samples, mRNA expression was available. Therefore, to evaluate the quality of each ACC-to-probeset match, we calculated proteome-transcriptome correlations, and compared the resources presuming that better mapping of identifiers should generate a higher proportion of mapped pairs with strong inter-platform correlations. A mixture model for the correlations fitted well and supported regression analysis, providing a window into the performance of the mapping resources. The resources have added and dropped matches over two years, but their overall performance has not changed. Conclusions The methods presented here serve to achieve concrete context-specific insight, to support well-informed decisions in choosing an ID mapping strategy for "omic" data merging. PMID:21619611
Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique
Li, Chaozheng; Weng, Shaoping; Chen, Yonggui; Yu, Xiaoqiang; Lü, Ling; Zhang, Haiqing; He, Jianguo; Xu, Xiaopeng
2012-01-01
Background Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimp in the world, has been attracting extensive studies, which require more and more genome background knowledge. The currently available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. Methodology/Principal Findings This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% could be matched in the NCBI Nr database, 37.3% in Swissprot, and 44.1% in TrEMBL. Using BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. Conclusions/Significance The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei. PMID:23071809
2012-01-01
Background Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event. Results We have sequenced the transcriptome of common carp using 454 pyrosequencing. After assembling the 454 contigs and the published common carp sequences together, we obtained 49,669 contigs and identified genes using homology searches and an ab initio method. We identified 4,651 orthologous pairs between common carp and zebrafish and found 129,984 paralogous pairs within the common carp. An estimation of the synonymous substitution rate in the orthologous pairs indicated that common carp and zebrafish diverged 120 million years ago (MYA). We identified one round of genome duplication in common carp and estimated that it had occurred 5.6 to 11.3 MYA. In zebrafish, no genome duplication event after speciation was observed, suggesting that, compared to zebrafish, common carp had undergone an additional genome duplication event. We annotated the common carp contigs with Gene Ontology terms and KEGG pathways. Compared with zebrafish gene annotations, we found that a set of biological processes and pathways were enriched in common carp. Conclusions The assembled contigs helped us to estimate the time of the fourth-round of genome duplication in common carp. The resource that we have built as part of this study will help advance functional genomics and genome annotation studies in the future. PMID:22424280
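The dating step in the abstract above rests on a molecular-clock relationship: two lineages that split T years ago each accumulate synonymous substitutions independently, so T = Ks / (2r) for a per-site, per-year rate r. The Python sketch below shows only that arithmetic. The function name and the input values are hypothetical placeholders, since the abstract does not report the Ks values or the rate the authors used.

```python
def divergence_time_mya(ks, rate_per_site_per_year):
    """Estimate divergence time in millions of years from T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between two sequences.
    rate_per_site_per_year: assumed synonymous substitution rate r.
    """
    if ks < 0:
        raise ValueError("Ks cannot be negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * rate_per_site_per_year)
    return years / 1e6


# Placeholder inputs chosen for easy arithmetic, not taken from the study.
example_ks = 0.1
example_rate = 5e-9
print(f"{divergence_time_mya(example_ks, example_rate):.1f} MYA")
```

The factor of two reflects that substitutions accrue along both branches since the split. The same formula applies to orthologous pairs (speciation) and paralogous pairs (duplication), and the estimate scales inversely with whichever rate is assumed.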
Transcriptome analysis by strand-specific sequencing of complementary DNA
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-01-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-10-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
Transcriptome Analysis of Spartina pectinata in Response to Freezing Stress
Nah, Gyoungju; Lee, Moonsub; Kim, Do-Soon; Rayburn, A. Lane; Voigt, Thomas; Lee, D. K.
2016-01-01
Prairie cordgrass (Spartina pectinata), a perennial C4 grass native to the North American prairie, has several distinctive characteristics that potentially make it a model crop for production in stressful environments. However, little is known about the transcriptome dynamics of prairie cordgrass despite its unique freezing stress tolerance. Therefore, the purpose of this work was to explore the transcriptome dynamics of prairie cordgrass in response to freezing stress at -5°C for 5 min and 30 min. We used an RNA-sequencing method to assemble the S. pectinata leaf transcriptome and performed gene-expression profiling of the transcripts under freezing treatment. Six differentially expressed gene (DEG) groups were categorized from the profiling. In addition, two major consecutive orders of gene expression were observed in response to freezing, the first being the acute up-regulation of genes involved in plasma membrane modification, calcium-mediated signaling, proteasome-related proteins, and transcription regulators (e.g., MYB and WRKY). The second, follow-up response involved genes encoding the putative anti-freezing protein and the previously known DNA and cell-damage-repair proteins. Moreover, we identified the genes involved in epigenetic regulation and circadian-clock expression. Our results indicate that freezing response in S. pectinata reflects dynamic changes in rapid-time duration, as well as in metabolic, transcriptional, post-translational, and epigenetic regulation. PMID:27032112
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-02-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Zhu, Li-Ping; Yue, Xin-Jing; Han, Kui; Li, Zhi-Feng; Zheng, Lian-Shuai; Yi, Xiu-Nan; Wang, Hai-Long; Zhang, You-Ming; Li, Yue-Zhong
2015-07-22
Exotic genes, especially clustered multiple genes for a complex pathway, are normally integrated into the chromosome for heterologous expression. The influences of insertion sites on heterologous expression and allotropic expressions of exotic genes on the host remain mostly unclear. We compared the integration and expression efficiencies of single and multiple exotic genes that were inserted into the Myxococcus xanthus genome by transposition and attB-site-directed recombination. While site-directed integration gave a rather stable chloramphenicol acetyl transferase (CAT) activity, transposition produced varied CAT enzyme activities. We attempted to integrate the 56-kb gene cluster for the biosynthesis of the antitumor polyketides epothilones into the M. xanthus genome by site-directed integration but failed, which was determined to be due to the insertion size limitation at the attB site. The transposition technique produced many recombinants with varied epothilone production capabilities, which, however, did not parallel the transcriptional characteristics of the local sites where the genes were integrated. Comparative transcriptomics analysis demonstrated that the allopatric integrations caused selective changes in host transcriptomes, leading to varied expression of epothilone genes in different mutants. As the size of the inserted fragment increases, transposition is a more practicable integration method for the expression of exotic genes. Allopatric integrations selectively change host transcriptomes, which leads to varied expression efficiencies of exotic genes.
Transcriptomic analysis of flower development in wintersweet (Chimonanthus praecox).
Liu, Daofeng; Sui, Shunzhao; Ma, Jing; Li, Zhineng; Guo, Yulong; Luo, Dengpan; Yang, Jianfeng; Li, Mingyang
2014-01-01
Wintersweet (Chimonanthus praecox) is familiar as a garden plant and woody ornamental flower. On account of its unique flowering time and strong fragrance, it has a high ornamental and economic value. Despite a long history of human cultivation, our understanding of wintersweet genetics and molecular biology remains scant, reflecting a lack of basic genomic and transcriptomic data. In this study, we assembled three cDNA libraries from three successive stages in flower development, designated as the flower bud with displayed petal, open flower and senescing flower stages. Using the Illumina RNA-Seq method, we obtained 21,412,928, 26,950,404 and 24,912,954 qualified Illumina reads, respectively, for the three successive stages. The pooled reads from all three libraries were then assembled into 106,995 transcripts, 51,793 of which were annotated in the NCBI non-redundant protein database. Of these annotated sequences, 32,649 and 21,893 transcripts were assigned to gene ontology categories and clusters of orthologous groups, respectively. We could map 15,587 transcripts onto 312 pathways using the Kyoto Encyclopedia of Genes and Genomes pathway database. Based on these transcriptomic data, we obtained a large number of candidate genes that were differentially expressed at the open flower and senescing flower stages. An analysis of differentially expressed genes involved in plant hormone signal transduction pathways indicated that although flower opening and senescence may be independent of the ethylene signaling pathway in wintersweet, salicylic acid may be involved in the regulation of flower senescence. We also succeeded in isolating key genes of floral scent biosynthesis and proposed a biosynthetic pathway for monoterpenes and sesquiterpenes in wintersweet flowers, based on the annotated sequences. This comprehensive transcriptomic analysis presents fundamental information on the genes and pathways involved in flower development in wintersweet. Our data also provide a useful database for further research on wintersweet and other plants of the family Calycanthaceae. PMID:24489818
Wu, Qing-jun; Wang, Shao-li; Yang, Xin; Yang, Ni-na; Li, Ru-mei; Jiao, Xiao-guo; Pan, Hui-peng; Liu, Bai-ming; Su, Qi; Xu, Bao-yun; Hu, Song-nian; Zhou, Xu-guo; Zhang, You-jun
2012-01-01
Background Bemisia tabaci (Gennadius) is a phloem-feeding insect poised to become one of the major insect pests in open field and greenhouse production systems throughout the world. The high level of resistance to insecticides is a main factor that hinders continued use of insecticides for suppression of B. tabaci. Despite its prevalence, little is known about B. tabaci at the genome level. To fill this gap, an invasive B. tabaci B biotype was subjected to pyrosequencing-based transcriptome analysis to identify genes and gene networks putatively involved in various physiological and toxicological processes. Methodology and Principal Findings Using Roche 454 pyrosequencing, 857,205 reads containing approximately 340 megabases were obtained from the B. tabaci transcriptome. De novo assembly generated 178,669 unigenes, including 30,980 from insects, 17,881 from bacteria, and 129,808 with no hits. A total of 50,835 (28.45%) unigenes showed similarity to the non-redundant database in GenBank with a cut-off E-value of 10^-5. Among them, 40,611 unigenes were assigned to one or more GO terms and 6,917 unigenes were assigned to 288 known pathways. De novo metatranscriptome analysis revealed highly diverse bacterial symbionts in B. tabaci, and demonstrated the host-symbiont cooperation in amino acid production. In-depth transcriptome analysis identified putative molecular markers, and genes potentially involved in insecticide resistance and nutrient digestion. The utility of this transcriptome was validated by a thiamethoxam resistance study, in which annotated cytochrome P450 genes were significantly overexpressed in the resistant B. tabaci in comparison to its susceptible counterparts. Conclusions This transcriptome/metatranscriptome analysis sheds light on the molecular understanding of symbiosis and insecticide resistance in an agriculturally important phloem-feeding insect pest, and lays the foundation for future functional genomics research of the B. tabaci complex. Moreover, the current pyrosequencing effort greatly enriches the existing whitefly EST database, and makes RNAseq a viable option for future genomic analysis. PMID:22558125
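As a quick consistency check on the figures quoted in this abstract, the following minimal Python sketch recomputes the unigene total and the annotated fraction. The variable names are illustrative only and are not taken from the paper; the numbers are those reported above.

```python
# Unigene counts as reported in the abstract (variable names are illustrative).
insect, bacteria, no_hit = 30_980, 17_881, 129_808
total_unigenes = 178_669
annotated = 50_835  # unigenes with similarity to the non-redundant database

# The three categories should add up to the reported total.
category_sum = insect + bacteria + no_hit

# Fraction of unigenes with a database hit, reported as 28.45%.
fraction_annotated = annotated / total_unigenes
print(f"sum of categories: {category_sum}")
print(f"annotated fraction: {fraction_annotated:.2%}")
```

Both reported values are internally consistent: the categories sum to 178,669 and the annotated fraction rounds to 28.45%.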
Comprehensive comparative analysis of 5'-end RNA-sequencing methods.
Adiconis, Xian; Haber, Adam L; Simmons, Sean K; Levy Moonshine, Ami; Ji, Zhe; Busby, Michele A; Shi, Xi; Jacques, Justin; Lancaster, Madeline A; Pan, Jen Q; Regev, Aviv; Levin, Joshua Z
2018-06-04
Specialized RNA-seq methods are required to identify the 5' ends of transcripts, which are critical for studies of gene regulation, but these methods have not been systematically benchmarked. We directly compared six such methods, including the performance of five methods on a single human cellular RNA sample and a new spike-in RNA assay that helps circumvent challenges resulting from uncertainties in annotation and RNA processing. We found that the 'cap analysis of gene expression' (CAGE) method performed best for mRNA and that most of its unannotated peaks were supported by evidence from other genomic methods. We applied CAGE to eight brain-related samples and determined sample-specific transcription start site (TSS) usage, as well as a transcriptome-wide shift in TSS usage between fetal and adult brain.
Li, Bojiang; Dong, Chao; Li, Pinghua; Ren, Zhuqing; Wang, Han; Yu, Fengxiang; Ning, Caibo; Liu, Kaiqing; Wei, Wei; Huang, Ruihua; Chen, Jie; Wu, Wangjun; Liu, Honglin
2016-10-17
Meat color is considered to be the most important indicator of meat quality; however, the molecular mechanisms underlying traits related to meat color remain mostly unknown. In this study, to elucidate the molecular basis of meat color, we constructed six cDNA libraries from biceps femoris (Bf) and soleus (Sol), which exhibit obvious differences in meat color, and analyzed the whole-transcriptome differences between Bf (white muscle) and Sol (red muscle) using high-throughput sequencing technology. Using the DESeq2 method, we identified 138 differentially expressed genes (DEGs) between Bf and Sol. Using the DEGseq method, we identified 770, 810, and 476 DEGs in comparisons between Bf and Sol in three separate animals. Of these DEGs, 52 were overlapping DEGs. Using these data, we determined the enriched GO terms, metabolic pathways and candidate genes associated with meat color traits. Additionally, we mapped 114 non-redundant DEGs to the meat color QTLs via a comparative analysis with the porcine quantitative trait loci (QTL) database. Overall, our data serve as a valuable resource for identifying genes whose functions are critical for meat color traits and can accelerate studies of the molecular mechanisms of meat color formation. PMID:27748458
Mendes, Filipa; Sieuwerts, Sander; de Hulster, Erik; Almering, Marinka J. H.; Luttik, Marijke A. H.; Pronk, Jack T.; Smid, Eddy J.; Bron, Peter A.
2013-01-01
Mixed populations of Saccharomyces cerevisiae yeasts and lactic acid bacteria occur in many dairy, food, and beverage fermentations, but knowledge about their interactions is incomplete. In the present study, interactions between Saccharomyces cerevisiae and Lactobacillus delbrueckii subsp. bulgaricus, two microorganisms that co-occur in kefir fermentations, were studied during anaerobic growth on lactose. By combining physiological and transcriptome analysis of the two strains in the cocultures, five mechanisms of interaction were identified. (i) Lb. delbrueckii subsp. bulgaricus hydrolyzes lactose, which cannot be metabolized by S. cerevisiae, to galactose and glucose. Subsequently, galactose, which cannot be metabolized by Lb. delbrueckii subsp. bulgaricus, is excreted and provides a carbon source for yeast. (ii) In pure cultures, Lb. delbrueckii subsp. bulgaricus grows only in the presence of increased CO2 concentrations. In anaerobic mixed cultures, the yeast provides this CO2 via alcoholic fermentation. (iii) Analysis of amino acid consumption from the defined medium indicated that S. cerevisiae supplied alanine to the bacterium. (iv) A mild but significant low-iron response in the yeast transcriptome, identified by DNA microarray analysis, was consistent with the chelation of iron by the lactate produced by Lb. delbrueckii subsp. bulgaricus. (v) Transcriptome analysis of Lb. delbrueckii subsp. bulgaricus in mixed cultures showed an overrepresentation of transcripts involved in lipid metabolism, suggesting either a competition of the two microorganisms for fatty acids or a response to the ethanol produced by S. cerevisiae. This study demonstrates that chemostat-based transcriptome analysis is a powerful tool to investigate microbial interactions in mixed populations. PMID:23872557
ZHANG, YAFANG; CROFTON, ELIZABETH J.; FAN, XIUZHEN; LI, DINGGE; KONG, FANPING; SINHA, MALA; LUXON, BRUCE A.; SPRATT, HEIDI M.; LICHTI, CHERYL F.; GREEN, THOMAS A.
2016-01-01
Transcriptomic and proteomic approaches have separately proven effective at identifying novel mechanisms affecting addiction-related behavior; however, it is difficult to prioritize the many promising leads from each approach. A convergent secondary analysis of proteomic and transcriptomic results can glean additional information to help prioritize promising leads. The current study is a secondary analysis of the convergence of recently published separate transcriptomic and proteomic analyses of nucleus accumbens (NAc) tissue from rats subjected to environmental enrichment vs. isolation and cocaine self-administration vs. saline. Multiple bioinformatics approaches (e.g. Gene Ontology (GO) analysis, Ingenuity Pathway Analysis (IPA), and Gene Set Enrichment Analysis (GSEA)) were used to interrogate these rich data sets. Although there was little correspondence between mRNA vs. protein at the individual target level, good correspondence was found at the level of gene/protein sets, particularly for the environmental enrichment manipulation. These data identify gene sets where there is a positive relationship between changes in mRNA and protein (e.g. glycolysis, ATP synthesis, translation elongation factor activity, etc.) and gene sets where there is an inverse relationship (e.g. ribosomes, Rho GTPase signaling, protein ubiquitination, etc.). Overall environmental enrichment produced better correspondence than cocaine self-administration. The individual targets contributing to mRNA and protein effects were largely not overlapping. As a whole, these results confirm that robust transcriptomic and proteomic data sets can provide similar results at the gene/protein set level even when there is little correspondence at the individual target level and little overlap in the targets contributing to the effects. PMID:27717806
Mendes, Filipa; Sieuwerts, Sander; de Hulster, Erik; Almering, Marinka J H; Luttik, Marijke A H; Pronk, Jack T; Smid, Eddy J; Bron, Peter A; Daran-Lapujade, Pascale
2013-10-01
Mixed populations of Saccharomyces cerevisiae yeasts and lactic acid bacteria occur in many dairy, food, and beverage fermentations, but knowledge about their interactions is incomplete. In the present study, interactions between Saccharomyces cerevisiae and Lactobacillus delbrueckii subsp. bulgaricus, two microorganisms that co-occur in kefir fermentations, were studied during anaerobic growth on lactose. By combining physiological and transcriptome analysis of the two strains in the cocultures, five mechanisms of interaction were identified. (i) Lb. delbrueckii subsp. bulgaricus hydrolyzes lactose, which cannot be metabolized by S. cerevisiae, to galactose and glucose. Subsequently, galactose, which cannot be metabolized by Lb. delbrueckii subsp. bulgaricus, is excreted and provides a carbon source for yeast. (ii) In pure cultures, Lb. delbrueckii subsp. bulgaricus grows only in the presence of increased CO2 concentrations. In anaerobic mixed cultures, the yeast provides this CO2 via alcoholic fermentation. (iii) Analysis of amino acid consumption from the defined medium indicated that S. cerevisiae supplied alanine to the bacterium. (iv) A mild but significant low-iron response in the yeast transcriptome, identified by DNA microarray analysis, was consistent with the chelation of iron by the lactate produced by Lb. delbrueckii subsp. bulgaricus. (v) Transcriptome analysis of Lb. delbrueckii subsp. bulgaricus in mixed cultures showed an overrepresentation of transcripts involved in lipid metabolism, suggesting either a competition of the two microorganisms for fatty acids or a response to the ethanol produced by S. cerevisiae. This study demonstrates that chemostat-based transcriptome analysis is a powerful tool to investigate microbial interactions in mixed populations.
2015-01-01
Background Investigations into novel biomarkers using omics techniques generate large amounts of data. Due to their size and numbers of attributes, these data are suitable for analysis with machine learning methods. A key component of typical machine learning pipelines for omics data is feature selection, which is used to reduce the raw high-dimensional data into a tractable number of features. Feature selection needs to balance the objective of using as few features as possible, while maintaining high predictive power. This balance is crucial when the goal of data analysis is the identification of highly accurate but small panels of biomarkers with potential clinical utility. In this paper we propose a heuristic for the selection of very small feature subsets, via an iterative feature elimination process that is guided by rule-based machine learning, called RGIFE (Rule-guided Iterative Feature Elimination). We use this heuristic to identify putative biomarkers of osteoarthritis (OA), articular cartilage degradation and synovial inflammation, using both proteomic and transcriptomic datasets. Results and discussion Our RGIFE heuristic increased the classification accuracies achieved for all datasets when no feature selection is used, and performed well in a comparison with other feature selection methods. Using this method the datasets were reduced to a smaller number of genes or proteins, including those known to be relevant to OA, cartilage degradation and joint inflammation. The results have shown the RGIFE feature reduction method to be suitable for analysing both proteomic and transcriptomics data. Methods that generate large ‘omics’ datasets are increasingly being used in the area of rheumatology. Conclusions Feature reduction methods are advantageous for the analysis of omics data in the field of rheumatology, as the applications of such techniques are likely to result in improvements in diagnosis, treatment and drug discovery. PMID:25923811
Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang
2014-01-01
Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and of cells treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, a BLASTX search revealed that 20,631 genes showed significant (E-values ≤ 10^-5) sequence similarity with proteins from the NR database. A total of 1,374 expressed sequence tags were identified as differentially expressed between the 12 h SNP-treated cells and the control cells: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. Together, the transcriptome analysis, gene expression quantification, triterpenoid content and defensive enzyme activities indicated that NO has a significant effect on many processes, including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661
2011-01-01
Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes FileMaker Pro database management runtime application and it is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
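For readers unfamiliar with the normalization step mentioned in this abstract, here is a minimal Python sketch of standard quantile normalization across samples that measure the same genes. It is not TRAM's code, and it does not implement the "scaled quantile" variant, whose details the abstract does not give; the function name and the toy data are assumptions made for illustration.

```python
def quantile_normalize(samples):
    """Standard quantile normalization (illustrative, not TRAM's variant).

    samples: list of lists, one list of expression values per sample,
    all measured on the same genes in the same order.
    Ties are broken by position, which is a simplification.
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must cover the same number of genes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression.
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]  # replace value by the reference quantile
        normalized.append(out)
    return normalized


toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]  # two samples, three genes
result = quantile_normalize(toy)
print(result)
```

After normalization every sample shares the same value distribution while each gene keeps its within-sample rank. The standard method requires equal gene counts per sample, which is presumably the limitation the scaled variant addresses for platforms with very different numbers of genes.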
Principles for circadian orchestration of metabolic pathways.
Thurley, Kevin; Herbst, Christopher; Wesener, Felix; Koller, Barbara; Wallach, Thomas; Maier, Bert; Kramer, Achim; Westermark, Pål O
2017-02-14
Circadian rhythms govern multiple aspects of animal metabolism. Transcriptome-, proteome- and metabolome-wide measurements have revealed widespread circadian rhythms in metabolism governed by a cellular genetic oscillator, the circadian core clock. However, it remains unclear if and under which conditions transcriptional rhythms cause rhythms in particular metabolites and metabolic fluxes. Here, we analyzed the circadian orchestration of metabolic pathways by direct measurement of enzyme activities, analysis of transcriptome data, and developing a theoretical method called circadian response analysis. Contrary to a common assumption, we found that pronounced rhythms in metabolic pathways are often favored by separation rather than alignment in the times of peak activity of key enzymes. This property holds true for a set of metabolic pathway motifs (e.g., linear chains and branching points) and also under the conditions of fast kinetics typical for metabolic reactions. By circadian response analysis of pathway motifs, we determined exact timing separation constraints on rhythmic enzyme activities that allow for substantial rhythms in pathway flux and metabolite concentrations. Direct measurements of circadian enzyme activities in mouse skeletal muscle confirmed that such timing separation occurs in vivo. PMID:28159888
Breinholt, Jesse W; Earl, Chandra; Lemmon, Alan R; Lemmon, Emily Moriarty; Xiao, Lei; Kawahara, Akito Y
2018-01-01
The advent of next-generation sequencing technology has allowed for the collection of large portions of the genome for phylogenetic analysis. Hybrid enrichment and transcriptomics are two techniques that leverage next-generation sequencing and have shown much promise. However, methods for processing hybrid enrichment data are still limited. We developed a pipeline for anchored hybrid enrichment (AHE) read assembly, orthology determination, contamination screening, and data processing for sequences flanking the target "probe" region. We apply this approach to study the phylogeny of butterflies and moths (Lepidoptera), a megadiverse group of more than 157,000 described species with poorly understood deep-level phylogenetic relationships. We introduce a new, 855 locus AHE kit for Lepidoptera phylogenetics and compare resulting trees to those from transcriptomes. The enrichment kit was designed from existing genomes, transcriptomes, and expressed sequence tags and was used to capture sequence data from 54 species from 23 lepidopteran families. Phylogenies estimated from AHE data were largely congruent with trees generated from transcriptomes, with strong support for relationships at all but the deepest taxonomic levels. We combine AHE and transcriptomic data to generate a new Lepidoptera phylogeny, representing 76 exemplar species in 42 families. The tree provides robust support for many relationships, including those among the seven butterfly families. The addition of AHE data to an existing transcriptomic dataset lowers node support along the Lepidoptera backbone, but firmly places taxa with AHE data on the phylogeny. Combining taxa sequenced for AHE with existing transcriptomes and genomes resulted in a tree with strong support for (Calliduloidea + Gelechioidea + Thyridoidea) + (Papilionoidea + Pyraloidea + Macroheterocera). To examine the efficacy of AHE at a shallow taxonomic level, phylogenetic analyses were also conducted on a sister group representing a more recent divergence, the Saturniidae and Sphingidae. Using sequences from both the probe region and the data flanking it nearly doubled the size of the dataset; the resulting trees supported new phylogenetic relationships, especially within the Saturniidae and Sphingidae (e.g., Hemarina derived in the latter). We hope that our data processing pipeline, hybrid enrichment gene set, and approach of combining AHE data with transcriptomes will be useful for the broader systematics community. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Wong, Kim; Navarro, José Fernández; Bergenstråhle, Ludvig; Ståhl, Patrik L; Lundeberg, Joakim
2018-06-01
Spatial Transcriptomics (ST) is a method that combines high-resolution tissue imaging with high-throughput transcriptome sequencing data. These data must be aligned with the images for correct visualization, a process that involves several manual steps. Here we present ST Spot Detector, a web tool that automates and facilitates this alignment through a user-friendly interface. Contact: jose.fernandez.navarro@scilifelab.se. Supplementary data are available at Bioinformatics online.
A high-throughput approach to profile RNA structure.
Delli Ponti, Riccardo; Marti, Stefanie; Armaos, Alexandros; Tartaglia, Gian Gaetano
2017-03-17
Here we introduce the Computational Recognition of Secondary Structure (CROSS) method to calculate the structural profile of an RNA sequence (single- or double-stranded state) at single-nucleotide resolution and without sequence length restrictions. We trained CROSS using data from high-throughput experiments such as Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE; Mouse and HIV transcriptomes) and Parallel Analysis of RNA Structure (PARS; Human and Yeast transcriptomes) as well as high-quality NMR/X-ray structures (PDB database). The algorithm uses primary structure information alone to predict experimental structural profiles with >80% accuracy, showing high performance on large RNAs such as Xist (17 900 nucleotides; Area Under the ROC Curve (AUC) of 0.75 on dimethyl sulfate (DMS) experiments). We integrated CROSS in thermodynamics-based methods to predict secondary structure and observed an increase in their predictive power by up to 30%. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
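The abstract reports performance as an Area Under the ROC Curve (AUC). As a reminder of what that number measures, the minimal Python sketch below computes AUC as the probability that a randomly chosen positive scores above a randomly chosen negative. The function name, the scores, and the labels are hypothetical toy values and are not CROSS output; here 1 might stand for a nucleotide measured as double-stranded and 0 for single-stranded.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positives and negatives.

    scores: predicted propensities (higher means more likely positive).
    labels: 1 for positive, 0 for negative (hypothetical toy encoding).
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


scores = [0.9, 0.8, 0.4, 0.3, 0.2]  # toy predictions, not real data
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.75 for Xist sits halfway between the two.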
International Standards for Genomes, Transcriptomes, and Metagenomes
Mason, Christopher E.; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn
2017-01-01
Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine. PMID:28337071
Single-Cell Semiconductor Sequencing
Kohn, Andrea B.; Moroz, Tatiana P.; Barnes, Jeffrey P.; Netherton, Mandy; Moroz, Leonid L.
2014-01-01
RNA-seq or transcriptome analysis of individual cells and small-cell populations is essential for virtually any biomedical field. It is especially critical for developmental, aging, and cancer biology as well as neuroscience, where the enormous heterogeneity of cells presents a significant methodological and conceptual challenge. Here we present two methods that allow for fast and cost-efficient transcriptome sequencing from ultra-small amounts of tissue or even from individual cells using semiconductor sequencing technology (Ion Torrent, Life Technologies). The first method is a reduced representation sequencing which maximizes capture of RNAs and preserves transcripts’ directionality. The second, a template-switch protocol, is designed for small mammalian neurons. Both protocols, from cell/tissue isolation to final sequence data, take up to 4 days. The efficiency of these protocols has been validated with single hippocampal neurons and various invertebrate tissues including individually identified neurons within a simpler memory-forming circuit of Aplysia californica and early (1-, 2-, 4-, 8-cells) embryonic and developmental stages from basal metazoans. PMID:23929110
Whitney, T J; Gardner, D G; Mott, M L; Brandon, M
2010-03-09
The unusual life cycle of Dictyostelium discoideum, in which an extra-cellular stressor such as starvation induces the development of a multicellular fruiting body consisting of stalk cells and spores from a culture of identical amoebae, provides an excellent model for investigating the molecular control of differentiation and the transition from single- to multi-cellular life, a key transition in development. We utilized serial analysis of gene expression (SAGE), a molecular method that is unbiased by dependence on previously identified genes, to obtain a transcriptome from a high-density culture of amoebae, in order to examine the transition to multi-cellular development. The SAGE method provides relative expression levels, which allows us to rank order the expressed genes. We found that a large number of ribosomal proteins were expressed at high levels, while various components of the proteasome were expressed at low levels. The only identifiable transmembrane signaling system components expressed in amoebae are related to quorum sensing, and their expression levels were relatively low. The most highly expressed gene in the amoeba transcriptome, dutA untranslated RNA, is a molecule with unknown function that may serve as an inhibitor of translation. These results suggest that high-density amoebae have not initiated development, and they also suggest a mechanism by which the transition into the development program is controlled.
Parreira, Valeria R; Russell, Kay; Athanasiadou, Spiridoula; Prescott, John F
2016-08-12
Necrotic enteritis (NE) caused by netB-positive type A Clostridium perfringens is an important bacterial disease of poultry. Through its complex regulatory system, C. perfringens orchestrates the expression of a collection of toxins and extracellular enzymes that are crucial for the development of the disease; environmental conditions play an important role in their regulation. In this study, and for the first time, global transcriptomic analysis was performed on ligated intestinal loops in chickens colonized with a netB-positive C. perfringens strain, as well as the same strain propagated in vitro under various nutritional and environmental conditions. Analysis of the respective pathogen transcriptomes revealed up to 673 genes that were significantly expressed in vivo. Gene expression profiles in vivo were most similar to those of C. perfringens grown in nutritionally-deprived conditions. Taken together, our results suggest bacterial transcriptome responses during the early stages of adaptation to, and colonization of, the chicken intestine. Our work also reveals how netB-positive C. perfringens reacts to different environmental conditions including those in the chicken intestine.
Srivastava, Smriti; Singh, Rajesh K.; Pathak, Garima; Goel, Ridhi; Asif, Mehar Hasan; Sane, Aniruddha P.; Sane, Vidhu A.
2016-01-01
Ripening in mango is under a complex control of ethylene. In an effort to understand the complex spatio-temporal control of ripening we have made use of a popular N. Indian variety, “Dashehari”. This variety ripens from the stone inside towards the peel outside and forms jelly in the pulp in ripe fruits. Through a combination of 454 and Illumina sequencing, a transcriptomic analysis of gene expression from unripe and midripe stages has been performed in triplicate. Overall 74,312 unique transcripts with ≥1 FPKM were obtained. The transcripts related to 127 pathways were identified in the “Dashehari” mango transcriptome by the KEGG analysis. These pathways ranged from detoxification and ethylene biosynthesis to carbon metabolism and aromatic amino acid degradation. The transcriptome study reveals differences not only in expression of softening-associated genes but also in those that govern ethylene biosynthesis and other nutritional characteristics. This study could help to develop ripening-related markers for selective breeding to reduce the problems of excess jelly formation during softening in the “Dashehari” variety. PMID:27586495
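The "≥1 FPKM" cutoff mentioned above refers to fragments per kilobase of transcript per million mapped fragments. As a minimal sketch of how such a filter works (not the authors' pipeline; the function names, transcript identifiers, and counts below are hypothetical), the standard definition can be applied as follows.

```python
from typing import Dict, Set


def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def expressed_transcripts(
    counts: Dict[str, int], lengths_bp: Dict[str, int], threshold: float = 1.0
) -> Set[str]:
    """Return the transcripts whose FPKM reaches the threshold."""
    total = sum(counts.values())
    if total == 0:
        return set()
    return {t for t, c in counts.items() if fpkm(c, lengths_bp[t], total) >= threshold}


# Hypothetical toy library: three transcripts of 1 kb each.
toy_counts = {"tx_a": 100, "tx_b": 0, "tx_c": 900}
toy_lengths = {"tx_a": 1000, "tx_b": 1000, "tx_c": 1000}
print(sorted(expressed_transcripts(toy_counts, toy_lengths)))
```

Here the library size is taken as the sum of the supplied counts, which is an assumption of the sketch; in practice it is usually the total number of mapped fragments reported by the aligner.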
Radhakrishna, Auji; Dwivedi, Krishna Kumar; Srivastava, Manoj Kumar; Roy, A K; Malaviya, D R; Kaushal, P
2018-06-01
Guinea grass (Panicum maximum Jacq.), an important fodder crop of humid and sub-humid tropical regions, reproduces through apomixis, a method of clonal propagation through seeds. Lack of knowledge of the genetic and molecular control of this phenomenon has hindered the genetic improvement of this crop. The dataset provided here represents the first RNA-Seq based assembly and analysis of florets at pre-meiotic stage from the apomictic and sexual genotypes of guinea grass. The raw sequence files in FASTQ format were deposited in the NCBI SRA database with accession number SRP115883. A total of 24.8 Gb raw sequence data, corresponding to 179,665,827 raw reads, was obtained by paired-end sequencing. We used Trinity for de novo assembly and identified 57,647 transcripts in the sexual type and 49,093 transcripts in the apomictic type. This transcriptome data will be useful for identification and comparative analysis of genes regulating the mode of reproduction in grasses.
Gene expression profiling of human breast tissue samples using SAGE-Seq.
Wu, Zhenhua Jeremy; Meyer, Clifford A; Choudhury, Sibgat; Shipitsin, Michail; Maruyama, Reo; Bessarabova, Marina; Nikolskaya, Tatiana; Sukumar, Saraswati; Schwartzman, Armin; Liu, Jun S; Polyak, Kornelia; Liu, X Shirley
2010-12-01
We present a powerful application of ultra high-throughput sequencing, SAGE-Seq, for the accurate quantification of normal and neoplastic mammary epithelial cell transcriptomes. We develop data analysis pipelines that allow the mapping of sense and antisense strands of mitochondrial and RefSeq genes, the normalization between libraries, and the identification of differentially expressed genes. We find that the diversity of cancer transcriptomes is significantly higher than that of normal cells. Our analysis indicates that transcript discovery plateaus at 10 million reads/sample, and suggests a minimum desired sequencing depth around five million reads. Comparison of SAGE-Seq and traditional SAGE on normal and cancerous breast tissues reveals higher sensitivity of SAGE-Seq to detect less-abundant genes, including those encoding for known breast cancer-related transcription factors and G protein-coupled receptors (GPCRs). SAGE-Seq is able to identify genes and pathways abnormally activated in breast cancer that traditional SAGE failed to call. SAGE-Seq is a powerful method for the identification of biomarkers and therapeutic targets in human disease.
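The statement that transcript discovery plateaus with depth is usually checked with a saturation (rarefaction) curve: reads are subsampled to increasing depths and the number of distinct transcripts seen at each depth is counted. The sketch below illustrates the idea under the assumption that each read is already labeled with the transcript it maps to; it is not the SAGE-Seq analysis pipeline, and the function name and toy reads are hypothetical.

```python
import random
from typing import List, Sequence


def discovery_curve(reads: Sequence[str], depths: Sequence[int], seed: int = 0) -> List[int]:
    """Count distinct transcripts seen in random subsamples of increasing size.

    The reads are shuffled once and prefixes of the shuffled list are used,
    so the returned counts (one per depth, in ascending depth order) never
    decrease.
    """
    rng = random.Random(seed)
    shuffled = list(reads)
    rng.shuffle(shuffled)
    seen = set()
    curve = []
    position = 0
    for depth in sorted(depths):
        depth = min(depth, len(shuffled))  # cannot sample more reads than exist
        while position < depth:
            seen.add(shuffled[position])
            position += 1
        curve.append(len(seen))
    return curve


# Hypothetical toy library: one abundant, one moderate, and one rare transcript.
toy_reads = ["g1"] * 50 + ["g2"] * 30 + ["g3"] * 5
print(discovery_curve(toy_reads, depths=[10, 40, 85]))
```

A curve that stops rising between two successive depths suggests that additional sequencing is yielding few new transcripts, which is the sense in which a plateau is described above.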
Horizontal Gene Transfer is a Significant Driver of Gene Innovation in Dinoflagellates
Wisecaver, Jennifer H.; Brosnahan, Michael L.; Hackett, Jeremiah D.
2013-01-01
The dinoflagellates are an evolutionarily and ecologically important group of microbial eukaryotes. Previous work suggests that horizontal gene transfer (HGT) is an important source of gene innovation in these organisms. However, dinoflagellate genomes are notoriously large and complex, making genomic investigation of this phenomenon impractical with currently available sequencing technology. Fortunately, de novo transcriptome sequencing and assembly provides an alternative approach for investigating HGT. We sequenced the transcriptome of the dinoflagellate Alexandrium tamarense Group IV to investigate how HGT has contributed to gene innovation in this group. Our comprehensive A. tamarense Group IV gene set was compared with those of 16 other eukaryotic genomes. Ancestral gene content reconstruction of ortholog groups shows that A. tamarense Group IV has the largest number of gene families gained (314–1,563 depending on inference method) relative to all other organisms in the analysis (0–782). Phylogenomic analysis indicates that genes horizontally acquired from bacteria are a significant proportion of this gene influx, as are genes transferred from other eukaryotes either through HGT or endosymbiosis. The dinoflagellates also display curious cases of gene loss associated with mitochondrial metabolism including the entire Complex I of oxidative phosphorylation. Some of these missing genes have been functionally replaced by bacterial and eukaryotic xenologs. The transcriptome of A. tamarense Group IV lends strong support to a growing body of evidence that dinoflagellate genomes are extraordinarily impacted by HGT. PMID:24259313
Single-nucleus RNA-seq of differentiating human myoblasts reveals the extent of fate heterogeneity
Zeng, Weihua; Jiang, Shan; Kong, Xiangduo; El-Ali, Nicole; Ball, Alexander R.; Ma, Christopher I-Hsing; Hashimoto, Naohiro; Yokomori, Kyoko; Mortazavi, Ali
2016-01-01
Myoblasts are precursor skeletal muscle cells that differentiate into fused, multinucleated myotubes. Current single-cell microfluidic methods are not optimized for capturing very large, multinucleated cells such as myotubes. To circumvent the problem, we performed single-nucleus transcriptome analysis. Using immortalized human myoblasts, we performed RNA-seq analysis of single cells (scRNA-seq) and single nuclei (snRNA-seq) and found them comparable, with a distinct enrichment for long non-coding RNAs (lncRNAs) in snRNA-seq. We then compared snRNA-seq of myoblasts before and after differentiation. We observed the presence of mononucleated cells (MNCs) that remained unfused and analyzed them separately from multi-nucleated myotubes. We found that while the transcriptome profiles of myoblast and myotube nuclei are relatively homogeneous, MNC nuclei exhibited significant heterogeneity, with the majority of them adopting a distinct mesenchymal state. Primary transcripts for microRNAs (miRNAs) that participate in skeletal muscle differentiation were among the most differentially expressed lncRNAs, which we validated using NanoString. Our study demonstrates that snRNA-seq provides reliable transcriptome quantification for cells that are otherwise not amenable to current single-cell platforms. Our results further indicate that snRNA-seq has a unique advantage in capturing nucleus-enriched lncRNAs and miRNA precursors that are useful in mapping and monitoring differential miRNA expression during cellular differentiation. PMID:27566152
Brownian model of transcriptome evolution and phylogenetic network visualization between tissues.
Gu, Xun; Ruan, Hang; Su, Zhixi; Zou, Yangyun
2017-09-01
While phylogenetic analysis of transcriptomes of the same tissue is usually congruent with the species tree, controversy emerges when multiple tissues are included, that is, whether species from the same tissue are clustered together, or different tissues from the same species are clustered together. Recent studies have suggested that a phylogenetic network approach may shed some light on our understanding of multi-tissue transcriptome evolution; yet the underlying evolutionary mechanism remains unclear. In this paper we develop a Brownian-based model of transcriptome evolution under the phylogenetic network that can statistically distinguish between the patterns of species-clustering and tissue-clustering. Our model can be used as a null hypothesis (neutral transcriptome evolution) for testing any correlation in tissue evolution, can be applied to cancer transcriptome evolution to study whether two tumors of an individual appeared independently or via metastasis, and can be useful to detect convergent evolution at the transcriptional level. Copyright © 2017. Published by Elsevier Inc.
Jiménez-Guerrero, Irene; Acosta-Jurado, Sebastián; Navarro-Gómez, Pilar; López-Baena, Francisco Javier; Ollero, Francisco Javier
2017-01-01
Simultaneous quantification of transcripts of the whole bacterial genome allows the analysis of the global transcriptional response under changing conditions. RNA-seq and microarrays are the most used techniques to measure these transcriptomic changes, and both complement each other in transcriptome profiling. In this review, we exhaustively compiled the symbiosis-related transcriptomic reports (microarrays and RNA sequencing) carried out hitherto in rhizobia. This review is especially focused on transcriptomic changes that take place when five rhizobial species, Bradyrhizobium japonicum (=diazoefficiens) USDA 110, Rhizobium leguminosarum biovar viciae 3841, Rhizobium tropici CIAT 899, Sinorhizobium (=Ensifer) meliloti 1021 and S. fredii HH103, recognize inducing flavonoids, plant-exuded phenolic compounds that activate the biosynthesis and export of Nod factors (NF) in all analysed rhizobia. Interestingly, our global transcriptomic comparison also indicates that each rhizobial species possesses its own arsenal of molecular weapons accompanying the set of NF in order to establish a successful interaction with host legumes. PMID:29267254
[Applications of meta-analysis in multi-omics].
Han, Mingfei; Zhu, Yunping
2014-07-01
As a statistical method integrating multi-features and multi-data, meta-analysis was introduced to the field of life science in the 1990s. With the rapid advances in high-throughput technologies, life omics, the core of which are genomics, transcriptomics and proteomics, is becoming the new hot spot of life science. Although the fast output of massive data has promoted the development of omics study, it results in excessive data that are difficult to integrate systematically. In this case, meta-analysis is frequently applied to analyze different types of data and is improved continuously. Here, we first summarize the representative meta-analysis methods systematically, and then study the current applications of meta-analysis in various omics fields, finally we discuss the still-existing problems and the future development of meta-analysis.
Ubrihien, Rodney P; Ezaz, Tariq; Taylor, Anne M; Stevens, Mark M; Krikowa, Frank; Foster, Simon; Maher, William A
2017-04-01
This study describes the transcriptomic response of the Australian endemic freshwater gastropod Isidorella newcombi exposed to 80 ± 1 μg/L of copper for 3 days. Analysis of copper tissue concentration, lysosomal membrane destabilisation and RNA-seq were conducted. Copper tissue concentrations confirmed that copper was bioaccumulated by the snails. Increased lysosomal membrane destabilisation in the copper-exposed snails indicated that the snails were stressed as a result of the exposure. Both copper tissue concentrations and lysosomal destabilisation were significantly greater in snails exposed to copper. In order to interpret the RNA-seq data from an ecotoxicological perspective an integrated biological response model was developed that grouped transcriptomic responses into those associated with copper transport and storage, survival mechanisms and cell death. A conceptual model of expected transcriptomic changes resulting from the copper exposure was developed as a basis to assess transcriptomic responses. Transcriptomic changes were evident at all three levels of the integrated biological response model. Despite lacking statistical significance, increased expression of the gene encoding copper transporting ATPase provided an indication of increased internal transport of copper. Increased expression of genes associated with endocytosis is associated with increased transport of copper to the lysosome for storage in a detoxified form. Survival mechanisms included metabolic depression and processes associated with cellular repair and recycling. There was transcriptomic evidence of increased cell death by apoptosis in the copper-exposed organisms. Increased apoptosis is supported by the increase in lysosomal membrane destabilisation in the copper-exposed snails. Transcriptomic changes relating to apoptosis, phagocytosis, protein degradation and the lysosome were evident and these processes can be linked to the degradation of post-apoptotic debris.
The study identified contaminant specific transcriptomic markers as well as markers of general stress. From an ecotoxicological perspective, the use of a framework to group transcriptomic responses into those associated with copper transport, survival and cell death assisted with the complex process of interpretation of RNA-seq data. The broad adoption of such a framework in ecotoxicology studies would assist in comparison between studies and the identification of reliable transcriptomic markers of contaminant exposure and response. Copyright © 2017 Elsevier B.V. All rights reserved.
Rai, Amit; Yamazaki, Mami; Takahashi, Hiroki; Nakamura, Michimi; Kojoma, Mareshige; Suzuki, Hideyuki; Saito, Kazuki
2016-01-01
The Panax genus has been a source of natural medicine, benefitting human health over the ages, among which Panax japonicus represents an important species. Our understanding of several key pathways and enzymes involved in the biosynthesis of ginsenosides, a pharmacologically active class of metabolites and major chemical constituents of the rhizome extracts from the Panax species, is limited. Limited genomic information and a lack of studies on comparative transcriptomics across the Panax species have restricted our understanding of the biosynthetic mechanisms of these and many other important classes of phytochemicals. Herein, we describe Illumina-based RNA sequencing analysis to characterize the transcriptome and expression profiles of genes expressed in the five tissues of P. japonicus, and its comparison with other Panax species. RNA sequencing and de novo transcriptome assembly for P. japonicus resulted in a total of 135,235 unigenes with 78,794 (58.24%) unigenes being annotated using NCBI-nr database. Transcriptome profiling and gene ontology enrichment analysis for five tissues of P. japonicus showed that, although overall processes were evenly conserved across all tissues, each tissue was characterized by several unique unigenes, with the leaves showing the most unique unigenes among the tissues studied. A comparative analysis of the P. japonicus transcriptome assembly with publicly available transcripts from other Panax species, namely, P. ginseng, P. notoginseng, and P. quinquefolius also displayed high sequence similarity across all Panax species, with P. japonicus showing highest similarity with P. ginseng. Annotation of P. japonicus transcriptome resulted in the identification of putative genes encoding all enzymes from the triterpene backbone biosynthetic pathways, and identified 24 and 48 unigenes annotated as cytochrome P450 (CYP) and glycosyltransferases (GT), respectively.
These CYP- and GT-annotated unigenes were conserved across all Panax species and co-expressed with other transcripts involved in the triterpenoid backbone biosynthesis pathways. Unigenes identified in this study represent strong candidates for being involved in triterpenoid saponin biosynthesis, and can serve as a basis for future validation studies. PMID:27148308
Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes
Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise
2009-01-01
Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high level of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics.
Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885
Multivariate inference of pathway activity in host immunity and response to therapeutics
Goel, Gautam; Conway, Kara L.; Jaeger, Martin; Netea, Mihai G.; Xavier, Ramnik J.
2014-01-01
Developing a quantitative view of how biological pathways are regulated in response to environmental factors is central for understanding of disease phenotypes. We present a computational framework, named Multivariate Inference of Pathway Activity (MIPA), which quantifies degree of activity induced in a biological pathway by computing five distinct measures from transcriptomic profiles of its member genes. Statistical significance of inferred activity is examined using multiple independent self-contained tests followed by a competitive analysis. The method incorporates a new algorithm to identify a subset of genes that may regulate the extent of activity induced in a pathway. We present an in-depth evaluation of specificity, robustness, and reproducibility of our method. We benchmarked MIPA's false positive rate at less than 1%. Using transcriptomic profiles representing distinct physiological and disease states, we illustrate applicability of our method in (i) identifying gene–gene interactions in autophagy-dependent response to Salmonella infection, (ii) uncovering gene–environment interactions in host response to bacterial and viral pathogens and (iii) identifying driver genes and processes that contribute to wound healing and response to anti-TNFα therapy. We provide relevant experimental validation that corroborates the accuracy and advantage of our method. PMID:25147207
Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan
2013-01-01
Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the human 86 (including isoforms) known structural nDNA-encoded OXPHOS subunits. Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133
Aging-like Changes in the Transcriptome of Irradiated Microglia
Li, Matthew D.; Burns, Terry C.; Kumar, Sunny; Morgan, Alexander A.; Sloan, Steven A.; Palmer, Theo D.
2014-01-01
Whole brain irradiation remains important in the management of brain tumors. Although necessary for improving survival outcomes, cranial irradiation also results in cognitive decline in long-term survivors. A chronic inflammatory state characterized by microglial activation has been implicated in radiation-induced brain injury. We here provide the first comprehensive transcriptional profile of irradiated microglia. Fluorescence-activated cell sorting (FACS) was used to isolate CD11b+ microglia from the hippocampi of C57BL/6 and Balb/c mice 1 month after 10 Gy cranial irradiation. Affymetrix gene expression profiles were evaluated using linear modeling and rank product analyses. One month after irradiation, a conserved irradiation signature across strains was identified, comprising 448 and 85 differentially up- and down-regulated genes, respectively. Gene set enrichment analysis (GSEA) demonstrated enrichment for inflammation, including M1 macrophage-associated genes, but also an unexpected enrichment for extracellular matrix and blood coagulation-related gene sets, in contrast to previously described microglial states. Weighted gene co-expression network analysis (WGCNA) confirmed these findings and further revealed alterations in mitochondrial function. The RNA-seq transcriptome of microglia 24 h post-radiation proved similar to the 1-month transcriptome, but additionally featured alterations in apoptotic and lysosomal gene expression. Re-analysis of published aging mouse microglia transcriptome data demonstrated striking similarity to the 1-month irradiated microglia transcriptome, suggesting that shared mechanisms may underlie aging and chronic irradiation-induced cognitive decline. PMID:25690519
Costa, Fabrizio; Alba, Rob; Schouten, Henk; Soglio, Valeria; Gianfranceschi, Luca; Serra, Sara; Musacchi, Stefano; Sansavini, Silviero; Costa, Guglielmo; Fei, Zhangjun; Giovannoni, James
2010-10-25
Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality, and an understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-methylcyclopropene. To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarrays to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter, using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening.
Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species.
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).
Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo
2017-10-05
Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. The transcriptomic analysis assembled gene-related information involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box, were shown to be up-regulated in floral transition and might promote the progression of flowering. In addition, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in C. sinensis. Our data also provide a useful database for further research on tea and other plant species. Copyright © 2017 Elsevier B.V. All rights reserved.
Moskalev, Alexey А; Kudryavtseva, Anna V; Graphodatsky, Alexander S; Beklemisheva, Violetta R; Serdyukova, Natalya A; Krutovsky, Konstantin V; Sharov, Vadim V; Kulakovskiy, Ivan V; Lando, Andrey S; Kasianov, Artem S; Kuzmin, Dmitry A; Putintseva, Yuliya A; Feranchuk, Sergey I; Shaposhnikov, Mikhail V; Fraifeld, Vadim E; Toren, Dmitri; Snezhkina, Anastasia V; Sitnik, Vasily V
2017-12-28
The gray whale, Eschrichtius robustus (E. robustus), is the sole member of the family Eschrichtiidae, which is considered to be the most primitive in the class Cetacea. The gray whale is often described as a "living fossil". It is adapted to extreme marine conditions and has a high life expectancy (77 years). The assembly of a gray whale genome and transcriptome will enable further studies of whale evolution, longevity, and resistance to extreme environments. In this work, we report the first de novo assembly and primary analysis of the E. robustus genome and transcriptome based on kidney and liver samples. The presented draft genome assembly is 55% complete in terms of total genome length, but only 24% complete in terms of the BUSCO complete gene groups, although 10,895 genes were identified. Transcriptome annotation and comparison with other whale species revealed robust expression of DNA repair and hypoxia-response genes, which is expected for whales. This preliminary study of the gray whale genome and transcriptome provides new data to better understand whale evolution and the mechanisms of their adaptation to hypoxic conditions.
Ponce, Dalia; Brinkman, Diane L; Potriquet, Jeremy; Mulvenna, Jason
2016-04-05
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms.
Diray-Arce, Joann; Clement, Mark; Gul, Bilquees; Khan, M Ajmal; Nielsen, Brent L
2015-05-06
Improvement of crop production is needed to feed the growing world population as the amount and quality of agricultural land decrease and soil salinity increases. This has stimulated research on salt tolerance in plants. Most crops tolerate a limited amount of salt to survive and produce biomass, while halophytes (salt-tolerant plants) have the ability to grow with saline water utilizing specific biochemical mechanisms. However, little is known about the genes involved in salt tolerance. We have characterized the transcriptome of Suaeda fruticosa, a halophyte that has the ability to sequester salts in its leaves. Suaeda fruticosa is an annual shrub in the family Chenopodiaceae found in coastal and inland regions of Pakistan and Mediterranean shores. This plant is an obligate halophyte that grows optimally at 200-400 mM NaCl and can grow at up to 1000 mM NaCl. High-throughput sequencing was used to gain an understanding of the genes involved in the salt tolerance mechanism. De novo assembly of the transcriptome and analysis has allowed identification of differentially expressed and unique genes present in this non-conventional crop. Twelve sequencing libraries prepared from control (0 mM NaCl treated) and optimum (300 mM NaCl treated) plants were sequenced using Illumina HiSeq 2000 to investigate differential gene expression between shoots and roots of Suaeda fruticosa. The transcriptome was assembled de novo using Velvet and Oases k-45 and clustered using CD-HIT-EST. There are 54,526 unigenes; among these, 475 genes are downregulated and 44 are upregulated when samples from plants grown under optimal salt are compared with those grown without salt. BLAST analysis identified the differentially expressed genes, which were categorized in gene ontology terms and their pathways. This work has identified potential genes involved in salt tolerance in Suaeda fruticosa, and has provided an outline of tools to use for de novo transcriptome analysis.
The assemblies that were used provide coverage of a considerable proportion of the transcriptome, which allows analysis of differential gene expression and identification of genes that may be involved in salt tolerance. The transcriptome may serve as a reference sequence for study of other succulent halophytes.
Analysis, annotation, and profiling of the oat seed transcriptome
USDA-ARS's Scientific Manuscript database
Novel high-throughput next generation sequencing (NGS) technologies are providing opportunities to explore genomes and transcriptomes in a cost-effective manner. To construct a gene expression atlas of developing oat (Avena sativa) seeds, two software packages specifically designed for RNA-seq (Trin...
A comprehensive analysis of the human placenta transcriptome
USDA-ARS's Scientific Manuscript database
As the conduit for nutrients and growth signals, the placenta is critical to establishing an environment sufficient for fetal growth and development. To better understand the mechanisms regulating placental development and gene expression, we characterized the transcriptome of term placenta from 20 ...
Chloroplast microsatellite markers for Artocarpus (Moraceae) developed from transcriptome sequences
USDA-ARS's Scientific Manuscript database
Premise of the study: Chloroplast microsatellite loci were characterized from transcriptomes of Artocarpus (A.) altilis (breadfruit) and A. camansi (breadnut). They were tested in A. odoratissimus (terap) and A. altilis and evaluated in silico for two congeners. Methods and Results: 15 simple seque...
The present study investigated whether combining targeted analytical chemistry methods with unsupervised, data-rich methodologies (i.e. transcriptomics) can be utilized to evaluate relative contributions of wastewater treatment plant (WWTP) effluents to biological effects. The...
Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088
Zajaczkowski, Esmi L; Zhao, Qiong-Yi; Zhang, Zong Hong; Li, Xiang; Wei, Wei; Marshall, Paul R; Leighton, Laura J; Nainar, Sarah; Feng, Chao; Spitale, Robert C; Bredy, Timothy W
2018-06-15
Transcriptome-wide expression profiling of neurons has provided important insights into the underlying molecular mechanisms and gene expression patterns that transpire during learning and memory formation. However, there is a paucity of tools for profiling stimulus-induced RNA within specific neuronal cell populations. A bioorthogonal method to chemically label nascent (i.e., newly transcribed) RNA in a cell-type-specific and temporally controlled manner, which is also amenable to bioconjugation via click chemistry, was recently developed and optimized within conventional immortalized cell lines. However, its value within a more fragile and complicated cellular system such as neurons, as well as for transcriptome-wide expression profiling, has yet to be demonstrated. Here, we report the visualization and sequencing of activity-dependent nascent RNA derived from neurons using this labeling method. This work has important implications for improving transcriptome-wide expression profiling and visualization of nascent RNA in neurons, which has the potential to provide valuable insights into the mechanisms underlying neural plasticity, learning, and memory.
Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.
Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning
Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei
2015-01-01
Background Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. Methodology/Principal Findings We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. 
Our data constitute an important resource for further functional studies to prevent explant browning. PMID:25874455
Lu, Taofeng; Sun, Yujiao; Ma, Qin; Zhu, Minghao; Liu, Dan; Ma, Jianzhang; Ma, Yuehui; Chen, Hongyan; Guan, Weijun
2016-12-01
The Siberian tiger, Panthera tigris altaica, is an endangered species, and much more work is needed to protect this species, which is still vulnerable to extinction. Conservation efforts may be supported by the genetic assessment of wild populations, for which highly specific microsatellite markers are required. However, only a limited amount of genetic sequence data is available for this species. To identify the genes involved in the lung transcriptome and to develop additional simple sequence repeat (SSR) markers for the Siberian tiger, we used high-throughput RNA-Seq to characterize the Siberian tiger transcriptome in lung tissue (designated 'PTA-lung') and a pooled tissue sample (designated 'PTA'). Approximately 47.5% (33,187/69,836) of the lung transcriptome was annotated in four public databases (Nr, Swiss-Prot, KEGG, and COG). The annotated genes formed a potential pool for gene identification in the tiger. An analysis of the genes differentially expressed in the PTA-lung and PTA samples revealed that the tiger may have suffered a series of diseases before death. In total, 1062 non-redundant SSRs were identified in the Siberian tiger transcriptome. Forty-three primer pairs were randomly selected for amplification reactions, and 26 of the 43 pairs were also used to evaluate the levels of genetic polymorphism. Fourteen primer pairs (32.56%) amplified products that were polymorphic in size in P. tigris altaica. In conclusion, the transcriptome sequences will provide a valuable genomic resource for genetic research, and these new SSR markers comprise a reasonable number of loci for the genetic analysis of wild and captive populations of P. tigris altaica.
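The abstract does not say which tool or thresholds were used to call SSRs, so the following is only a minimal sketch of how perfect tandem repeats can be located in assembled transcripts. The function name `find_ssrs` and the minimum copy numbers in `MIN_REPEATS` are hypothetical choices for illustration, not the authors' settings.

```python
import re

# Hypothetical thresholds (an assumption, not taken from the study):
# minimum number of tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat in seq."""
    seq = seq.upper()
    hits = []
    for size, copies in sorted(min_repeats.items()):
        # The group captures one motif; the backreference demands that it
        # is followed by at least (copies - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated ("ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return hits


# Toy sequence containing (AT)7 and (CAG)5.
example = "GGC" + "AT" * 7 + "CCGTA" + "CAG" * 5 + "TT"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

Start positions are 0-based. A production pipeline would also handle compound and imperfect repeats and would design primers from the flanking sequence, which this sketch does not attempt.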
Divina, Petr; Vlcek, Cestmír; Strnad, Petr; Paces, Václav; Forejt, Jirí
2005-03-05
We generated the gene expression profile of the total testis from adult male C57BL/6J mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76,854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues were performed and the theory of male-biased gene accumulation on the X chromosome was tested. We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Our results provide new evidence in favor of the theory of male-biased gene accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells.
Divina, Petr; Vlček, Čestmír; Strnad, Petr; Pačes, Václav; Forejt, Jiří
2005-01-01
Background We generated the gene expression profile of the total testis from adult male C57BL/6J mice using serial analysis of gene expression (SAGE). Two high-quality SAGE libraries containing a total of 76,854 tags were constructed. An extensive bioinformatic analysis and comparison of SAGE transcriptomes of the total testis, testicular somatic cells and other mouse tissues were performed and the theory of male-biased gene accumulation on the X chromosome was tested. Results We sorted out 829 genes predominantly expressed from the germinal part and 944 genes from the somatic part of the testis. The genes preferentially and specifically expressed in total testis and testicular somatic cells were identified by comparing the testis SAGE transcriptomes to the available transcriptomes of seven non-testis tissues. We uncovered chromosomal clusters of adjacent genes with preferential expression in total testis and testicular somatic cells by a genome-wide search and found that the clusters encompassed a significantly higher number of genes than expected by chance. We observed a significant 3.2-fold enrichment of the proportion of X-linked genes specific for testicular somatic cells, while the proportions of X-linked genes specific for total testis and for other tissues were comparable. In contrast to the tissue-specific genes, an under-representation of X-linked genes in the total testis transcriptome but not in the transcriptomes of testicular somatic cells and other tissues was detected. Conclusion Our results provide new evidence in favor of the theory of male-biased gene accumulation on the X chromosome in testicular somatic cells and indicate the opposite action of the meiotic X-inactivation in testicular germ cells. PMID:15748293
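The abstract reports a fold enrichment of X-linked genes but not the statistical test behind it. As a hedged illustration only, the sketch below computes a fold enrichment and a one-sided hypergeometric tail probability for a gene list. The function names and the counts in the usage example are invented for demonstration and are not data from the study.

```python
from math import comb


def fold_enrichment(k, n, K, N):
    """Ratio of the X-linked fraction in the list (k of n)
    to the X-linked fraction in the background (K of N)."""
    return (k / n) / (K / N)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which are X-linked."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(n, K) + 1))
    return favourable / total


# Toy numbers (hypothetical): 1000 background genes, 40 X-linked;
# a tissue-specific list of 50 genes, 6 of them X-linked.
fold = fold_enrichment(6, 50, 40, 1000)
p_value = hypergeom_upper_tail(6, 50, 40, 1000)
print(f"fold enrichment = {fold:.2f}, one-sided p = {p_value:.4g}")
```

Exact integer binomials from `math.comb` avoid floating-point underflow for the modest list sizes typical of such tests. Under-representation, as described for the total testis transcriptome, would use the lower tail instead.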
Selenium supplementation prevents metabolic and transcriptomic responses to cadmium in mouse lung.
Hu, Xin; Chandler, Joshua D; Fernandes, Jolyn; Orr, Michael L; Hao, Li; Uppal, Karan; Neujahr, David C; Jones, Dean P; Go, Young-Mi
2018-04-12
The protective effect of selenium (Se) on cadmium (Cd) toxicity is well documented, but underlying mechanisms are unclear. Male mice fed standard diet were given Cd (CdCl2, 18 μmol/L) in drinking water with or without Se (Na2SeO4, 20 μmol/L) for 16 weeks. Lungs were analyzed for Cd concentration, transcriptomics and metabolomics. Data were analyzed with biostatistics, bioinformatics, pathway enrichment analysis, and combined transcriptome-metabolome-wide association study. Mice treated with Cd had higher lung Cd content (1.7 ± 0.4 pmol/mg protein) than control mice (0.8 ± 0.3 pmol/mg protein) or mice treated with Cd and Se (0.4 ± 0.1 pmol/mg protein). Gene set enrichment analysis of transcriptomics data showed that Se prevented Cd effects on inflammatory and myogenesis genes and diminished Cd effects on several other pathways. Similarly, Se prevented Cd-disrupted metabolic pathways in amino acid metabolism and urea cycle. Integrated transcriptome and metabolome network analysis showed that Cd treatment had a network structure with fewer gene-metabolite clusters compared to control. Centrality measurements showed that Se counteracted changes in a group of Cd-responsive genes, including Zdhhc11 (protein-cysteine S-palmitoyltransferase) and Ighg1 (immunoglobulin heavy constant gamma-1), and associated changes in metabolite concentrations. Co-administration of Se with Cd prevented the Cd increase in lung and prevented Cd-associated pathway and network responses of the transcriptome and metabolome. Se protection against Cd toxicity in lung involves complex systems responses. Environmental Cd stimulates proinflammatory and profibrotic signaling. The present results indicate that dietary or supplemental Se could be useful to mitigate Cd toxicity. Published by Elsevier B.V.
Transcriptome of interstitial cells of Cajal reveals unique and selective gene signatures
Park, Paul J.; Fuchs, Robert; Wei, Lai; Jorgensen, Brian G.; Redelman, Doug; Ward, Sean M.; Sanders, Kenton M.
2017-01-01
Transcriptome-scale data can reveal essential clues into understanding the underlying molecular mechanisms behind specific cellular functions and biological processes. Transcriptomics is a continually growing field of research utilized in biomarker discovery. The transcriptomic profile of interstitial cells of Cajal (ICC), which serve as slow-wave electrical pacemakers for gastrointestinal (GI) smooth muscle, has yet to be uncovered. Using copGFP-labeled ICC mice and flow cytometry, we isolated ICC populations from the murine small intestine and colon and obtained their transcriptomes. In analyzing the transcriptome, we identified a unique set of ICC-restricted markers including transcription factors, epigenetic enzymes/regulators, growth factors, receptors, protein kinases/phosphatases, and ion channels/transporters. This analysis provides new and unique insights into the cellular and biological functions of ICC in GI physiology. Additionally, we constructed an interactive ICC genome browser (http://med.unr.edu/physio/transcriptome) based on the UCSC genome database. To our knowledge, this is the first online resource that provides a comprehensive library of all known genetic transcripts expressed in primary ICC. Our genome browser offers a new perspective into the alternative expression of genes in ICC and provides a valuable reference for future functional studies. PMID:28426719
Musser, Jacob M; Wagner, Günter P
2015-11-01
We elaborate a framework for investigating the evolutionary history of morphological characters. We argue that morphological character trees generated by phylogenetic analysis of transcriptomes provide a useful tool for identifying causal gene expression differences underlying the development and evolution of morphological characters. They also enable rigorous testing of different models of morphological character evolution and origination, including the hypothesis that characters originate via divergence of repeated ancestral characters. Finally, morphological character trees provide evidence that character transcriptomes undergo concerted evolution. We argue that concerted evolution of transcriptomes can explain the so-called "species signal" found in several recent comparative transcriptome studies. The species signal is the phenomenon that transcriptomes cluster by species rather than character type, even though the characters are older than the respective species. We suggest the species signal is a natural consequence of concerted gene expression evolution resulting from mutations that alter gene regulatory network interactions shared by the characters under comparison. Thus, character trees generated from transcriptomes allow us to investigate the variational independence, or individuation, of morphological characters at the level of genetic programs. © 2015 Wiley Periodicals, Inc.
Transcriptomic changes throughout post-hatch development in Gallus gallus pituitary
Lamont, Susan J; Schmidt, Carl J
2016-01-01
The pituitary gland is a neuroendocrine organ that works closely with the hypothalamus to affect multiple processes within the body including the stress response, metabolism, growth and immune function. Relative tissue expression (rEx) is a transcriptome analysis method that compares the genes expressed in a particular tissue to the genes expressed in all other tissues with available data. Using rEx, the aim of this study was to identify genes that are uniquely or more abundantly expressed in the pituitary when compared to all other collected chicken tissues. We applied rEx to define genes enriched in the chicken pituitaries at days 21, 22 and 42 post-hatch. rEx analysis identified 25 genes shared between all time points, 295 genes shared between days 21 and 22 and 407 genes unique to day 42. The 25 genes shared by all time points are involved in morphogenesis and general nervous tissue development. The 295 shared genes between days 21 and 22 are involved in neurogenesis and nervous system development and differentiation. The 407 unique day 42 genes are involved in pituitary development, endocrine system development and other hormonally related gene ontology terms. Overall, rEx analysis indicates a focus on nervous system/tissue development at days 21 and 22. By day 42, in addition to nervous tissue development, there is expression of genes involved in the endocrine system, possibly for maturation and preparation for reproduction. This study defines the transcriptome of the chicken pituitary gland and aids in understanding the expressed genes critical to its function and maturation. PMID:27856505
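The abstract describes rEx only in outline, so the following is a rough sketch of the general idea rather than the published method. It scores each gene by its expression in one tissue relative to the mean across the other tissues, then partitions per-day gene sets into shared and unique groups. The ratio formula, the pseudocount, the cutoff, and all gene names and values are assumptions made for illustration.

```python
def enriched_genes(expr, tissue, min_ratio=2.0, pseudocount=1.0):
    """Return genes whose expression in `tissue` is at least `min_ratio`
    times the mean expression across all other tissues.

    expr maps gene -> {tissue_name: expression_value}.
    The pseudocount guards against division by zero.
    """
    enriched = set()
    for gene, by_tissue in expr.items():
        others = [v for t, v in by_tissue.items() if t != tissue]
        if tissue not in by_tissue or not others:
            continue
        mean_others = sum(others) / len(others)
        ratio = (by_tissue[tissue] + pseudocount) / (mean_others + pseudocount)
        if ratio >= min_ratio:
            enriched.add(gene)
    return enriched


def partition_by_day(d21, d22, d42):
    """Split three enriched-gene sets into the groups named in the text."""
    return {
        "all_days": d21 & d22 & d42,
        "days_21_and_22_only": (d21 & d22) - d42,
        "day_42_only": d42 - d21 - d22,
    }


# Toy expression table with made-up gene names and values.
expr = {
    "geneA": {"pituitary": 900.0, "liver": 3.0, "muscle": 6.0},
    "geneB": {"pituitary": 500.0, "liver": 0.0, "muscle": 0.0},
    "geneC": {"pituitary": 100.0, "liver": 120.0, "muscle": 80.0},
}
print(sorted(enriched_genes(expr, "pituitary")))

groups = partition_by_day({"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"})
print({name: sorted(genes) for name, genes in groups.items()})
```

Plain set operations are enough for the second step because each time point is reduced to a list of enriched gene identifiers before comparison.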
Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren
2015-01-01
Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10⁻⁶) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605
Xu, Zhifeng; Zhu, Wenyi; Liu, Yanchao; Liu, Xing; Chen, Qiushuang; Peng, Miao; Wang, Xiangzun; Shen, Guangmao; He, Lin
2014-01-01
The carmine spider mite (CSM), Tetranychus cinnabarinus, is an important pest mite in agriculture because it develops insecticide resistance easily. To provide gene information and a molecular basis for future studies of insecticide resistance in CSM, the first transcriptome analysis of CSM was conducted. A total of 45,016 contigs and 25,519 unigenes were generated from the de novo transcriptome assembly, and 15,167 unigenes were annotated via BLAST querying against current databases, including nr, SwissProt, the Clusters of Orthologous Groups (COGs), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO). When the transcripts were aligned to the Tetranychus urticae genome, 19,255 (75.45%) of the transcripts had significant (e-value < 10⁻⁵) matches to the T. urticae DNA genome, and 19,111 sequences matched the T. urticae proteome with an average protein length coverage of 42.55%. Core Eukaryotic Genes Mapping Approach (CEGMA) analysis identified 435 core eukaryotic genes (CEGs) in the CSM dataset, corresponding to 95% coverage. Ten gene categories that relate to insecticide resistance in arthropods were generated from the CSM transcriptome, including 53 P450-, 22 GSTs-, 23 CarEs-, 1 AChE-, 7 GluCls-, 9 nAChRs-, 8 GABA receptor-, 1 sodium channel-, 6 ATPase- and 12 Cyt b genes. We developed significant molecular resources for T. cinnabarinus putatively involved in insecticide resistance. The transcriptome assembly analysis will significantly facilitate our study of the mechanism of adaptation to environmental stress (including insecticide) in CSM at the molecular level, and will be very important for developing new control strategies against this pest mite.
Rai, Amit; Nakaya, Taiki; Shimizu, Yohei; Rai, Megha; Nakamura, Michimi; Suzuki, Hideyuki; Saito, Kazuki; Yamazaki, Mami
2018-05-29
Lithospermum officinale is a valuable source of bioactive metabolites with medicinal and industrial values. However, little is known about genes involved in the biosynthesis of these metabolites, primarily due to the lack of genome or transcriptome resources. This study presents the first effort to establish and characterize de novo transcriptome assembly resource for L. officinale and expression analysis for three of its tissues, namely leaf, stem, and root. Using over 4Gbps of RNA-sequencing datasets, we obtained de novo transcriptome assembly of L. officinale, consisting of 77,047 unigenes with assembly N50 value as 1524 bps. Based on transcriptome annotation and functional classification, 52,766 unigenes were assigned with putative genes functions, gene ontology terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. KEGG pathway and gene ontology enrichment analysis using highly expressed unigenes across three tissues and targeted metabolome analysis showed active secondary metabolic processes enriched specifically in the root of L. officinale. Using co-expression analysis, we also identified 20 and 48 unigenes representing different enzymes of lithospermic/chlorogenic acid and shikonin biosynthesis pathways, respectively. We further identified 15 candidate unigenes annotated as cytochrome P450 with the highest expression in the root of L. officinale as novel genes with a role in key biochemical reactions toward shikonin biosynthesis. Thus, through this study, we not only generated a high-quality genomic resource for L. officinale but also propose candidate genes to be involved in shikonin biosynthesis pathways for further functional characterization. Georg Thieme Verlag KG Stuttgart · New York.
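The N50 value quoted in the abstract above is a standard assembly-contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical unigene lengths in bp (not data from the abstract).
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))
```

With these toy lengths the total is 5,500 bp, and the two longest sequences already cover more than half of it, so the sketch reports the second-longest length.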
Perigone Lobe Transcriptome Analysis Provides Insights into Rafflesia cantleyi Flower Development.
Lee, Xin-Wei; Mat-Isa, Mohd-Noor; Mohd-Elias, Nur-Atiqah; Aizat-Juhari, Mohd Afiq; Goh, Hoe-Han; Dear, Paul H; Chow, Keng-See; Haji Adam, Jumaat; Mohamed, Rahmah; Firdaus-Raih, Mohd; Wan, Kiew-Lian
2016-01-01
Rafflesia is a biologically enigmatic species that is very rare in occurrence and possesses an extraordinary morphology. This parasitic plant produces a gigantic flower up to one metre in diameter with no leaves, stem or roots. However, little is known about the floral biology of this species especially at the molecular level. In an effort to address this issue, we have generated and characterised the transcriptome of the Rafflesia cantleyi flower, and performed a comparison with the transcriptome of its floral bud to predict genes that are expressed and regulated during flower development. Approximately 40 million sequencing reads were generated and assembled de novo into 18,053 transcripts with an average length of 641 bp. Of these, more than 79% of the transcripts had significant matches to annotated sequences in the public protein database. A total of 11,756 and 7,891 transcripts were assigned to Gene Ontology categories and clusters of orthologous groups respectively. In addition, 6,019 transcripts could be mapped to 129 pathways in Kyoto Encyclopaedia of Genes and Genomes Pathway database. Digital abundance analysis identified 52 transcripts with very high expression in the flower transcriptome of R. cantleyi. Subsequently, analysis of differential expression between developing flower and the floral bud revealed a set of 105 transcripts with potential role in flower development. Our work presents a deep transcriptome resource analysis for the developing flower of R. cantleyi. Genes potentially involved in the growth and development of the R. cantleyi flower were identified and provide insights into biological processes that occur during flower development.
Qi, Xiwu; Yu, Xu; Xu, Daohua; Fang, Hailing; Dong, Ke; Li, Weilin; Liang, Chengyuan
2017-01-01
Lonicera japonica is an important medicinal plant that has been widely used in traditional Chinese medicine for thousands of years. The pharmacological activities of L. japonica are mainly due to its rich natural active ingredients, most of which are secondary metabolites. CYP450s are a large, complex, and widespread superfamily of proteins that participate in many endogenous and exogenous metabolic reactions, especially secondary metabolism. Here, we identified CYP450s in the L. japonica transcriptome and analyzed CYP450s that may be involved in chlorogenic acid (CGA) biosynthesis. The recent availability of the L. japonica transcriptome provided an opportunity to identify CYP450s in this herb. BLAST-based and HMM-based methods were used to identify CYP450s in the L. japonica transcriptome. Then, phylogenetic analysis, conserved motif analysis, GO annotation, and KEGG annotation analyses were conducted to characterize the identified CYP450s. qRT-PCR was used to explore expression patterns of five CGA biosynthesis related CYP450s. In this study, 151 putative CYP450s with a complete cytochrome P450 domain, which belonged to 10 clans, 45 families and 76 subfamilies, were identified in the L. japonica transcriptome. Phylogenetic analysis classified these CYP450s into two major branches, A-type (47%) and non-A type (53%). Both types of CYP450s had conserved motifs in L. japonica. The differences in typical motif sequences between A-type and non-A type CYP450s in L. japonica were similar to those in other plants. GO classification indicated that non-A type CYP450s participated in more molecular functions and biological processes than A-type. KEGG pathway annotation assigned a total of 47 CYP450s to 25 KEGG pathways. From these data, we cloned two LjC3Hs (CYP98A subfamily) and three LjC4Hs (CYP73A subfamily) that may be involved in biosynthesis of CGA, the major ingredient for pharmacological activities of L. japonica.
qRT-PCR results indicated that the two LjC3Hs exhibited opposite expression patterns during flower development and that LjC3H2 exhibited an expression pattern similar to the CGA concentration measured by HPLC. The expression patterns of the three LjC4Hs were quite different, and the expression pattern of LjC4H3 was quite similar to that of LjC3H1. Our results provide a comprehensive identification and characterization of CYP450s in L. japonica. Five CGA biosynthesis related CYP450s were cloned and their expression patterns were explored. The different expression patterns of the two LjC3Hs and three LjC4Hs may be due to functional divergence of both substrate and catalytic specificity during plant evolution. The co-expression pattern of LjC3H1 and LjC4H3 strongly suggested that they were under coordinated regulation by the same transcription factors due to the same cis elements in their promoters. In conclusion, this study provides insight into CYP450s and will effectively facilitate the research of biosynthesis of CGA in L. japonica.
Phelix, C F; Feltus, F A
2015-01-01
Measuring biomarkers from plant tissue samples is challenging and expensive when the desire is to integrate transcriptomics, fluxomics, metabolomics, lipidomics, proteomics, physiomics and phenomics. We present a computational biology method where only the transcriptome needs to be measured and is used to derive a set of parameters for deterministic kinetic models of metabolic pathways. The technology is called Transcriptome-To-Metabolome (TTM) biosimulations, currently under commercial development, but available for non-commercial use by researchers. The simulated results on metabolites of 30 primary and secondary metabolic pathways in rice (Oryza sativa) were used as the biomarkers to predict whether the transcriptome was from a plant that had been under drought conditions. The rice transcriptomes were accessed from public archives and each individual plant was simulated. This unique quality of the TTM technology allows standard analyses on biomarker assessments, i.e. sensitivity, specificity, positive and negative predictive values, accuracy, receiver operator characteristics (ROC) curve and area under the ROC curve (AUC). Two validation methods were also used, the holdout and 10-fold cross validations. Initially 17 metabolites were identified as candidate biomarkers based on either statistical significance on binary phenotype when compared with control samples or recognition from the literature. The top three biomarkers based on AUC were gibberellic acid 12 (0.89), trehalose (0.80) and sn1-palmitate-sn2-oleic-phosphatidylglycerol (0.70). Neither heat map analyses of transcriptomes nor all 300 metabolites clustered the stressed and control groups effectively. The TTM technology allows the emergent properties of the integrated system to generate unique and useful 'Omics' information. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
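The biomarker assessment measures listed in the abstract above (sensitivity, specificity, positive and negative predictive values, accuracy, and AUC) can all be derived from a confusion matrix and from per-sample scores. The sketch below is a generic illustration under stated assumptions: the function names, the counts, and the scores are hypothetical, and this is not the TTM implementation. It assumes both classes are non-empty so that no denominator is zero.

```python
def biomarker_metrics(tp, fp, tn, fn):
    """Standard binary-test metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive sample scores
    higher than a random negative sample (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts and scores, for illustration only.
metrics = biomarker_metrics(tp=8, fp=2, tn=7, fn=3)
print(metrics)
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and avoids choosing thresholds explicitly, which keeps the sketch short at the cost of quadratic running time.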
Transcriptome profiling reveals regulatory mechanisms underlying Corolla Senescence in Petunia
USDA-ARS's Scientific Manuscript database
Genetic regulatory mechanisms that govern natural petal senescence in petunia are complicated and unclear. To identify key genes and pathways that regulate the process, we initiated a transcriptome analysis in petunia petals at four developmental time points, including petal opening without anthesis ...
Placental transcriptome co-expression analysis reveals conserved regulatory program across gestation
USDA-ARS's Scientific Manuscript database
Mammalian development in utero is absolutely dependent on proper placental development, which is ultimately regulated by the placental genome. The regulation of the placental genome can be directly studied by exploring the underlying organization of the placental transcriptome through a systematic a...
Won, Harim I.; Schulze, Thomas T.; Clement, Emalie J.; Watson, Gabrielle F.; Watson, Sean M.; Warner, Rosalie C.; Ramler, Elizabeth A. M.; Witte, Elias J.; Schoenbeck, Mark A.; Rauter, Claudia M.; Davis, Paul H.
2018-01-01
Burying beetles (Nicrophorus spp.) are among the relatively few insects that provide parental care while not belonging to the eusocial insects such as ants or bees. This behavior incurs energy costs as evidenced by immune deficits and shorter life-spans in reproducing beetles. In the absence of an assembled transcriptome, relatively little is known concerning the molecular biology of these beetles. This work details the assembly and analysis of the Nicrophorus orbicollis transcriptome at multiple developmental stages. RNA-Seq reads were obtained by next-generation sequencing and the transcriptome was assembled using the Trinity assembler. Validation of the assembly was performed by functional characterization using Gene Ontology (GO), Eukaryotic Orthologous Groups (KOG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Differential expression analysis highlights developmental stage-specific expression patterns, and immunity-related transcripts are discussed. The data presented provide a valuable molecular resource to aid further investigation into immunocompetence throughout this organism's sexual development. PMID:29707046
Generation of a foveomacular transcriptome
Bernstein, Steven; Wong, Paul W.
2014-01-01
Purpose Organizing molecular biologic data is a growing challenge since the rate of data accumulation is steadily increasing. Information relevant to a particular biologic query can be difficult to extract from the comprehensive databases currently available. We present a data collection and organization model designed to ameliorate these problems and applied it to generate an expressed sequence tag (EST)–based foveomacular transcriptome. Methods Using Perl, MySQL, EST libraries, screening, and human foveomacular gene expression as a model system, we generated a foveomacular transcriptome database enriched for molecularly relevant data. Results Using foveomacula as a gene expression model tissue, we identified and organized 6,056 genes expressed in that tissue. Of those identified genes, 3,480 had not been previously described as expressed in the foveomacula. Internal experimental controls as well as comparison of our data set to published data sets suggest we do not yet have a complete description of the foveomacula transcriptome. Conclusions We present an organizational method designed to amplify the utility of data pertinent to a specific research interest. Our method is generic enough to be applicable to a variety of conditions yet focused enough to allow for specialized study. PMID:24991187
Arsenomics: omics of arsenic metabolism in plants
Tripathi, Rudra Deo; Tripathi, Preeti; Dwivedi, Sanjay; Dubey, Sonali; Chatterjee, Sandipan; Chakrabarty, Debasis; Trivedi, Prabodh K.
2012-01-01
Arsenic (As) contamination of drinking water and groundwater used for irrigation can lead to contamination of the food chain and poses a serious health risk to people worldwide. To reduce As intake through the consumption of contaminated food, identifying the mechanisms of As accumulation and detoxification in plants is a prerequisite for developing efficient phytoremediation methods and safer crops with reduced As levels. Transcriptome, proteome, and metabolome analysis of any organism reflects the total biological activities at any given time which are responsible for the adaptation of the organism to the surrounding environmental conditions. As these approaches are very important in analyzing plant As transport and accumulation, we use the term “Arsenomics” for the approach that deals with transcriptome, proteome, and metabolome alterations during As exposure. Although various studies have been performed to understand modulation of the transcriptome in response to As, many important questions remain to be addressed regarding the translated proteins of plants at the proteomic and metabolomic levels, which result in various ecophysiological responses. In this review, the comprehensive knowledge generated in this area has been compiled and analyzed. There is a need to strengthen Arsenomics, which will help build tools to develop As-free plants for safe consumption. PMID:22934029
UniVIO: A Multiple Omics Database with Hormonome and Transcriptome Data from Rice
Sakurai, Tetsuya; Sakakibara, Hitoshi
2013-01-01
Plant hormones play important roles as signaling molecules in the regulation of growth and development by controlling the expression of downstream genes. Since the hormone signaling system represents a complex network involving functional cross-talk through the mutual regulation of signaling and metabolism, a comprehensive and integrative analysis of plant hormone concentrations and gene expression is important for a deeper understanding of hormone actions. We have developed a database named Uniformed Viewer for Integrated Omics (UniVIO: http://univio.psc.riken.jp/), which displays hormone-metabolome (hormonome) and transcriptome data in a single formatted (uniformed) heat map. At the present time, hormonome and transcriptome data obtained from 14 organ parts of rice plants at the reproductive stage and seedling shoots of three gibberellin signaling mutants are included in the database. The hormone concentration and gene expression data can be searched by substance name, probe ID, gene locus ID or gene description. A correlation search function has been implemented to enable users to obtain information of correlated substance accumulation and gene expression. In the correlation search, calculation method, range of correlation coefficient and plant samples can be selected freely. PMID:23314752
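The correlation search described in the abstract above lets users choose a calculation method and a range of correlation coefficients. As a rough sketch of the idea, the example below ranks gene expression profiles by their Pearson correlation with a hormone concentration profile and keeps those inside a chosen range. Pearson correlation is an assumption made here for illustration, and the function names, gene names, and values are hypothetical; none of this reflects the actual UniVIO implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a flat profile
    r = cov / math.sqrt(var_x * var_y)
    return max(-1.0, min(1.0, r))  # guard against rounding drift


def correlation_search(query, profiles, low, high):
    """Return (name, r) pairs with low <= r <= high, strongest first."""
    hits = []
    for name, values in profiles.items():
        r = pearson(query, values)
        if low <= r <= high:  # NaN fails both comparisons and is skipped
            hits.append((name, r))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical hormone level across four samples and toy gene profiles.
hormone = [1.0, 2.0, 3.0, 4.0]
genes = {
    "gene_a": [2.0, 4.0, 6.0, 8.0],
    "gene_b": [4.0, 3.0, 2.0, 1.0],
    "gene_c": [1.0, 3.0, 2.0, 4.0],
}
print(correlation_search(hormone, genes, low=0.7, high=1.0))
```

In this toy data, the perfectly correlated profile and the partly correlated one pass the filter, while the anti-correlated profile is excluded because its coefficient falls below the lower bound.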
Transcriptional profiling of CD31(+) cells isolated from murine embryonic stem cells.
Mariappan, Devi; Winkler, Johannes; Chen, Shuhua; Schulz, Herbert; Hescheler, Jürgen; Sachinidis, Agapios
2009-02-01
Identification of genes involved in endothelial differentiation is of great interest for the understanding of the cellular and molecular mechanisms involved in the development of new blood vessels. Mouse embryonic stem (mES) cells serve as a potential source of endothelial cells for transcriptomic analysis. We isolated endothelial cells from 8-day-old embryoid bodies by immuno-magnetic separation using platelet endothelial cell adhesion molecule-1 (also known as CD31), which is expressed on both early and mature endothelial cells. CD31(+) cells exhibit endothelial-like behavior by being able to incorporate DiI-labeled acetylated low-density lipoprotein as well as form tubular structures on matrigel. Quantitative and semi-quantitative PCR analysis further demonstrated the increased expression of endothelial transcripts. To ascertain the specific transcriptomic identity of the CD31(+) cells, large-scale microarray analysis was carried out. Comparative bioinformatic analysis reveals an enrichment of the gene ontology categories angiogenesis, blood vessel morphogenesis, vasculogenesis and blood coagulation in the CD31(+) cell population. Based on the transcriptomic signatures of the CD31(+) cells, we conclude that this ES cell-derived population contains endothelial-like cells that express the mesodermal marker BMP2 and possess angiogenic potential. The transcriptomic characterization of CD31(+) cells enables an in vitro functional genomic model to identify genes required for angiogenesis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Fei; Maslov, Sergei; Yoo, Shinjae
Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.
He, Fei; Maslov, Sergei; Yoo, Shinjae; ...
2016-05-25
Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.
High-confidence coding and noncoding transcriptome maps
2017-01-01
The advent of high-throughput RNA sequencing (RNA-seq) has led to the discovery of unprecedentedly immense transcriptomes encoded by eukaryotic genomes. However, the transcriptome maps are still incomplete partly because they were mostly reconstructed based on RNA-seq reads that lack their orientations (known as unstranded reads) and certain boundary information. Methods to expand the usability of unstranded RNA-seq data by predetermining the orientation of the reads and precisely determining the boundaries of assembled transcripts could significantly benefit the quality of the resulting transcriptome maps. Here, we present a high-performing transcriptome assembly pipeline, called CAFE, that significantly improves the original assemblies, respectively assembled with stranded and/or unstranded RNA-seq data, by orienting unstranded reads using the maximum likelihood estimation and by integrating information about transcription start sites and cleavage and polyadenylation sites. Applying large-scale transcriptomic data comprising 230 billion RNA-seq reads from the ENCODE, Human BodyMap 2.0, The Cancer Genome Atlas, and GTEx projects, CAFE enabled us to predict the directions of about 220 billion unstranded reads, which led to the construction of more accurate transcriptome maps, comparable to the manually curated map, and a comprehensive lncRNA catalog that includes thousands of novel lncRNAs. Our pipeline should not only help to build comprehensive, precise transcriptome maps from complex genomes but also to expand the universe of noncoding genomes. PMID:28396519
Morey, Jeanine S; Burek Huntington, Kathy A; Campbell, Michelle; Clauss, Tonya M; Goertz, Caroline E; Hobbs, Roderick C; Lunardi, Denise; Moors, Amanda J; Neely, Marion G; Schwacke, Lori H; Van Dolah, Frances M
2017-10-01
Assessing the health of marine mammal sentinel species is crucial to understanding the impacts of environmental perturbations on marine ecosystems and human health. In Arctic regions, beluga whales, Delphinapterus leucas, are upper level predators that may serve as a sentinel species, potentially forecasting impacts on human health. While gene expression profiling from blood transcriptomes has widely been used to assess health status and environmental exposures in human and veterinary medicine, its use in wildlife has been limited due to the lack of available genomes and baseline data. To this end we constructed the first beluga whale blood transcriptome de novo from samples collected during annual health assessments of the healthy Bristol Bay, AK stock during 2012-2014 to establish baseline information on the content and variation of the beluga whale blood transcriptome. The Trinity transcriptome assembly from beluga comprised 91,325 transcripts that represented a wide array of cellular functions and processes and was extremely similar in content to the blood transcriptome of another cetacean, the bottlenose dolphin. Expression of hemoglobin transcripts was much lower in beluga (25.6% of TPM, transcripts per million) than has been observed in many other mammals. A T12A amino acid substitution in the HBB sequence of beluga whales, but not bottlenose dolphins, was identified and may play a role in low temperature adaptation. The beluga blood transcriptome was extremely stable between sex and year, with no apparent clustering of samples by principal component analysis and <4% of genes differentially expressed (EBseq, FDR<0.05). While the impacts of season, sexual maturity, disease, and geography on the beluga blood transcriptome must be established, the presence of transcripts involved in stress, detoxification, and immune functions indicates that blood gene expression analyses may provide information on health status and exposure. 
This study provides a wealth of transcriptomic data on beluga whales and provides a sizeable pool of preliminary data for comparison with other studies in beluga whale. Copyright © 2017 Elsevier B.V. All rights reserved.
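The abstract above reports hemoglobin expression as a share of TPM (transcripts per million). As a minimal sketch of how such a share can be computed, the example below applies the usual TPM definition: divide each read count by transcript length, then rescale the resulting rates so they sum to one million. The gene names, counts, and lengths are made-up toy values, and the helper names are assumptions rather than anything from the study.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to TPM given transcript lengths in bp."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total_rate = sum(rates.values())
    return {gene: rate / total_rate * 1_000_000 for gene, rate in rates.items()}


def share_of_tpm(tpm_values, genes):
    """Percentage of total TPM contributed by the listed genes."""
    return 100.0 * sum(tpm_values[g] for g in genes) / sum(tpm_values.values())


# Toy values for illustration only (not data from the study).
counts = {"HBB": 300, "HBA": 200, "ACTB": 1800, "GAPDH": 1300}
lengths = {"HBB": 600, "HBA": 500, "ACTB": 1800, "GAPDH": 1300}

values = tpm(counts, lengths)
print(f"{share_of_tpm(values, ['HBB', 'HBA']):.1f}% of TPM")
```

Because TPM normalizes by length before rescaling, a short, highly expressed transcript can account for a large share of TPM even when its raw read count is modest.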
Chauhan, Pallavi; Hansson, Bengt; Kraaijeveld, Ken; de Knijff, Peter; Svensson, Erik I; Wellenreuther, Maren
2014-09-22
There is growing interest in odonates (damselflies and dragonflies) as model organisms in ecology and evolutionary biology but the development of genomic resources has been slow. So far only one draft genome (Ladona fulva) and one transcriptome assembly (Enallagma hageni) have been published. Odonates have some of the most advanced visual systems among insects and several species are colour polymorphic, and genomic and transcriptomic data would allow studying the genomic architecture of these interesting traits and make detailed comparative studies between related species possible. Here, we present a comprehensive de novo transcriptome assembly for the blue-tailed damselfly Ischnura elegans (Odonata: Coenagrionidae) built from short-read RNA-seq data. The transcriptome analysis in this paper provides a first step towards identifying genes and pathways underlying the visual and colour systems in this insect group. Illumina RNA sequencing performed on tissues from the head, thorax and abdomen generated 428,744,100 paired-end reads amounting to 110 Gb of sequence data, which was assembled de novo with Trinity. A transcriptome was produced after filtering and quality checking, yielding a final set of 60,232 high quality transcripts for analysis. CEGMA software identified 247 out of 248 ultra-conserved core proteins as 'complete' in the transcriptome assembly, yielding a completeness of 99.6%. BLASTX and InterProScan annotated 55% of the assembled transcripts and showed that the three tissue types differed both qualitatively and quantitatively in I. elegans. Differential expression analysis identified 8,625 transcripts as differentially expressed among head, thorax and abdomen. Targeted analyses of vision and colour functional pathways identified the presence of four different opsin types and three pigmentation pathways. We also identified transcripts involved in temperature sensitivity, thermoregulation and olfaction. 
All these traits and their associated transcripts are of considerable ecological and evolutionary interest for this and other insect orders. Our work presents a comprehensive transcriptome resource for the ancient insect order Odonata and provides insight into their biology and physiology. The transcriptomic resource can provide a foundation for future investigations into this diverse group, including the evolution of colour, vision, olfaction and thermal adaptation.
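The completeness figure in the abstract above follows directly from the reported counts: 247 of 248 core proteins were found complete. A minimal worked check is sketched below; the function name is an illustrative assumption.

```python
def completeness_percent(found, total):
    """Percentage of reference core genes recovered as complete."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * found / total


# Counts as reported in the abstract: 247 of 248 core proteins complete.
print(f"{completeness_percent(247, 248):.1f}%")
```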
Transcriptomics provides unique solutions for understanding the impact of complex mixtures and their components on aquatic systems. Here we describe the application of transcriptomics analysis of in situ fathead minnow exposures for assessing biological impacts of wastewater trea...
USDA-ARS's Scientific Manuscript database
Natural rubber biosynthesis in guayule (Parthenium argentatum) is associated with moderately cold night temperatures. To begin to dissect the molecular events triggered by cold temperatures that govern rubber synthesis induction in guayule, the transcriptome of bark tissue, where rubber is produced...
USDA-ARS's Scientific Manuscript database
Next generation sequencing technologies and improved bioinformatics methods have provided opportunities to study sequence variability in complex polyploid transcriptomes. In this study, we used a diverse panel of twenty-two Arachis accessions representing seven Arachis hypogaea market classes, A-, B...
Pal, Tarun; Malhotra, Nikhil; Chanumolu, Sree Krishna; Chauhan, Rajinder Singh
2015-07-01
The transcriptomes of Aconitum heterophyllum were assembled and characterized for the first time to decipher molecular components contributing to biosynthesis and accumulation of metabolites in tuberous roots. Aconitum heterophyllum Wall., popularly known as Atis, is a high-value medicinal herb of North-Western Himalayas. No information exists as of today on genetic factors contributing to the biosynthesis of secondary metabolites accumulating in tuberous roots, thereby, limiting genetic interventions towards genetic improvement of A. heterophyllum. Illumina paired-end sequencing followed by de novo assembly yielded 75,548 transcripts for root transcriptome and 39,100 transcripts for shoot transcriptome with minimum length of 200 bp. Biological role analysis of root versus shoot transcriptomes assigned 27,596 and 16,604 root transcripts; 12,340 and 9398 shoot transcripts into gene ontology and clusters of orthologous group, respectively. KEGG pathway mapping assigned 37 and 31 transcripts onto starch-sucrose metabolism while 329 and 341 KEGG orthologies associated with transcripts were found to be involved in biosynthesis of various secondary metabolites for root and shoot transcriptomes, respectively. In silico expression profiling of the mevalonate/2-C-methyl-D-erythritol 4-phosphate (non-mevalonate) pathway genes for aconites biosynthesis revealed 4 genes HMGR (3-hydroxy-3-methylglutaryl-CoA reductase), MVK (mevalonate kinase), MVDD (mevalonate diphosphate decarboxylase) and HDS (1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase) with higher expression in root transcriptome compared to shoot transcriptome suggesting their key role in biosynthesis of aconite alkaloids. Five genes, GMPase (geranyl diphosphate mannose pyrophosphorylase), SHAGGY, RBX1 (RING-box protein 1), SRF receptor kinases and β-amylase, implicated in tuberous root formation in other plant species showed higher levels of expression in tuberous roots compared to shoots. 
A total of 15,487 transcription factors belonging to bHLH, MYB, bZIP families and 399 ABC transporters which regulate biosynthesis and accumulation of bioactive compounds were identified in root and shoot transcriptomes. The expression of 5 ABC transporters involved in tuberous root development was validated by quantitative PCR analysis. Network connectivity diagrams were drawn for starch-sucrose metabolism and isoquinoline alkaloid biosynthesis associated with tuberous root growth and secondary metabolism, respectively, in root transcriptome of A. heterophyllum. The current endeavor will be of practical importance in planning a suitable genetic intervention strategy for the improvement of A. heterophyllum.
Profiling the venom gland transcriptomes of Costa Rican snakes by 454 pyrosequencing
2011-01-01
Background A long-term research goal of venomics, of applied importance for improving current antivenom therapy, but also for drug discovery, is to understand the pharmacological potential of venoms. Individually or combined, proteomic and transcriptomic studies have demonstrated their feasibility to explore in depth the molecular diversity of venoms. In the absence of genome sequence, transcriptomes also represent valuable searchable databases for proteomic projects. Results The venom gland transcriptomes of 8 Costa Rican taxa from 5 genera (Crotalus, Bothrops, Atropoides, Cerrophidion, and Bothriechis) of pitvipers were investigated using high-throughput 454 pyrosequencing. 100,394 out of 330,010 masked reads produced significant hits in the available databases. 5,165,220 nucleotides (8.27%) were masked by RepeatMasker, the vast majority of which corresponded to class I (retroelements) and class II (DNA transposons) mobile elements. BLAST hits included 79,991 matches to entries of the taxonomic suborder Serpentes, of which 62,433 displayed similarity to documented venom proteins. Strong discrepancies between the transcriptome-computed and the proteome-gathered toxin compositions were obvious at first sight. Although the reasons underlying this discrepancy are elusive, since no clear trend within or between species is apparent, the data indicate that individual mRNA species may be translationally controlled in a species-dependent manner. The minimum number of genes from each toxin family transcribed into the venom gland transcriptome of each species was calculated from multiple alignments of reads matched to a full-length reference sequence of each toxin family. Reads encoding ORF regions of Kazal-type inhibitor-like proteins were uniquely found in Bothriechis schlegelii and B. lateralis transcriptomes, suggesting a genus-specific recruitment event during the early-Middle Miocene. A transcriptome-based cladogram supports the large divergence between A. mexicanus and A. picadoi, and a closer kinship between A. mexicanus and C. godmani. Conclusions Our comparative next-generation sequencing (NGS) analysis reveals taxon-specific trends governing the formulation of the venom arsenal. Knowledge of the venom proteome provides hints on the translation efficiency of toxin-coding transcripts, contributing thereby to a more accurate interpretation of the transcriptome. The application of NGS to the analysis of snake venom transcriptomes may represent the tool for opening the door to systems venomics. PMID:21605378
Transcriptome analysis of 20 taxonomically related benzylisoquinoline alkaloid-producing plants.
Hagel, Jillian M; Morris, Jeremy S; Lee, Eun-Jeong; Desgagné-Penix, Isabel; Bross, Crystal D; Chang, Limei; Chen, Xue; Farrow, Scott C; Zhang, Ye; Soh, Jung; Sensen, Christoph W; Facchini, Peter J
2015-09-18
Benzylisoquinoline alkaloids (BIAs) represent a diverse class of plant specialized metabolites sharing a common biosynthetic origin beginning with tyrosine. Many BIAs have potent pharmacological activities, and plants accumulating them boast long histories of use in traditional medicine and cultural practices. The decades-long focus on a select number of plant species as model systems has allowed near or full elucidation of major BIA pathways, including those of morphine, sanguinarine and berberine. However, this focus has created a dearth of knowledge surrounding non-model species, which also are known to accumulate a wide range of BIAs but whose biosynthesis is thus far entirely unexplored. Further, these non-model species represent a rich source of catalyst diversity valuable to plant biochemists and emerging synthetic biology efforts. In order to access the genetic diversity of non-model plants accumulating BIAs, we selected 20 species representing 4 families within the Ranunculales. RNA extracted from each species was processed for analysis by both 1) Roche GS-FLX Titanium and 2) Illumina GA/HiSeq platforms, generating a total of 40 deep-sequencing transcriptome libraries. De novo assembly, annotation and subsequent full-length coding sequence (CDS) predictions indicated greater success for most species using the Illumina-based platform. Assembled data for each transcriptome were deposited into an established web-based BLAST portal (www.phytometasyn.ca) to allow public access. Homology-based mining of libraries using BIA-biosynthetic enzymes as queries yielded ~850 gene candidates potentially involved in alkaloid biosynthesis. Expression analysis of these candidates was performed using inter-library FPKM normalization methods. These expression data provide a basis for the rational selection of gene candidates, and suggest possible metabolic bottlenecks within BIA metabolism.
Phylogenetic analysis was performed for each of 15 different enzyme/protein groupings, highlighting many novel genes with potential involvement in the formation of one or more alkaloid types, including morphinan, aporphine, and phthalideisoquinoline alkaloids. Transcriptome resources were used to design and execute a case study of candidate N-methyltransferases (NMTs) from Glaucium flavum, which revealed predicted and novel enzyme activities. This study establishes an essential resource for the isolation and discovery of 1) functional homologues and 2) entirely novel catalysts within BIA metabolism. Functional analysis of G. flavum NMTs demonstrated the utility of this resource and underscored the importance of empirical determination of proposed enzymatic function. Publicly accessible, fully annotated, BLAST-accessible transcriptomes were not previously available for most species included in this report, despite the rich repertoire of bioactive alkaloids found in these plants and their importance to traditional medicine. The results presented herein provide essential sequence information and inform experimental design for the continued elucidation of BIA metabolism.
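As a point of reference for the FPKM-based expression analysis mentioned in the abstract above, the following minimal Python sketch shows the standard per-library FPKM calculation (fragments per kilobase of transcript per million mapped fragments). The abstract does not say which inter-library normalization was applied, so the function name, argument names and example numbers are illustrative assumptions only and are not the authors' pipeline.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments: fragments assigned to one transcript (hypothetical input)
    length_bp: transcript length in base pairs
    total_mapped_fragments: all mapped fragments in the same library
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 100 fragments on a 1,000 bp transcript
# in a library of 1,000,000 mapped fragments.
example_value = fpkm(100, 1000, 1_000_000)
```

Because the value is scaled by both transcript length and library size, it can be compared across transcripts within one library. Comparisons between libraries generally need an additional normalization step, which this sketch does not attempt.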
Ponce, Dalia; Brinkman, Diane L.; Potriquet, Jeremy; Mulvenna, Jason
2016-01-01
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms. PMID:27058558
Dried Blood Spot RNA Transcriptomes Correlate with Transcriptomes Derived from Whole Blood RNA.
Reust, Mary J; Lee, Myung Hee; Xiang, Jenny; Zhang, Wei; Xu, Dong; Batson, Tatiana; Zhang, Tuo; Downs, Jennifer A; Dupnik, Kathryn M
2018-05-01
Obtaining RNA from clinical samples collected in resource-limited settings can be costly and challenging. The goals of this study were to 1) optimize messenger RNA extraction from dried blood spots (DBS) and 2) determine how transcriptomes generated from DBS RNA compared with RNA isolated from blood collected in Tempus tubes. We studied paired samples collected from eight adults in rural Tanzania. Venous blood was collected on Whatman 903 Protein Saver cards and in tubes with RNA preservation solution. Our optimal DBS RNA extraction used 8 × 3-mm DBS punches as the starting material, bead beater disruption at maximum speed for 60 seconds, extraction with Illustra RNAspin Mini RNA Isolation kit, and purification with Zymo RNA Concentrator kit. Spearman correlations of normalized gene counts in DBS versus whole blood ranged from 0.887 to 0.941. Bland-Altman plots did not show a trend toward over- or under-counting at any gene size. We report a method to obtain sufficient RNA from DBS to generate a transcriptome. The DBS transcriptome gene counts correlated well with whole blood transcriptome gene counts. Dried blood spots for transcriptome studies could be an option when field conditions preclude appropriate collection, storage, or transport of whole blood for RNA studies.
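The two agreement statistics named in the abstract above, Spearman correlation and Bland-Altman analysis, can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions: the function names and any example counts are hypothetical, the 1.96 multiplier gives the conventional 95% limits of agreement, and nothing here reproduces the study's actual normalization or plotting.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # conventional 95% limits of agreement
    return bias, bias - spread, bias + spread
```

In a Bland-Altman plot the per-gene difference is drawn against the per-gene mean of the two methods; a trend in that scatter would indicate over- or under-counting that depends on expression level.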
Fathead minnow and zebrafish are among the most intensively studied fish species in environmental toxicogenomics. To aid the assessment and interpretation of subtle transcriptomic effects from treatment conditions of interest, there needs to be a better characterization and unde...
USDA-ARS's Scientific Manuscript database
Sclerotinia sclerotiorum and S. trifoliorum are two closely related devastating plant pathogens. Extensive research has been conducted on S. sclerotiorum and its genome sequences are available. To take advantages of the genomic information of S. sclerotiorum, we compared the transcriptome of S. tr...
USDA-ARS's Scientific Manuscript database
To analyze transcriptome response to virus infection, we have assembled currently available microarray data on changes in gene expression levels in compatible Arabidopsis-virus interactions. We used the mean r (Pearson’s correlation coefficient) for neighboring pairs to estimate pairwise local simil...
USDA-ARS's Scientific Manuscript database
Aspergillus flavus and aflatoxin contamination in the field are known to be influenced by numerous stress factors, particularly drought and heat stress. However, the purpose of aflatoxin production is unknown. Here, we report transcriptome analyses comprised of 282.6 Gb of sequencing data describing...
USDA-ARS's Scientific Manuscript database
Alternative splicing is a well-known phenomenon that dramatically increases eukaryotic transcriptome diversity. The extent of mRNA isoform diversity among porcine tissues was assessed using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq) and Illumina short read sequencing ...
USDA-ARS's Scientific Manuscript database
Understanding the molecular and genetic mechanisms underlying variation in seed composition and contents among different genotypes is important for soybean oil quality improvement. We designed a bioinformatics approach to compare seed transcriptomes of 9 soybean genotypes varying in oil composition ...
Genomic and transcriptomic approaches to study immunology in cyprinids: What is next?
Petit, Jules; David, Lior; Dirks, Ron; Wiegertjes, Geert F
2017-10-01
Accelerated by the introduction of Next-Generation Sequencing (NGS), a number of genomes of cyprinid fish species have been drafted, leading to a highly valuable collective resource of comparative genome information on cyprinids (Cyprinidae). In addition, NGS-based transcriptome analyses of different developmental stages, organs, or cell types, increasingly contribute to the understanding of complex physiological processes, including immune responses. Cyprinids are a highly interesting family because they comprise one of the most-diversified families of teleosts and because of their variation in ploidy level, with diploid, triploid, tetraploid, hexaploid and sometimes even octoploid species. The wealth of data obtained from NGS technologies provides both challenges and opportunities for immunological research, which will be discussed here. Correct interpretation of ploidy effects on immune responses requires knowledge of the degree of functional divergence between duplicated genes, which can differ even between closely-related cyprinid fish species. We summarize NGS-based progress in analysing immune responses and discuss the importance of respecting the presence of (multiple) duplicated gene sequences when performing transcriptome analyses for detailed understanding of complex physiological processes. Progressively, advances in NGS technology are providing workable methods to further elucidate the implications of gene duplication events and functional divergence of duplicated genes and proteins involved in immune responses in cyprinids. We conclude by discussing how future applications of NGS technologies and analysis methods could enhance immunological research and understanding. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
The Physcomitrella patens gene atlas project: large-scale RNA-seq based expression data.
Perroud, Pierre-François; Haas, Fabian B; Hiss, Manuel; Ullrich, Kristian K; Alboresi, Alessandro; Amirebrahimi, Mojgan; Barry, Kerrie; Bassi, Roberto; Bonhomme, Sandrine; Chen, Haodong; Coates, Juliet C; Fujita, Tomomichi; Guyon-Debast, Anouchka; Lang, Daniel; Lin, Junyan; Lipzen, Anna; Nogué, Fabien; Oliver, Melvin J; Ponce de León, Inés; Quatrano, Ralph S; Rameau, Catherine; Reiss, Bernd; Reski, Ralf; Ricca, Mariana; Saidi, Younousse; Sun, Ning; Szövényi, Péter; Sreedasyam, Avinash; Grimwood, Jane; Stacey, Gary; Schmutz, Jeremy; Rensing, Stefan A
2018-07-01
High-throughput RNA sequencing (RNA-seq) has recently become the method of choice to define and analyze transcriptomes. For the model moss Physcomitrella patens, although this method has been used to help analyze specific perturbations, no overall reference dataset has yet been established. In the framework of the Gene Atlas project, the Joint Genome Institute selected P. patens as a flagship genome, opening the way to generate the first comprehensive transcriptome dataset for this moss. The first round of sequencing described here is composed of 99 independent libraries spanning 34 different developmental stages and conditions. Upon dataset quality control and processing through read mapping, 28 509 of the 34 361 v3.3 gene models (83%) were detected to be expressed across the samples. Differentially expressed genes (DEGs) were calculated across the dataset to permit perturbation comparisons between conditions. The analysis of the three most distinct and abundant P. patens growth stages - protonema, gametophore and sporophyte - allowed us to define both general transcriptional patterns and stage-specific transcripts. As an example of variation of physico-chemical growth conditions, we detail here the impact of ammonium supplementation under standard growth conditions on the protonemal transcriptome. Finally, the cooperative nature of this project allowed us to analyze inter-laboratory variation, as 13 different laboratories around the world provided samples. We compare differences in the replication of experiments in a single laboratory and between different laboratories. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud
Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.
2015-01-01
Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053
Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree
Rangiah, Kannan; Mahesh, HB; Rajamani, Anantharamanan; Shirke, Meghana D.; Russiachand, Heikham; Loganathan, Ramya Malarini; Shankara Lingu, Chandana; Siddappa, Shilpa; Ramamurthy, Aishwarya; Sathyanarayana, BN
2015-01-01
Neem (Azadirachta indica A. Juss) is one of the most versatile tropical evergreen tree species known in India since the Vedic period (1500 BC–600 BC). Neem tree is a rich source of limonoids, having a wide spectrum of activity against insect pests and microbial pathogens. Complex tetranortriterpenoids such as azadirachtin, salanin and nimbin are the major active principles isolated from neem seed. Absolutely nothing is known about the biochemical pathways of these metabolites in neem tree. To identify genes and pathways in neem, we sequenced neem genomes and transcriptomes using next generation sequencing technologies. Assembly of Illumina and 454 sequencing reads resulted in 267 Mb, which accounts for 70% of the estimated size of the neem genome. We predicted 44,495 genes in the neem genome, of which 32,278 genes were expressed in neem tissues. The neem genome consists of about 32.5% (87 Mb) repetitive DNA elements. Neem tree is phylogenetically related to citrus, Citrus sinensis. Comparative analysis anchored 62% (161 Mb) of assembled neem genomic contigs onto citrus chromosomes. An ultrahigh performance liquid chromatography-mass spectrometry-selected reaction monitoring (UHPLC-MS/SRM) method was used to quantify azadirachtin, nimbin, and salanin from neem tissues. Weighted Correlation Network Analysis (WGCNA) of expressed genes and metabolites resulted in identification of possible candidate genes involved in the azadirachtin biosynthesis pathway. This study provides genomic and transcriptomic resources, together with quantification of the top three neem metabolites, which will accelerate basic research in neem to understand biochemical pathways. PMID:26290780
Meng, Fanli; Yang, Mingyu; Li, Yang; Li, Tianyu; Liu, Xinxin; Wang, Guoyue; Wang, Zhanchun; Jin, Xianhao; Li, Wenbin
2018-01-01
RNA interference (RNAi) is useful for controlling pests of agriculturally important crops. The soybean pod borer (SPB) is the most important soybean pest in Northeastern Asia. In an earlier study, we confirmed that the SPB could be controlled via transgenic plant-mediated RNAi. Here, the SPB transcriptome was sequenced to identify RNAi-related genes, and also to establish an RNAi-of-RNAi assay system for evaluating genes involved in the SPB systemic RNAi response. The core RNAi genes, as well as genes potentially involved in double-stranded RNA (dsRNA) uptake were identified based on SPB transcriptome sequences. A phylogenetic analysis and the characterization of these core components as well as dsRNA uptake related genes revealed that they contain conserved domains essential for the RNAi pathway. The results of the RNAi-of-RNAi assay involving Laccase 2 (a critical cuticle pigmentation gene) as a marker showed that genes encoding the sid-like (Sil1), scavenger receptor class C (Src), and scavenger receptor class B (Srb3 and Srb4) proteins of the endocytic pathway were required for SPB cellular uptake of dsRNA. The SPB response was inferred to contain three functional small RNA pathways (i.e., miRNA, siRNA, and piRNA pathways). Additionally, the SPB systemic RNA response may rely on systemic RNA interference deficient transmembrane channel-mediated and receptor-mediated endocytic pathways. The results presented herein may be useful for developing RNAi-mediated methods to control SPB infestations in soybean. PMID:29773992
Sager, Monica; Yeat, Nai Chien; Pajaro-Van der Stadt, Stefan; Lin, Charlotte; Ren, Qiuyin; Lin, Jimmy
2015-01-01
Transcriptomic technologies are evolving to diagnose cancer earlier and more accurately to provide greater predictive and prognostic utility to oncologists and patients. Digital techniques such as RNA sequencing are replacing still-imaging techniques to provide more detailed analysis of the transcriptome and aberrant expression that causes oncogenesis, while companion diagnostics are developing to determine the likely effectiveness of targeted treatments. This article examines recent advancements in molecular profiling research and technology as applied to cancer diagnosis, clinical applications and predictions for the future of personalized medicine in oncology.
Reddy, Sreekanth P; Britto, Ramona; Vinnakota, Katyayni; Aparna, Hebbar; Sreepathi, Hari Kishore; Thota, Balaram; Kumari, Arpana; Shilpa, B M; Vrinda, M; Umesh, Srikantha; Samuel, Cini; Shetty, Mitesh; Tandon, Ashwani; Pandey, Paritosh; Hegde, Sridevi; Hegde, A S; Balasubramaniam, Anandh; Chandramouli, B A; Santosh, Vani; Kondaiah, Paturu; Somasundaram, Kumaravel; Rao, M R Satyanarayana
2008-05-15
Current methods of classification of astrocytoma based on histopathologic methods are often subjective and less accurate. Although patients with glioblastoma have a grave prognosis, significant variability in patient outcome is observed. Therefore, the aim of this study was to identify glioblastoma diagnostic and prognostic markers through microarray analysis. We carried out transcriptome analysis of 25 diffusely infiltrating astrocytoma samples [WHO grade II--diffuse astrocytoma, grade III--anaplastic astrocytoma, and grade IV--glioblastoma (GBM)] using cDNA microarrays containing 18,981 genes. Several of the markers identified were also validated by real-time reverse transcription quantitative PCR and immunohistochemical analysis on an independent set of tumor samples (n = 100). Survival analysis was carried out for two markers on another independent set of retrospective cases (n = 51). We identified several differentially regulated grade-specific genes. Independent validation by real-time reverse transcription quantitative PCR analysis found growth arrest and DNA-damage-inducible alpha (GADD45alpha) and follistatin-like 1 (FSTL1) to be up-regulated in most GBMs (both primary and secondary), whereas superoxide dismutase 2 and adipocyte enhancer binding protein 1 were up-regulated in the majority of primary GBM. Further, identification of the grade-specific expression of GADD45alpha and FSTL1 by immunohistochemical staining reinforced our findings. Analysis of retrospective GBM cases with known survival data revealed that cytoplasmic overexpression of GADD45alpha conferred better survival while the coexpression of FSTL1 with p53 was associated with poor survival. Our study reveals that GADD45alpha and FSTL1 are GBM-specific whereas superoxide dismutase 2 and adipocyte enhancer binding protein 1 are primary GBM-specific diagnostic markers.
Whereas GADD45alpha overexpression confers a favorable prognosis, FSTL1 overexpression is a hallmark of poor prognosis in GBM patients.
Kang, Yun; McMillan, Ian; Norris, Michael H; Hoang, Tung T
2015-07-01
Until recently, transcriptome analyses of single cells have been confined to eukaryotes. The information obtained from single-cell transcripts can provide detailed insight into spatiotemporal gene expression, and it could be even more valuable if expanded to prokaryotic cells. Transcriptome analysis of single prokaryotic cells is a recently developed and powerful tool. Here we describe a procedure that allows amplification of the total transcript of a single prokaryotic cell for in-depth analysis. This is performed by using a laser-capture microdissection instrument for single-cell isolation, followed by reverse transcription via Moloney murine leukemia virus, degradation of chromosomal DNA with McrBC and DpnI restriction enzymes, single-stranded cDNA (ss-cDNA) ligation using T4 polynucleotide kinase and CircLigase, and polymerization of ss-cDNA to double-stranded cDNA (ds-cDNA) by Φ29 polymerase. This procedure takes ∼5 d, and sufficient amounts of ds-cDNA can be obtained from single-cell RNA template for further microarray analysis.
Oh, Dong-Ha; Barkla, Bronwyn J; Vera-Estrella, Rosario; Pantoja, Omar; Lee, Sang-Yeol; Bohnert, Hans J; Dassanayake, Maheshi
2015-08-01
Mesembryanthemum crystallinum (ice plant) exhibits extreme tolerance to salt. Epidermal bladder cells (EBCs), developing on the surface of aerial tissues and specialized in sodium sequestration and other protective functions, are critical for the plant's stress adaptation. We present the first transcriptome analysis of EBCs isolated from intact plants, to investigate cell type-specific responses during plant salt adaptation. We developed a de novo assembled, nonredundant EBC reference transcriptome. Using RNAseq, we compared the expression patterns of the EBC-specific transcriptome between control and salt-treated plants. The EBC reference transcriptome consists of 37 341 transcript-contigs, of which 7% showed significantly different expression between salt-treated and control samples. We identified significant changes in ion transport, metabolism related to energy generation and osmolyte accumulation, stress signalling, and organelle functions, as well as a number of lineage-specific genes of unknown function, in response to salt treatment. The salinity-induced EBC transcriptome includes active transcript clusters, refuting the view of EBCs as passive storage compartments in the whole-plant stress response. EBC transcriptomes, differing from those of whole plants or leaf tissue, exemplify the importance of cell type-specific resolution in understanding stress adaptive mechanisms. No claim to original US government works. New Phytologist © 2015 New Phytologist Trust.
Xu, Hai-Ming; Kong, Xiang-Dong; Chen, Fei; Huang, Ji-Xiang; Lou, Xiang-Yang; Zhao, Jian-Yi
2015-10-24
Brassica napus is an important oilseed crop. Dissection of the genetic architecture underlying oil-related biological processes will greatly facilitate the genetic improvement of rapeseed. The differential gene expression during pod development offers a snapshot of the genes responsible for oil accumulation. To identify candidate genes in the linkage peaks reported previously, we used RNA sequencing (RNA-Seq) technology to analyze the pod transcriptomes of German cultivar Sollux and Chinese inbred line Gaoyou. The RNA samples were collected for RNA-Seq at 5-7, 15-17 and 25-27 days after flowering (DAF). Bioinformatics analysis was performed to investigate differentially expressed genes (DEGs). Gene annotation analysis was integrated with QTL mapping and Brassica napus pod transcriptome profiling to detect potential candidate genes in oilseed. A total of 465 and 2,114 candidate DEGs were identified, respectively, between the two varieties at the same stages and across different periods of each variety. Then, 33 DEGs between Sollux and Gaoyou were identified as candidate genes affecting seed oil content by combining those DEGs with the quantitative trait locus (QTL) mapping results, of which one was found to be homologous to Arabidopsis thaliana lipid-related genes. Intervarietal DEGs of lipid pathways in QTL regions represent important candidate genes for oil-related traits. Integrated analysis of transcriptome profiling, QTL mapping and comparative genomics with related species leads to efficient identification of the most plausible functional genes underlying oil-content related characters, offering valuable resources for improving breeding programs of Brassica napus. This study provided a comprehensive overview of the pod transcriptomes of two varieties with different oil contents at the three developmental stages.
Methods, Tools and Current Perspectives in Proteogenomics
Ruggles, Kelly V.; Krug, Karsten; Wang, Xiaojing; Clauser, Karl R.; Wang, Jing; Payne, Samuel H.; Fenyö, David; Zhang, Bing; Mani, D. R.
2017-01-01
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches, and in this article we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications. PMID:28456751
Meta-analytic framework for liquid association.
Wang, Lin; Liu, Silvia; Ding, Ying; Yuan, Shin-Sheng; Ho, Yen-Yi; Tseng, George C
2017-07-15
Although coexpression analysis via pair-wise expression correlation is popularly used to elucidate gene-gene interactions at the whole-genome scale, many complicated multi-gene regulations require more advanced detection methods. Liquid association (LA) is a powerful tool to detect the dynamic correlation of two gene variables depending on the expression level of a third variable (LA scouting gene). LA detection from a single transcriptomic study, however, is often unstable and not generalizable due to cohort bias, biological variation and limited sample size. With the rapid development of microarray and NGS technology, LA analysis combining multiple gene expression studies can provide more accurate and stable results. In this article, we proposed two meta-analytic approaches for LA analysis (MetaLA and MetaMLA) to combine multiple transcriptomic studies. To offset the heavy computational demand, we also proposed a two-step fast screening algorithm for more efficient genome-wide screening: bootstrap filtering and sign filtering. We applied the methods to five Saccharomyces cerevisiae datasets related to environmental changes. The fast screening algorithm reduced running time by 98%. When compared with single study analysis, MetaLA and MetaMLA provided stronger detection signal and more consistent and stable results. The top triplets are highly enriched in fundamental biological processes related to environmental changes. Our method can help biologists understand underlying regulatory mechanisms under different environmental exposure or disease states. A MetaLA R package, data and code for this article are available at http://tsenglab.biostat.pitt.edu/software.htm. Contact: ctseng@pitt.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
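To make the idea of liquid association in the abstract above more concrete, here is a minimal single-study sketch in standard-library Python. It assumes the commonly used estimator in which X and Y are standardized, the scouting gene Z is converted to normal scores, and LA is estimated as the average of the triple product X·Y·Z. It does not implement MetaLA, MetaMLA or the bootstrap and sign filtering steps; all function names and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist, mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def normal_scores(values):
    """Replace each value by the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    nd = NormalDist()
    for rank, idx in enumerate(order, start=1):
        scores[idx] = nd.inv_cdf(rank / (n + 1))
    return scores


def liquid_association(x, y, z):
    """Estimate LA(X, Y | Z) as the mean of x*y*z after transformation."""
    xs, ys, zs = standardize(x), standardize(y), normal_scores(z)
    return mean([a * b * c for a, b, c in zip(xs, ys, zs)])


def simulate(n=2000, seed=0):
    """Toy data: X and Y correlate positively when Z > 0, negatively otherwise."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [rng.gauss(0, 1) for _ in range(n)]
    y_dynamic = [(a if c > 0 else -a) + 0.5 * rng.gauss(0, 1)
                 for a, c in zip(x, z)]
    y_independent = [rng.gauss(0, 1) for _ in range(n)]
    return x, y_dynamic, y_independent, z


x, y_dynamic, y_independent, z = simulate()
la_dynamic = liquid_association(x, y_dynamic, z)
la_independent = liquid_association(x, y_independent, z)
```

In the toy data, the overall correlation between `x` and `y_dynamic` is close to zero because the positive and negative regimes cancel, yet the LA estimate is clearly positive. That is the kind of signal pair-wise coexpression analysis would miss.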
2010-01-01
Background Fruit development, maturation and ripening consist of a complex series of biochemical and physiological changes that in climacteric fruits, including apple and tomato, are coordinated by the gaseous hormone ethylene. These changes lead to final fruit quality and understanding of the functional machinery underlying these processes is of both biological and practical importance. To date many reports have been made on the analysis of gene expression in apple. In this study we focused our investigation on the role of ethylene during apple maturation, specifically comparing transcriptomics of normal ripening with changes resulting from application of the hormone receptor competitor 1-Methylcyclopropene. Results To gain insight into the molecular process regulating ripening in apple, and to compare to tomato (model species for ripening studies), we utilized both homologous and heterologous (tomato) microarrays to profile transcriptome dynamics of genes involved in fruit development and ripening, emphasizing those which are ethylene regulated. The use of both types of microarrays facilitated transcriptome comparison between apple and tomato (for the latter, using data previously published and available at the TED: tomato expression database) and highlighted genes conserved during ripening of both species, which in turn represent a foundation for further comparative genomic studies. The cross-species analysis had the secondary aim of examining the efficiency of heterologous (specifically tomato) microarray hybridization for candidate gene identification as related to the ripening process. The resulting transcriptomics data revealed coordinated gene expression during fruit ripening of a subset of ripening-related and ethylene responsive genes, further facilitating the analysis of ethylene response during fruit maturation and ripening. 
Conclusion Our combined strategy based on microarray hybridization enabled transcriptome characterization during normal climacteric apple ripening, as well as definition of ethylene-dependent transcriptome changes. Comparison with tomato fruit maturation and ethylene responsive transcriptome activity facilitated identification of putative conserved orthologous ripening-related genes, which serve as an initial set of candidates for assessing conservation of gene activity across genomes of fruit bearing plant species. PMID:20973957
TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data.
Lim, Jae Hyun; Lee, Soo Youn; Kim, Ju Han
2017-03-01
High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to appropriately integrate with one another due to their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.
Huang, Zixia; Gallot, Aurore; Lao, Nga T; Puechmaille, Sébastien J; Foley, Nicole M; Jebb, David; Bekaert, Michaël; Teeling, Emma C
2016-01-01
The acquisition of tissue samples from wild populations is a constant challenge in conservation biology, especially for endangered species and protected species where nonlethal sampling is the only option. Whole blood has been suggested as a nonlethal sample type that contains a high percentage of bodywide and genomewide transcripts and therefore can be used to assess the transcriptional status of an individual, and to infer a high percentage of the genome. However, only limited quantities of blood can be nonlethally sampled from small species and it is not known if enough genetic material is contained in only a few drops of blood, which represents the upper limit of sample collection for some small species. In this study, we developed a nonlethal sampling method, the laboratory protocols and a bioinformatic pipeline to sequence and assemble the whole blood transcriptome, using Illumina RNA-Seq, from wild greater mouse-eared bats (Myotis myotis). For optimal results, both ribosomal and globin RNAs must be removed before library construction. DNase treatment is recommended but not required, enabling the use of smaller amounts of starting RNA. A large proportion of protein-coding genes (61%) in the genome were expressed in the blood transcriptome, comparable to brain (65%), kidney (63%) and liver (58%) transcriptomes, and up to 99% of the mitogenome (excluding D-loop) was recovered in the RNA-Seq data. In conclusion, this nonlethal blood sampling method provides an opportunity for a genomewide transcriptomic study of small, endangered or critically protected species, without sacrificing any individuals. © 2015 John Wiley & Sons Ltd.
Morris, Renée; Mehta, Prachi
2018-01-01
In mammals, the central nervous system (CNS) is constituted of various cellular elements, posing a challenge to isolating specific cell types to investigate their expression profile. As a result, tissue homogenization is not amenable to analyses of motor neurons profiling as these represent less than 10% of the total spinal cord cell population. One way to tackle the problem of tissue heterogeneity and obtain meaningful genomic, proteomic, and transcriptomic profiling is to use laser capture microdissection technology (LCM). In this chapter, we describe protocols for the capture of isolated populations of motor neurons from spinal cord tissue sections and for downstream transcriptomic analysis of motor neurons with RT-PCR. We have also included a protocol for the immunological confirmation that the captured neurons are indeed motor neurons. Although focused on spinal cord motor neurons, these protocols can be easily optimized for the isolation of any CNS neurons.
Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv
2013-01-01
De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962
Leaps and lulls in the developmental transcriptome of Dictyostelium discoideum.
Rosengarten, Rafael David; Santhanam, Balaji; Fuller, Danny; Katoh-Kurasawa, Mariko; Loomis, William F; Zupan, Blaz; Shaulsky, Gad
2015-04-13
Development of the soil amoeba Dictyostelium discoideum is triggered by starvation. When placed on a solid substrate, the starving solitary amoebae cease growth, communicate via extracellular cAMP, aggregate by tens of thousands and develop into multicellular organisms. Early phases of the developmental program are often studied in cells starved in suspension while cAMP is provided exogenously. Previous studies revealed massive shifts in the transcriptome under both developmental conditions and a close relationship between gene expression and morphogenesis, but were limited by the sampling frequency and the resolution of the methods. Here, we combine the superior depth and specificity of RNA-seq-based analysis of mRNA abundance with high frequency sampling during filter development and cAMP pulsing in suspension. We found that the developmental transcriptome exhibits mostly gradual changes interspersed by a few instances of large shifts. For each time point we treated the entire transcriptome as single phenotype, and were able to characterize development as groups of similar time points separated by gaps. The grouped time points represented gradual changes in mRNA abundance, or molecular phenotype, and the gaps represented times during which many genes are differentially expressed rapidly, and thus the phenotype changes dramatically. Comparing developmental experiments revealed that gene expression in filter developed cells lagged behind those treated with exogenous cAMP in suspension. The high sampling frequency revealed many genes whose regulation is reproducibly more complex than indicated by previous studies. Gene Ontology enrichment analysis suggested that the transition to multicellularity coincided with rapid accumulation of transcripts associated with DNA processes and mitosis. Later development included the up-regulation of organic signaling molecules and co-factor biosynthesis. 
Our analysis also demonstrated a high level of synchrony among the developing structures throughout development. Our data describe D. discoideum development as a series of coordinated cellular and multicellular activities. Coordination occurred within fields of aggregating cells and among multicellular bodies, such as mounds or migratory slugs that experience both cell-cell contact and various soluble signaling regimes. These time courses, sampled at the highest temporal resolution to date in this system, provide a comprehensive resource for studies of developmental gene expression.
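The description above of time points forming groups separated by gaps can be sketched in a few lines. This is only an illustration under stated assumptions: each time point is treated as a vector of expression values, consecutive time points are compared by Euclidean distance, and a new group starts wherever that distance exceeds a chosen threshold. The function names, the threshold and the toy profiles are hypothetical and do not reproduce the authors' analysis.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def group_time_points(profiles, gap_threshold):
    """Group consecutive time points, splitting at large jumps.

    profiles: list of expression vectors ordered by time.
    gap_threshold: hypothetical cutoff; a distance above it between two
    consecutive time points is treated as a gap between groups.
    Returns a list of groups, each a list of time-point indices.
    """
    if not profiles:
        return []
    groups = [[0]]
    for i in range(1, len(profiles)):
        if euclidean(profiles[i - 1], profiles[i]) > gap_threshold:
            groups.append([i])  # large shift: start a new group
        else:
            groups[-1].append(i)  # gradual change: stay in the current group
    return groups


# Toy series (hypothetical): two large shifts separate three groups.
toy_profiles = [
    [0.0, 0.0], [0.1, 0.0], [0.2, 0.1],
    [3.0, 3.0], [3.1, 3.0],
    [6.0, 6.2],
]
toy_groups = group_time_points(toy_profiles, gap_threshold=1.0)
```

A fixed threshold is the simplest possible choice here; with real data the cutoff would have to be justified against replicate-to-replicate variation.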
Debey-Pascher, Svenja; Hofmann, Andrea; Kreusch, Fatima; Schuler, Gerold; Schuler-Thurner, Beatrice; Schultze, Joachim L.; Staratschek-Jox, Andrea
2011-01-01
Microarray-based transcriptome analysis of peripheral blood as surrogate tissue has become an important approach in clinical implementations. However, application of gene expression profiling in routine clinical settings requires careful consideration of the influence of sample handling and RNA isolation methods on gene expression profile outcome. We evaluated the effect of different sample preservation strategies (eg, cryopreservation of peripheral blood mononuclear cells or freezing of PAXgene-stabilized whole blood samples) on gene expression profiles. Expression profiles obtained from cryopreserved peripheral blood mononuclear cells differed substantially from those of their nonfrozen counterpart samples. Furthermore, expression profiles in cryopreserved peripheral blood mononuclear cell samples were found to undergo significant alterations with increasing storage period, whereas long-term freezing of PAXgene RNA stabilized whole blood samples did not significantly affect stability of gene expression profiles. This report describes important technical aspects contributing toward the establishment of robust and reliable guidance for gene expression studies using peripheral blood and provides a promising strategy for reliable implementation in routine handling for diagnostic purposes. PMID:21704280
Tang, Qin; Iyer, Sowmya; Lobbardi, Riadh; Moore, John C; Chen, Huidong; Lareau, Caleb; Hebert, Christine; Shaw, McKenzie L; Neftel, Cyril; Suva, Mario L; Ceol, Craig J; Bernards, Andre; Aryee, Martin; Pinello, Luca; Drummond, Iain A; Langenau, David M
2017-10-02
Recent advances in single-cell, transcriptomic profiling have provided unprecedented access to investigate cell heterogeneity during tissue and organ development. In this study, we used massively parallel, single-cell RNA sequencing to define cell heterogeneity within the zebrafish kidney marrow, constructing a comprehensive molecular atlas of definitive hematopoiesis and functionally distinct renal cells found in adult zebrafish. Because our method analyzed blood and kidney cells in an unbiased manner, our approach was useful in characterizing immune-cell deficiencies within DNA-protein kinase catalytic subunit ( prkdc ), interleukin-2 receptor γ a ( il2rga ), and double-homozygous-mutant fish, identifying blood cell losses in T, B, and natural killer cells within specific genetic mutants. Our analysis also uncovered novel cell types, including two classes of natural killer immune cells, classically defined and erythroid-primed hematopoietic stem and progenitor cells, mucin-secreting kidney cells, and kidney stem/progenitor cells. In total, our work provides the first, comprehensive, single-cell, transcriptomic analysis of kidney and marrow cells in the adult zebrafish. © 2017 Tang et al.
Iyer, Sowmya; Lobbardi, Riadh; Chen, Huidong; Hebert, Christine; Shaw, McKenzie L.; Neftel, Cyril; Suva, Mario L.; Bernards, Andre; Aryee, Martin; Drummond, Iain A.
2017-01-01
Recent advances in single-cell, transcriptomic profiling have provided unprecedented access to investigate cell heterogeneity during tissue and organ development. In this study, we used massively parallel, single-cell RNA sequencing to define cell heterogeneity within the zebrafish kidney marrow, constructing a comprehensive molecular atlas of definitive hematopoiesis and functionally distinct renal cells found in adult zebrafish. Because our method analyzed blood and kidney cells in an unbiased manner, our approach was useful in characterizing immune-cell deficiencies within DNA–protein kinase catalytic subunit (prkdc), interleukin-2 receptor γ a (il2rga), and double-homozygous–mutant fish, identifying blood cell losses in T, B, and natural killer cells within specific genetic mutants. Our analysis also uncovered novel cell types, including two classes of natural killer immune cells, classically defined and erythroid-primed hematopoietic stem and progenitor cells, mucin-secreting kidney cells, and kidney stem/progenitor cells. In total, our work provides the first, comprehensive, single-cell, transcriptomic analysis of kidney and marrow cells in the adult zebrafish. PMID:28878000
Kunnath-Velayudhan, Shajo; Porcelli, Steven A
2018-05-01
Intracellular cytokine staining (ICS) is a powerful method for identifying functionally distinct lymphocyte subsets, and for isolating these by fluorescence activated cell sorting (FACS). Although transcriptomic analysis of cells sorted on the basis of ICS has many potential applications, this is rarely performed because of the difficulty in isolating intact RNA from cells processed using standard fixation and permeabilization buffers for ICS. To address this issue, we compared three buffers shown previously to preserve RNA in nonhematopoietic cells subjected to intracellular staining for their effects on RNA isolated from T lymphocytes processed for ICS. Our results showed that buffers containing the recombinant ribonuclease inhibitor RNasin or high molar concentrations of salt yielded intact RNA from fixed and permeabilized T cells. As proof of principle, we successfully used the buffer containing RNasin to isolate intact RNA from CD4 + T cells that were sorted by FACS on the basis of specific cytokine production, thus demonstrating the potential of this approach for coupling ICS with transcriptomic analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
Li, Qinghong; Freeman, Lisa M; Rush, John E; Huggins, Gordon S; Kennedy, Adam D; Labuda, Jeffrey A; Laflamme, Dorothy P; Hannah, Steven S
2015-08-01
Canine degenerative mitral valve disease (DMVD) is the most common form of heart disease in dogs. The objective of this study was to identify cellular and metabolic pathways that play a role in DMVD by performing metabolomics and transcriptomics analyses on serum and tissue (mitral valve and left ventricle) samples previously collected from dogs with DMVD or healthy hearts. Gas or liquid chromatography followed by mass spectrophotometry were used to identify metabolites in serum. Transcriptomics analysis of tissue samples was completed using RNA-seq, and selected targets were confirmed by RT-qPCR. Random Forest analysis was used to classify the metabolites that best predicted the presence of DMVD. Results identified 41 known and 13 unknown serum metabolites that were significantly different between healthy and DMVD dogs, representing alterations in fat and glucose energy metabolism, oxidative stress, and other pathways. The three metabolites with the greatest single effect in the Random Forest analysis were γ-glutamylmethionine, oxidized glutathione, and asymmetric dimethylarginine. Transcriptomics analysis identified 812 differentially expressed transcripts in left ventricle samples and 263 in mitral valve samples, representing changes in energy metabolism, antioxidant function, nitric oxide signaling, and extracellular matrix homeostasis pathways. Many of the identified alterations may benefit from nutritional or medical management. Our study provides evidence of the growing importance of integrative approaches in multi-omics research in veterinary and nutritional sciences.
Chiara, Matteo; Horner, David S; Spada, Alberto
2013-01-01
De novo transcriptome characterization from Next Generation Sequencing data has become an important approach in the study of non-model plants. Despite notable advances in the assembly of short reads, the clustering of transcripts into unigene-like (locus-specific) clusters remains a somewhat neglected subject. Indeed, closely related paralogous transcripts are often merged into single clusters by current approaches. Here, a novel heuristic method for locus-specific clustering is compared to that implemented in the de novo assembler Oases, using the same initial transcript collections, derived from Arabidopsis thaliana and the developmental model Streptocarpus rexii. We show that the proposed approach improves cluster specificity in the A. thaliana dataset for which the reference genome is available. Furthermore, for the S. rexii data our filtered transcript collection matches a larger number of distinct annotated loci in reference genomes than the Oases set, while containing a reduced overall number of loci. A detailed discussion of advantages and limitations of our approach in processing de novo transcriptome reconstructions is presented. The proposed method should be widely applicable to other organisms, irrespective of the transcript assembly method employed. The S. rexii transcriptome is available as a sophisticated and augmented publicly available online database.
Tao, Xiang; Lai, Xian-Jun; Zhang, Yi-Zheng; Tan, Xue-Mei; Wang, Haiyan
2014-01-01
Background Transposable elements (TEs) are the most abundant genomic components in eukaryotes and affect the genome by their replications and movements to generate genetic plasticity. Sweet potato performs asexual reproduction generally and the TEs may be an important genetic factor for genome reorganization. Complete identification of TEs is essential for the study of genome evolution. However, the TEs of sweet potato are still poorly understood because of its complex hexaploid genome and difficulty in genome sequencing. The recent availability of the sweet potato transcriptome databases provides an opportunity for discovering and characterizing the expressed TEs. Methodology/Principal Findings We first established the integrated-transcriptome database by de novo assembling four published sweet potato transcriptome databases from three cultivars in China. Using sequence-similarity search and analysis, a total of 1,405 TEs including 883 retrotransposons and 522 DNA transposons were predicted and categorized. Depending on mapping sets of RNA-Seq raw short reads to the predicted TEs, we compared the quantities, classifications and expression activities of TEs inter- and intra-cultivars. Moreover, the differential expressions of TEs in seven tissues of Xushu 18 cultivar were analyzed by using Illumina digital gene expression (DGE) tag profiling. It was found that 417 TEs were expressed in one or more tissues and 107 in all seven tissues. Furthermore, the copy number of 11 transposase genes was determined to be 1–3 copies in the genome of sweet potato by Real-time PCR-based absolute quantification. Conclusions/Significance Our result provides a new method for TE searching on species with transcriptome sequences while lacking genome information. The searching, identification and expression analysis of TEs will provide useful TE information in sweet potato, which are valuable for the further studies of TE-mediated gene mutation and optimization in asexual reproduction. 
It contributes to elucidating the roles of TEs in genome evolution. PMID:24608103
Comparative de novo transcriptome analysis of male and female Sea buckthorn.
Bansal, Ankush; Salaria, Mehul; Sharma, Tashil; Stobdan, Tsering; Kant, Anil
2018-02-01
Sea buckthorn is a dioecious medicinal plant found at high altitude. The plant has both male and female reproductive organs in separate individuals. In this article, whole transcriptome de novo assemblies of male and female flower bud samples were carried out using Illumina NextSeq 500 platform to determine the role of the genes involved in sex determination. Moreover, genes with differential expression in male and female transcriptomes were identified to understand the underlying sex determination mechanism. The current study showed 63,904 and 62,272 coding sequences (CDS) in female and male transcriptome data sets, respectively. 16,831 common CDS were screened out from both transcriptomes, out of which 625 were upregulated and 491 were found to be downregulated. To understand the potential regulatory roles of differentially expressed genes in metabolic networks and biosynthetic pathways: KEGG mapping, gene ontology, and co-expression network analysis were performed. Comparison with Flowering Interactive Database (FLOR-ID) resulted in eight differentially expressed genes viz. CHD3-type chromatin-remodeling factor PICKLE ( PKL ), phytochrome-associated serine/threonine-protein phosphatase ( FYPP ), protein TOPLESS ( TPL ), sensitive to freezing 6 ( SFR6 ), lysine-specific histone demethylase 1 homolog 1 ( LDL1 ), pre-mRNA-processing-splicing factor 8A ( PRP8A ), sucrose synthase 4 ( SUS4 ), ubiquitin carboxyl-terminal hydrolase 12 ( UBP12 ), known to be broadly involved in flowering, photoperiodism, embryo development, and cold response pathways. Male and female flower bud transcriptome data of Sea buckthorn may provide comprehensive information at genomic level for the identification of genetic regulation involved in sex determination.
Oral Neutrophil Transcriptome Changes Result in a Pro-Survival Phenotype in Periodontal Diseases
Lakschevitz, Flavia S.; Aboodi, Guy M.; Glogauer, Michael
2013-01-01
Background Periodontal diseases are inflammatory processes that occur following the influx of neutrophils into the periodontal tissues in response to the subgingival bacterial biofilm. Current literature suggests that while neutrophils are protective and prevent bacterial infections, they also appear to contribute to damage of the periodontal tissues. In the present study we compare the gene expression profile changes in neutrophils as they migrate from the circulation into the oral tissues in patients with chronic periodontitis and matched healthy subjects. We hypothesized that oral neutrophils in periodontal disease patients will display a disease specific transcriptome that differs from the oral neutrophils of healthy subjects. Methods Venous blood and oral rinse samples were obtained from healthy subjects and chronic periodontitis patients for neutrophil isolation. mRNA was isolated from the neutrophils, and gene expression microarray analysis was completed. Results were confirmed for specific genes of interest by qRT-PCR and Western Blot analysis. Results and Discussion Chronic periodontitis patients presented with increased recruitment of neutrophils to the oral cavity. Gene expression analysis revealed differences in the expression levels of genes from several biological pathways. Using hierarchical clustering analysis, we found that the apoptosis network was significantly altered in patients with chronic inflammation in the oral cavity, with up-regulation of pro-survival members of the Bcl-2 family and down-regulation of pro-apoptosis members in the same compartment. Additional functional analysis confirmed that the percentages of viable neutrophils are significantly increased in the oral cavity of chronic periodontitis patients. Conclusions Oral neutrophils from patients with periodontal disease displayed an altered transcriptome following migration into the oral tissues. 
This resulted in a pro-survival neutrophil phenotype in chronic periodontitis patients when compared with healthy subjects, resulting in a longer-lived neutrophil. This is likely to impact the severity and length of the inflammatory response in this oral disease. PMID:23874838
USDA-ARS's Scientific Manuscript database
The soybean transcriptome displays strong variation along the day in optimal growth conditions and also in response to adverse circumstances, like drought stress. However, no study conducted to date has presented suitable reference genes, with stable expression along the day, for relative gene expre...
USDA-ARS's Scientific Manuscript database
The whitefly (Bemisia tabaci) causes tremendous damage to cotton production worldwide. However, very limited information is available about how plants perceive and defend themselves from this destructive pest. In this study, the transcriptomics differences between two cotton cultivars that exhibit e...
USDA-ARS's Scientific Manuscript database
The woody resurrection plant Myrothamnus flabellifolia has remarkable tolerance to desiccation. Pyro-sequencing technology permitted us to analyze the transcriptome of M. flabellifolia during both dehydration and rehydration. We identified a total of 8287 and 8542 differentially transcribed genes du...
Amber J. Vanden Wymelenberg; Jill Gaskell; Michael Mozuch; Grzegorz Sabat; John Ralph; Oleksandr Skyba; Shawn D Mansfield; Robert A. Blanchette; Diego Martinez; Igor Grigoriev; Philip J Kersten; Daniel Cullen
2010-01-01
Cellulose degradation by brown rot fungi, such as Postia placenta, is poorly understood relative to the phylogenetically related white rot basidiomycete, Phanerochaete chrysosporium. To elucidate the number, structure, and regulation of genes involved in lignocellulosic cell wall attack, secretome and transcriptome analyses were performed on both wood decay fungi...
USDA-ARS's Scientific Manuscript database
While many studies have characterized the transcriptome of plants attacked by herbivorous insect pests, few have undertaken an examination of the genes affected by root pests. We have subjected maize seedlings to infestation by southern corn rootworm (SCR) Diabrotica undecimpunctata howardi and usin...
USDA-ARS's Scientific Manuscript database
Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. "K...
USDA-ARS's Scientific Manuscript database
An essential step to understanding the genomic biology of any organism is to comprehensively survey its transcriptome. We present the Bovine Gene Atlas (BGA) a compendium of over 7.2 million unique 20 base Illumina DGE tags representing 100 tissue transcriptomes collected primarily from L1 Dominette...
Tylee, Daniel S; Hess, Jonathan L; Quinn, Thomas P; Barve, Rahul; Huang, Hailiang; Zhang-James, Yanli; Chang, Jeffrey; Stamova, Boryana S; Sharp, Frank R; Hertz-Picciotto, Irva; Faraone, Stephen V; Kong, Sek Won; Glatt, Stephen J
2017-04-01
Blood-based microarray studies comparing individuals affected with autism spectrum disorder (ASD) and typically developing individuals help characterize differences in circulating immune cell functions and offer potential biomarker signal. We sought to combine the subject-level data from previously published studies by mega-analysis to increase the statistical power. We identified studies that compared ex vivo blood or lymphocytes from ASD-affected individuals and unrelated comparison subjects using Affymetrix or Illumina array platforms. Raw microarray data and clinical meta-data were obtained from seven studies, totaling 626 affected and 447 comparison subjects. Microarray data were processed using uniform methods. Covariate-controlled mixed-effect linear models were used to identify gene transcripts and co-expression network modules that were significantly associated with diagnostic status. Permutation-based gene-set analysis was used to identify functionally related sets of genes that were over- and under-expressed among ASD samples. Our results were consistent with diminished interferon-, EGF-, PDGF-, PI3K-AKT-mTOR-, and RAS-MAPK-signaling cascades, and increased ribosomal translation and NK-cell related activity in ASD. We explored evidence for sex-differences in the ASD-related transcriptomic signature. We also demonstrated that machine-learning classifiers using blood transcriptome data perform with moderate accuracy when data are combined across studies. Comparing our results with those from blood-based studies of protein biomarkers (e.g., cytokines and trophic factors), we propose that ASD may feature decoupling between certain circulating signaling proteins (higher in ASD samples) and the transcriptional cascades which they typically elicit within circulating immune cells (lower in ASD samples). These findings provide insight into ASD-related transcriptional differences in circulating immune cells. © 2016 Wiley Periodicals, Inc. 
Defining the Human Macula Transcriptome and Candidate Retinal Disease Genes Using EyeSAGE
Rickman, Catherine Bowes; Ebright, Jessica N.; Zavodni, Zachary J.; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P.; Wistow, Graeme; Boon, Kathy; Hauser, Michael A.
2009-01-01
Purpose To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Methods Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Results Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. Conclusions The EyeSAGE database, combining three different gene-profiling platforms including the authors’ multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions. PMID:16723438
The Whole-Genome and Transcriptome of the Manila Clam (Ruditapes philippinarum).
Mun, Seyoung; Kim, Yun-Ji; Markkandan, Kesavan; Shin, Wonseok; Oh, Sumin; Woo, Jiyoung; Yoo, Jongsu; An, Hyesuck; Han, Kyudong
2017-06-01
The manila clam, Ruditapes philippinarum, is an important bivalve species in worldwide aquaculture including Korea. The aquaculture production of R. philippinarum is under threat from diverse environmental factors including viruses, microorganisms, parasites, and water conditions with subsequently declining production. In spite of its importance as a marine resource, the reference genome of R. philippinarum for comprehensive genetic studies is largely unexplored. Here, we report the de novo whole-genome and transcriptome assembly of R. philippinarum across three different tissues (foot, gill, and adductor muscle), and provide the basic data for advanced studies in selective breeding and disease control in order to obtain successful aquaculture systems. An approximately 2.56 Gb high quality whole-genome was assembled with various library construction methods. A total of 108,034 protein coding gene models were predicted and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further the understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with other bivalve marine invertebrates revealed that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis with three different tissues in order to support genome annotation and then identified 41,275 transcripts which were annotated. The R. philippinarum genome resource will markedly advance a wide range of potential genetic studies, serve as a reference genome for comparative analysis of bivalve species, and help unravel mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Mykles, Donald L; Burnett, Karen G; Durica, David S; Joyce, Blake L; McCarthy, Fiona M; Schmidt, Carl J; Stillman, Jonathon H
2016-12-01
High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the "Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology" symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Tylee, Daniel S.; Hess, Jonathan L.; Quinn, Thomas P.; Barve, Rahul; Huang, Hailiang; Zhang-James, Yanli; Chang, Jeffrey; Stamova, Boryana S.; Sharp, Frank R.; Hertz-Picciotto, Irva; Faraone, Stephen V.; Kong, Sek Won; Glatt, Stephen J.
2017-01-01
Blood-based microarray studies comparing individuals affected with autism spectrum disorder (ASD) and typically developing individuals help characterize differences in circulating immune cell functions and offer potential biomarker signal. We sought to combine the subject-level data from previously published studies by mega-analysis to increase the statistical power. We identified studies that compared ex-vivo blood or lymphocytes from ASD-affected individuals and unrelated comparison subjects using Affymetrix or Illumina array platforms. Raw microarray data and clinical meta-data were obtained from seven studies, totaling 626 affected and 447 comparison subjects. Microarray data were processed using uniform methods. Covariate-controlled mixed-effect linear models were used to identify gene transcripts and co-expression network modules that were significantly associated with diagnostic status. Permutation-based gene-set analysis was used to identify functionally related sets of genes that were over- and under-expressed among ASD samples. Our results were consistent with diminished interferon-, EGF-, PDGF-, PI3K-AKT-mTOR-, and RAS-MAPK-signaling cascades, and increased ribosomal translation and NK-cell related activity in ASD. We explored evidence for sex-differences in the ASD-related transcriptomic signature. We also demonstrated that machine-learning classifiers using blood transcriptome data perform with moderate accuracy when data are combined across studies. Comparing our results with those from blood-based studies of protein biomarkers (e.g., cytokines and trophic factors), we propose that ASD may feature decoupling between certain circulating signaling proteins (higher in ASD samples) and the transcriptional cascades which they typically elicit within circulating immune cells (lower in ASD samples). These findings provide insight into ASD-related transcriptional differences in circulating immune cells. PMID:27862943
Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun
2013-01-01
The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets generated by RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell and tissue types, respectively. To characterize the gene expression pattern in terms of expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes that are specific to the cancer group and not found in the normal group) are expressed at higher levels and more variably in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, are enriched in functions related to cell cycle regulation, and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand how patterns of gene expression differ among cell types, particularly in cancer. PMID:23382867
Wang, Wenzhao; Zhou, Yihui; Wu, Yingling; Dai, Xinlong; Liu, Yajun; Qian, Yumei; Li, Mingzhuo; Jiang, Xiaolan; Wang, Yunsheng; Gao, Liping; Xia, Tao
2018-04-25
Tea is an important economic crop with a 3.02 Gb genome. It accumulates various bioactive compounds, especially catechins, which are closely associated with tea flavor and quality. Catechins are biosynthesized through the phenylpropanoid and flavonoid pathways, with 12 structural genes being involved in their synthesis. However, we found that in Camellia sinensis the understanding of the basic profile of catechins biosynthesis is still unclear. The gene structure, locus, transcript number, transcriptional variation, and function of multigene families have not yet been clarified. Our previous studies demonstrated that the accumulation of flavonoids in tea is species, tissue, and induction specific, which indicates that gene coexpression patterns may be involved in tea catechins and flavonoids biosynthesis. In this paper, we screened candidate genes of multigene families involved in the phenylpropanoid and flavonoid pathways based on an analysis of genome and transcriptome sequence data. The authenticity of candidate genes was verified by PCR cloning, and their function was validated by reverse genetic methods. In the present study, 36 genes from 12 gene families were identified and were accessed in the NCBI database. During this process, some intron retention events of the CsCHI and CsDFR genes were found. Furthermore, the transcriptome sequencing of various tea tissues and subcellular location assays revealed coexpression and colocalization patterns. The correlation analysis showed that CsCHIc, CsF3'H, and CsANRb expression levels are associated significantly with the concentration of soluble PA as well as the expression levels of CsPALc and CsPALf with the concentration of insoluble PA. This work provides insights into catechins metabolism in tea and provides a foundation for future studies.
Hu, Yongli; Hase, Takeshi; Li, Hui Peng; Prabhakar, Shyam; Kitano, Hiroaki; Ng, See Kiong; Ghosh, Samik; Wee, Lawrence Jin Kiat
2016-12-22
The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies presents a shift in the scientific paradigm, in which scientists are now able to concurrently investigate the complex biology of a heterogeneous population of cells, one at a time. However, to date, there has not been a suitable computational methodology for the analysis of such an intricate deluge of data, in particular techniques that aid the identification of the unique differences in transcriptomic profiles between cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, to best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. Also, these genes possessed a higher discriminative power (enhanced prediction accuracy) as compared with commonly used statistical techniques or geneset-based approaches. Further downstream network reconstruction analysis was carried out to unravel hidden general regulatory networks in which novel interactions could be further validated in wet-lab experimentation and be useful candidates to be targeted for the treatment of neuronal developmental diseases. The novel approach reported here is able to identify transcripts, with reported neuronal involvement, which optimally differentiate neocortical cells and neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from studies of cancer progression and treatment within a highly heterogeneous tumour.
Xue, Shuxia; Liu, Yichen; Zhang, Yichen; Sun, Yan; Geng, Xuyun; Sun, Jinsheng
2013-01-01
White spot syndrome virus (WSSV) is a causative pathogen found in most shrimp farming areas of the world and causes large economic losses to shrimp aquaculture. The mechanism underlying the molecular pathogenesis of the highly virulent WSSV remains unknown. To better understand the virus-host interactions at the molecular level, the transcriptome profiles in hemocytes of unchallenged and WSSV-challenged shrimp (Litopenaeus vannamei) were compared using a short-read deep sequencing method (Illumina). RNA-seq analysis generated more than 25.81 million clean paired-end (PE) reads, which were assembled into 52,073 unigenes (mean size = 520 bp). Based on sequence similarity searches, 23,568 (45.3%) genes were identified, among which 6,562 and 7,822 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. Searches in the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) mapped 14,941 (63.4%) unigenes to 240 KEGG pathways. Among all the annotated unigenes, 1,179 were associated with immune-related genes. Digital gene expression (DGE) analysis revealed that the host transcriptome profile was slightly changed in the early infection (5 hours post injection, hpi) of the virus, while large transcriptional differences were identified in the late infection (48 hpi) of WSSV. The differentially expressed genes were mainly involved in pattern recognition and some immune response factors. The results indicated that antiviral immune mechanisms were probably involved in the recognition of pathogen-associated molecular patterns. This study provided a global survey of host gene activities against virus infection in a non-model organism, the Pacific white shrimp. Results can contribute to the in-depth study of candidate genes in white shrimp, and help to improve the current understanding of host-pathogen interactions.
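The two annotation percentages in the abstract above appear to use different denominators: 45.3% is relative to all assembled unigenes, whereas 63.4% only works out if it is taken relative to the annotated unigenes. The short arithmetic check below uses only counts stated in the abstract; the variable and function names are illustrative and not from the original study.

```python
# Counts copied from the abstract; names are illustrative only.
total_unigenes = 52_073   # assembled unigenes
annotated = 23_568        # unigenes with a sequence-similarity hit
kegg_mapped = 14_941      # unigenes mapped to KEGG pathways


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all unigenes that were annotated.
annotated_pct = percent(annotated, total_unigenes)
# Share of annotated unigenes that mapped to KEGG pathways.
kegg_pct_of_annotated = percent(kegg_mapped, annotated)
# Share of all unigenes that mapped to KEGG pathways (not quoted in the abstract).
kegg_pct_of_total = percent(kegg_mapped, total_unigenes)

print(annotated_pct, kegg_pct_of_annotated, kegg_pct_of_total)
```

Read this way, roughly 29% of all assembled unigenes received a KEGG assignment, a figure the abstract does not state directly.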
Guo, Qianqian; Ma, Xiaojun; Wei, Shugen; Qiu, Deyou; Wilson, Iain W; Wu, Peng; Tang, Qi; Liu, Lijun; Dong, Shoukun; Zu, Wei
2014-08-12
The major medicinal alkaloids isolated from Uncaria rhynchophylla (gouteng in Chinese) capsules are rhynchophylline (RIN) and isorhynchophylline (IRN). Extracts containing these terpene indole alkaloids (TIAs) can inhibit the formation of, and destabilize preformed, fibrils of amyloid β protein (a pathological marker of Alzheimer's disease), and have been shown to improve the cognitive function of mice with Alzheimer-like symptoms. The biosynthetic pathways of RIN and IRN are largely unknown. In this study, RNA-sequencing was performed on pooled Uncaria capsule RNA samples taken at three developmental stages that accumulate different amounts of RIN and IRN. More than 50 million high-quality reads from a cDNA library were generated and de novo assembled. Sequences for all of the known enzymes involved in TIA synthesis were identified. Additionally, 193 cytochrome P450 (CYP450), 280 methyltransferase and 144 isomerase genes were identified that are potential candidates for enzymes involved in RIN and IRN synthesis. Digital gene expression profile (DGE) analysis was performed on the three capsule developmental stages and, based on genes possessing expression profiles consistent with RIN and IRN levels, four CYP450s, three methyltransferases and three isomerases were identified as the candidates most likely to be involved in the later steps of RIN and IRN biosynthesis. A combination of de novo transcriptome assembly and DGE analysis was shown to be a powerful method for identifying genes encoding enzymes potentially involved in the biosynthesis of important secondary metabolites in a non-model plant. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the capsule extract from Uncaria, and provide information that may aid in metabolic engineering to increase yields of these important alkaloids.
Cavill, Rachel; Kamburov, Atanas; Ellis, James K; Athersuch, Toby J; Blagrove, Marcus S C; Herwig, Ralf; Ebbels, Timothy M D; Keun, Hector C
2011-03-01
Using transcriptomic and metabolomic measurements from the NCI60 cell line panel, together with a novel approach to integration of molecular profile data, we show that the biochemical pathways associated with tumour cell chemosensitivity to platinum-based drugs are highly coincident, i.e. they describe a consensus phenotype. Direct integration of metabolome and transcriptome data at the point of pathway analysis improved the detection of consensus pathways by 76%, and revealed associations between platinum sensitivity and several metabolic pathways that were not visible from transcriptome analysis alone. These pathways included the TCA cycle and pyruvate metabolism, lipoprotein uptake and nucleotide synthesis by both salvage and de novo pathways. Extending the approach across a wide panel of chemotherapeutics, we confirmed the specificity of the metabolic pathway associations to platinum sensitivity. We conclude that metabolic phenotyping could play a role in predicting response to platinum chemotherapy and that consensus-phenotype integration of molecular profiling data is a powerful and versatile tool for both biomarker discovery and for exploring the complex relationships between biological pathways and drug response.
CBrowse: a SAM/BAM-based contig browser for transcriptome assembly visualization and analysis.
Li, Pei; Ji, Guoli; Dong, Min; Schmidt, Emily; Lenox, Douglas; Chen, Liangliang; Liu, Qi; Liu, Lin; Zhang, Jie; Liang, Chun
2012-09-15
To address the impending need for exploring the rapidly increasing transcriptomics data generated for non-model organisms, we developed CBrowse, an AJAX-based web browser for visualizing and analyzing transcriptome assemblies and contigs. Designed in a standard three-tier architecture with a data pre-processing pipeline, CBrowse is essentially a Rich Internet Application that offers many seamlessly integrated web interfaces and allows users to navigate, sort, filter, search and visualize data smoothly. The pre-processing pipeline takes the contig sequence file in FASTA format and its relevant SAM/BAM file as the input; detects putative polymorphisms, simple sequence repeats and sequencing errors in contigs; and generates image, JSON and database-compatible CSV text files that are directly utilized by different web interfaces. CBrowse is a generic visualization and analysis tool that facilitates close examination of assembly quality, genetic polymorphisms, sequence repeats and/or sequencing errors in transcriptome sequencing projects. CBrowse is distributed under the GNU General Public License, available at http://bioinfolab.muohio.edu/CBrowse/. Contact: liangc@muohio.edu or liangc.mu@gmail.com; glji@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
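One step of the pre-processing pipeline described above, detecting simple sequence repeats in FASTA contigs, can be sketched in a few lines. This is a minimal illustration only and not CBrowse's own code: the abstract does not say how repeats are detected, so the regular expression, the motif length range (2 to 6 bases), the `MIN_REPEATS` threshold and the function names `read_fasta` and `find_ssrs` are all assumptions. The SAM/BAM input is not used here.

```python
import re


def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


# Assumed thresholds: a motif of 2-6 bases repeated in tandem at least 4 times.
MIN_REPEATS = 4
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each tandem repeat in seq."""
    hits = []
    for m in SSR_PATTERN.finditer(seq):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Made-up contig containing an (AT)5 run and an (AGC)4 run.
example = ">contig1 demo\nGGCATATATATATCCG\nTTAGCAGCAGCAGCAGT\n"
contigs = read_fasta(example)
ssrs = find_ssrs(contigs["contig1"])
print(ssrs)
```

Start positions are 0-based offsets into the joined contig sequence. A production pipeline would typically also treat mononucleotide runs and apply motif-specific repeat thresholds.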
Mulindwa, Julius; Leiss, Kevin; Ibberson, David; Kamanyi Marucha, Kevin; Helbig, Claudia; Melo do Nascimento, Larissa; Silvester, Eleanor; Matthews, Keith; Matovu, Enock; Enyaru, John
2018-01-01
All of our current knowledge of African trypanosome metabolism is based on results from trypanosomes grown in culture or in rodents. Drugs against sleeping sickness must however treat trypanosomes in humans. We here compare the transcriptomes of Trypanosoma brucei rhodesiense from the blood and cerebrospinal fluid of human patients with those of trypanosomes from culture and rodents. The data were aligned and analysed using new user-friendly applications designed for Kinetoplastid RNA-Seq data. The transcriptomes of trypanosomes from human blood and cerebrospinal fluid did not predict major metabolic differences that might affect drug susceptibility. Usefully, there were relatively few differences between the transcriptomes of trypanosomes from patients and those of similar trypanosomes grown in rats. Transcriptomes of monomorphic laboratory-adapted parasites grown in in vitro culture closely resembled those of the human parasites, but some differences were seen. In poly(A)-selected mRNA transcriptomes, mRNAs encoding some protein kinases and RNA-binding proteins were under-represented relative to mRNA that had not been poly(A) selected; further investigation revealed that the selection tends to result in loss of longer mRNAs. PMID:29474390
Mulindwa, Julius; Leiss, Kevin; Ibberson, David; Kamanyi Marucha, Kevin; Helbig, Claudia; Melo do Nascimento, Larissa; Silvester, Eleanor; Matthews, Keith; Matovu, Enock; Enyaru, John; Clayton, Christine
2018-02-01
All of our current knowledge of African trypanosome metabolism is based on results from trypanosomes grown in culture or in rodents. Drugs against sleeping sickness must however treat trypanosomes in humans. We here compare the transcriptomes of Trypanosoma brucei rhodesiense from the blood and cerebrospinal fluid of human patients with those of trypanosomes from culture and rodents. The data were aligned and analysed using new user-friendly applications designed for Kinetoplastid RNA-Seq data. The transcriptomes of trypanosomes from human blood and cerebrospinal fluid did not predict major metabolic differences that might affect drug susceptibility. Usefully, there were relatively few differences between the transcriptomes of trypanosomes from patients and those of similar trypanosomes grown in rats. Transcriptomes of monomorphic laboratory-adapted parasites grown in in vitro culture closely resembled those of the human parasites, but some differences were seen. In poly(A)-selected mRNA transcriptomes, mRNAs encoding some protein kinases and RNA-binding proteins were under-represented relative to mRNA that had not been poly(A) selected; further investigation revealed that the selection tends to result in loss of longer mRNAs.
Li, Yiping; Li, Yanhong; Bai, Zhenjiang; Pan, Jian; Wang, Jian; Fang, Fang
2017-12-13
Sepsis represents a complex disease with the dysregulated inflammatory response and high mortality rate. The goal of this study was to identify potential transcriptomic markers in developing pediatric sepsis by a co-expression module analysis of the transcriptomic dataset. Using the R software and Bioconductor packages, we performed a weighted gene co-expression network analysis to identify co-expression modules significantly associated with pediatric sepsis. Functional interpretation (gene ontology and pathway analysis) and enrichment analysis with known transcription factors and microRNAs of the identified candidate modules were then performed. In modules significantly associated with sepsis, the intramodular analysis was further performed and "hub genes" were identified and validated by quantitative real-time PCR (qPCR) in this study. 15 co-expression modules in total were detected, and four modules ("midnight blue", "cyan", "brown", and "tan") were most significantly associated with pediatric sepsis and suggested as potential sepsis-associated modules. Gene ontology analysis and pathway analysis revealed that these four modules strongly associated with immune response. Three of the four sepsis-associated modules were also enriched with known transcription factors (false discovery rate-adjusted P < 0.05). Hub genes were identified in each of the four modules. Four of the identified hub genes (MYB proto-oncogene like 1, killer cell lectin like receptor G1, stomatin, and membrane spanning 4-domains A4A) were further validated to be differentially expressed between septic children and controls by qPCR. Four pediatric sepsis-associated co-expression modules were identified in this study. qPCR results suggest that hub genes in these modules are potential transcriptomic markers for pediatric sepsis diagnosis. These results provide novel insights into the pathogenesis of pediatric sepsis and promote the generation of diagnostic gene sets.
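The core idea behind weighted gene co-expression network analysis, as used in the abstract above, is to raise pairwise correlations to a power so that strong co-expression dominates, and then to rank genes by their summed connection strength when looking for hub genes. The sketch below illustrates only that idea in plain Python. It is not the WGCNA R package used by the authors, the power `beta=6` and the toy expression values are assumptions, and the topological overlap and clustering steps that define modules are left out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(expr)
    return {
        g: {h: abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g}
        for g in genes
    }


def connectivity(adj):
    """Sum of adjacencies per gene; high values suggest candidate hub genes."""
    return {g: sum(row.values()) for g, row in adj.items()}


# Toy expression values (made up): one list per gene across five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [1.2, 2.1, 2.9, 4.2, 4.8],
    "geneD": [5.0, 1.0, 4.0, 2.0, 3.0],
}
adj = adjacency(expr)
k = connectivity(adj)
hub = max(k, key=k.get)
print(hub, k)
```

In this toy data, geneD is only weakly correlated with the other three genes, so the power transform drives its connectivity close to zero while the tightly co-expressed genes retain high connectivity.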
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, are of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil mostly differed in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights into the functional role of anthocyanins in photoprotection.
2013-01-01
Background Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at the genetic level about these active metabolites. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive dataset of I. indigotica. Results A database of 36,367 unigenes (average length = 1,115.67 bases) was generated by performing transcriptome sequencing. Based on the gene annotation of the transcriptome, 104 unigenes were identified covering most of the catalytic steps in the general biosynthetic pathways of indole, terpenoid, and phenylpropanoid. Subsequently, the organ-specific expression patterns of the genes involved in these pathways, and their responses to methyl jasmonate (MeJA) induction, were investigated. Metabolite profiling of effective phenylpropanoids showed that the accumulation patterns of secondary metabolites were mostly correlated with the transcription of their biosynthetic genes. According to the analysis of the UDP-dependent glycosyltransferase (UGT) family, several flavonoids were indicated to exist in I. indigotica and were further identified by metabolic profiling using UPLC/Q-TOF. Moreover, applying transcriptome co-expression analysis, nine new, putative UGTs were suggested as flavonol glycosyltransferases and lignan glycosyltransferases. Conclusions This database provides a pool of candidate genes involved in biosynthesis of effective metabolites in I. indigotica. Furthermore, the comprehensive analysis and characterization of the significant pathways are expected to give a better insight regarding the diversity of chemical composition, synthetic characteristics, and the regulatory mechanisms which operate in this medicinal herb. PMID:24308360
Langley, Raymond J; Tipper, Jennifer L; Bruse, Shannon; Baron, Rebecca M; Tsalik, Ephraim L; Huntley, James; Rogers, Angela J; Jaramillo, Richard J; O'Donnell, Denise; Mega, William M; Keaton, Mignon; Kensicki, Elizabeth; Gazourian, Lee; Fredenburgh, Laura E; Massaro, Anthony F; Otero, Ronny M; Fowler, Vance G; Rivers, Emanuel P; Woods, Chris W; Kingsmore, Stephen F; Sopori, Mohan L; Perrella, Mark A; Choi, Augustine M K; Harrod, Kevin S
2014-08-15
Sepsis is a leading cause of morbidity and mortality. Currently, early diagnosis of the disease and assessment of its progression are difficult. The integration of metabolomic and transcriptomic data in a primate model of sepsis may provide a novel molecular signature of clinical sepsis. The objective was to develop a biomarker panel to characterize sepsis in primates and ascertain its relevance to early diagnosis and progression of human sepsis. Intravenous inoculation of Macaca fascicularis with Escherichia coli produced mild to severe sepsis, lung injury, and death. Plasma samples were obtained before and after 1, 3, and 5 days of E. coli challenge and at the time of killing. At necropsy, blood, lung, kidney, and spleen samples were collected. An integrative analysis of the metabolomic and transcriptomic datasets was performed to identify a panel of sepsis biomarkers. The extent of E. coli invasion, respiratory distress, lethargy, and mortality was dependent on the bacterial dose. Metabolomic and transcriptomic changes characterized severe infections and death, and indicated impaired mitochondrial, peroxisomal, and liver functions. Analysis of the pulmonary transcriptome and plasma metabolome suggested impaired fatty acid catabolism regulated by peroxisome-proliferator activated receptor signaling. A representative four-metabolite model effectively diagnosed sepsis in primates (area under the curve, 0.966) and in two human sepsis cohorts (area under the curve, 0.78 and 0.82). A model of sepsis based on reciprocal metabolomic and transcriptomic data was developed in primates and validated in two human patient cohorts. It is anticipated that the identified parameters will facilitate early diagnosis and management of sepsis.
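The area under the curve (AUC) values quoted in the abstract above summarize how well a model's scores separate cases from controls: an AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counting as one half. The sketch below computes that quantity directly from pairwise comparisons. The scores and labels are made up for illustration and are not data from the study, and the function name `auc` is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for a case (e.g. sepsis), 0 for a control.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up model scores for eight subjects (four cases, four controls).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
demo_auc = auc(scores, labels)
print(demo_auc)  # 14 of 16 case/control pairs are ordered correctly: 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the primate (0.966) and human cohort (0.78 and 0.82) results are reported.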
Danchin, Etienne G.J.; Perfus-Barbeoch, Laetitia; Rancurel, Corinne; Thorpe, Peter; Da Rocha, Martine; Bajew, Simon; Neilson, Roy; Sokolova (Guzeeva), Elena; Da Silva, Corinne; Guy, Julie; Labadie, Karine; Esmenjaud, Daniel; Helder, Johannes; Jones, John T.
2017-01-01
Nematodes have evolved the ability to parasitize plants on at least four independent occasions, with plant parasites present in Clades 1, 2, 10 and 12 of the phylum. In the case of Clades 10 and 12, horizontal gene transfer of plant cell wall degrading enzymes from bacteria and fungi has been implicated in the evolution of plant parasitism. We have used ribonucleic acid sequencing (RNAseq) to generate reference transcriptomes for two economically important nematode species, Xiphinema index and Longidorus elongatus, representative of two genera within the early-branching Clade 2 of the phylum Nematoda. We used a transcriptome-wide analysis to identify putative horizontal gene transfer events. This represents the first in-depth transcriptome analysis from any plant-parasitic nematode of this clade. For each species, we assembled ~30 million Illumina reads into a reference transcriptome. We identified 62 and 104 transcripts, from X. index and L. elongatus, respectively, that were putatively acquired via horizontal gene transfer. By cross-referencing horizontal gene transfer prediction with a phylum-wide analysis of Pfam domains, we identified Clade 2-specific events. Of these, a GH12 cellulase from X. index was analysed phylogenetically and biochemically, revealing a likely bacterial origin and canonical enzymatic function. Horizontal gene transfer was previously shown to be a phenomenon that has contributed to the evolution of plant parasitism among nematodes. Our findings underline the importance and the extensiveness of this phenomenon in the evolution of plant-parasitic life styles in this speciose and widespread animal phylum. PMID:29065523
Danchin, Etienne G J; Perfus-Barbeoch, Laetitia; Rancurel, Corinne; Thorpe, Peter; Da Rocha, Martine; Bajew, Simon; Neilson, Roy; Guzeeva, Elena Sokolova; Da Silva, Corinne; Guy, Julie; Labadie, Karine; Esmenjaud, Daniel; Helder, Johannes; Jones, John T; den Akker, Sebastian Eves-van
2017-10-23
Nematodes have evolved the ability to parasitize plants on at least four independent occasions, with plant parasites present in Clades 1, 2, 10 and 12 of the phylum. In the case of Clades 10 and 12, horizontal gene transfer of plant cell wall degrading enzymes from bacteria and fungi has been implicated in the evolution of plant parasitism. We have used ribonucleic acid sequencing (RNAseq) to generate reference transcriptomes for two economically important nematode species, Xiphinema index and Longidorus elongatus, representative of two genera within the early-branching Clade 2 of the phylum Nematoda. We used a transcriptome-wide analysis to identify putative horizontal gene transfer events. This represents the first in-depth transcriptome analysis from any plant-parasitic nematode of this clade. For each species, we assembled ~30 million Illumina reads into a reference transcriptome. We identified 62 and 104 transcripts, from X. index and L. elongatus, respectively, that were putatively acquired via horizontal gene transfer. By cross-referencing horizontal gene transfer prediction with a phylum-wide analysis of Pfam domains, we identified Clade 2-specific events. Of these, a GH12 cellulase from X. index was analysed phylogenetically and biochemically, revealing a likely bacterial origin and canonical enzymatic function. Horizontal gene transfer was previously shown to be a phenomenon that has contributed to the evolution of plant parasitism among nematodes. Our findings underline the importance and the extensiveness of this phenomenon in the evolution of plant-parasitic life styles in this speciose and widespread animal phylum.
Transcriptome profile and unique genetic evolution of positively selected genes in yak lungs.
Lan, DaoLiang; Xiong, XianRong; Ji, WenHui; Li, Jian; Mipam, Tserang-Donko; Ai, Yi; Chai, ZhiXin
2018-04-01
The yak (Bos grunniens), which is a unique bovine breed that is distributed mainly in the Qinghai-Tibetan Plateau, is considered a good model for studying plateau adaptability in mammals. The lungs are important functional organs that enable animals to adapt to their external environment. However, the genetic mechanism underlying the adaptability of yak lungs to harsh plateau environments remains unknown. To explore the unique evolutionary process and genetic mechanism of yak adaptation to plateau environments, we performed transcriptome sequencing of yak and cattle (Bos taurus) lungs using RNA-Seq technology and a subsequent comparison analysis to identify the positively selected genes in the yak. After deep sequencing, a normal transcriptome profile of yak lung containing a total of 16,815 expressed genes was obtained, and the characteristics of the yak lung transcriptome were described by functional analysis. Furthermore, Ka/Ks comparison statistics showed that 39 strongly positively selected genes were identified in yak lungs. Further GO and KEGG analysis was conducted for the functional annotation of these genes. The results of this study provide valuable data for further explorations of the unique evolutionary process of high-altitude hypoxia adaptation in yaks in the Tibetan Plateau and the genetic mechanism at the molecular level.
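The Ka/Ks comparison mentioned above rests on a simple ratio: Ka is the nonsynonymous substitution rate, Ks the synonymous rate, and a ratio above 1 is conventionally read as a signal of positive selection. The sketch below only illustrates that reading. The gene names, rate values and the cutoff of 1.0 are assumptions for illustration; the abstract does not state which cutoff defined "strongly" positively selected genes.

```python
def classify_selection(ka, ks, neutral=1.0):
    """Label a gene from its Ka/Ks ratio under the conventional interpretation.

    Returns (ratio, label); the ratio is None when Ks is zero and Ka/Ks is undefined.
    """
    if ks <= 0:
        return None, "undefined"
    ratio = ka / ks
    if ratio > neutral:
        return ratio, "positive selection"
    if ratio < neutral:
        return ratio, "purifying selection"
    return ratio, "neutral"


# Hypothetical (Ka, Ks) pairs; gene names and values are made up for illustration.
example_genes = {"geneA": (0.030, 0.020), "geneB": (0.005, 0.050), "geneC": (0.010, 0.0)}
example_calls = {name: classify_selection(ka, ks) for name, (ka, ks) in example_genes.items()}
```

Estimating Ka and Ks themselves requires codon-aware alignment of orthologous sequences and a substitution model, which this sketch deliberately leaves out.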
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome
Kim, Gunjune
2017-01-01
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is “leaves of three, let it be”, which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species. PMID:29125533
Sequencing and De Novo Assembly of the Toxicodendron radicans (Poison Ivy) Transcriptome.
Weisberg, Alexandra J; Kim, Gunjune; Westwood, James H; Jelesko, John G
2017-11-10
Contact with poison ivy plants is widely dreaded because they produce a natural product called urushiol that is responsible for allergenic contact delayed-dermatitis symptoms lasting for weeks. For this reason, the catchphrase most associated with poison ivy is "leaves of three, let it be", which serves the purpose of both identification and an appeal for avoidance. Ironically, despite this notoriety, there is a dearth of specific knowledge about nearly all other aspects of poison ivy physiology and ecology. As a means of gaining a more molecular-oriented understanding of poison ivy physiology and ecology, Next Generation DNA sequencing technology was used to develop poison ivy root and leaf RNA-seq transcriptome resources. De novo assembled transcriptomes were analyzed to generate a core set of high quality expressed transcripts present in poison ivy tissue. The predicted protein sequences were evaluated for similarity to SwissProt homologs and InterProScan domains, as well as assigned both GO terms and KEGG annotations. Over 23,000 simple sequence repeats were identified in the transcriptome, and corresponding oligo nucleotide primer pairs were designed. A pan-transcriptome analysis of existing Anacardiaceae transcriptomes revealed conserved and unique transcripts among these species.
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing
Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li
2010-01-01
Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome. PMID:20392818
Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing.
Wang, Bin; Guo, Guangwu; Wang, Chao; Lin, Ying; Wang, Xiaoning; Zhao, Mouming; Guo, Yong; He, Minghui; Zhang, Yong; Pan, Li
2010-08-01
Aspergillus oryzae, an important filamentous fungus used in food fermentation and the enzyme industry, has been shown through genome sequencing and various other tools to have prominent features in its genomic composition. However, the functional complexity of the A. oryzae transcriptome has not yet been fully elucidated. Here, we applied direct high-throughput paired-end RNA-sequencing (RNA-Seq) to the transcriptome of A. oryzae under four different culture conditions. With the high resolution and sensitivity afforded by RNA-Seq, we were able to identify a substantial number of novel transcripts, new exons, untranslated regions, alternative upstream initiation codons and upstream open reading frames, which provide remarkable insight into the A. oryzae transcriptome. We were also able to assess the alternative mRNA isoforms in A. oryzae and found a large number of genes undergoing alternative splicing. Many genes and pathways that might be involved in higher levels of protein production in solid-state culture than in liquid culture were identified by comparing gene expression levels between different cultures. Our analysis indicated that the transcriptome of A. oryzae is much more complex than previously anticipated, and these results may provide a blueprint for further study of the A. oryzae transcriptome.
Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen
2014-01-01
Background The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. Methodology/Principal Findings In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. Conclusions/Significance This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH. PMID:24533099
Yu, Haixin; Ji, Rui; Ye, Wenfeng; Chen, Hongdan; Lai, Wenxiang; Fu, Qiang; Lou, Yonggen
2014-01-01
The brown planthopper (BPH), Nilaparvata lugens (Stål), one of the most serious rice insect pests in Asia, can quickly overcome rice resistance by evolving new virulent populations. The insect fat body plays essential roles in the life cycles of insects and in plant-insect interactions. However, whether differences in fat body transcriptomes exist between insect populations with different virulence levels and whether the transcriptomic differences are related to insect virulence remain largely unknown. In this study, we performed transcriptome-wide analyses on the fat bodies of two BPH populations with different virulence levels in rice. The populations were derived from rice variety TN1 (TN1 population) and Mudgo (M population). In total, 33,776 and 32,332 unigenes from the fat bodies of TN1 and M populations, respectively, were generated using Illumina technology. Gene ontology annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology classifications indicated that genes related to metabolism and immunity were significantly active in the fat bodies. In addition, a total of 339 unigenes showed homology to genes of yeast-like symbionts (YLSs) from 12 genera and endosymbiotic bacteria Wolbachia. A comparative analysis of the two transcriptomes generated 7,860 differentially expressed genes. GO annotations and enrichment analysis of KEGG pathways indicated these differentially expressed transcripts might be involved in metabolism and immunity. Finally, 105 differentially expressed genes from YLSs and Wolbachia were identified, genes which might be associated with the formation of different virulent populations. This study was the first to compare the fat-body transcriptomes of two BPH populations having different virulence traits and to find genes that may be related to this difference. 
Our findings provide a molecular resource for future investigations of fat bodies and will be useful in examining the interactions between the fat body and virulence variation in the BPH.
Hypertranscription in development, stem cells, and regeneration
Percharde, Michelle; Bulut-Karslioglu, Aydan; Ramalho-Santos, Miguel
2016-01-01
SUMMARY Cells can globally up-regulate their transcriptome during specific transitions, a phenomenon called hypertranscription. Evidence for hypertranscription dates back over 70 years, but it has gone largely ignored in the genomics era until recently. We discuss data supporting the notion that hypertranscription is a unifying theme in embryonic development, stem cell biology, regeneration and cell competition. We review the history, methods for analysis, underlying mechanisms and biological significance of hypertranscription. PMID:27989554
2010-01-01
Background Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. 
Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host-pathogen interactions and evolutionary history of immunogenetics from fish to mammals. PMID:20707909
Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming
2013-01-01
Background Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. Results In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. Conclusions This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas. PMID:24349370
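The up- and downregulated gene counts above come from comparing expression between cold-treated and control samples. One common way to express such a comparison is a log2 fold change with a direction call, sketched below. The pseudocount of 1 and the |log2FC| >= 1 cutoff are assumptions chosen for illustration; the abstract does not give the thresholds or the statistical test actually used, and a real DGE analysis would also control the false discovery rate.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of normalized counts; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify_expression(treated, control, min_abs_log2fc=1.0):
    """Call a gene 'up', 'down' or 'unchanged' from its log2 fold change alone."""
    lfc = log2_fold_change(treated, control)
    if lfc >= min_abs_log2fc:
        return "up"
    if lfc <= -min_abs_log2fc:
        return "down"
    return "unchanged"


# Hypothetical normalized tag counts as (cold-treated, control); not data from the study.
example_counts = {"unigene_1": (31, 7), "unigene_2": (7, 31), "unigene_3": (10, 10)}
example_direction = {g: classify_expression(t, c) for g, (t, c) in example_counts.items()}
```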
Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming
2013-01-01
Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas.
Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Sreedasyam, A.; Trivedi, G
Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data was used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems.
Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome
Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz
2014-01-01
Background The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. 
Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096
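The assembly figures reported above allow a quick sanity check on the size of the combined contig/singleton dataset. The short calculation below uses only the three numbers stated in the abstract; the variable names are illustrative, and the mean length is a derived figure that the authors do not report.

```python
# Figures taken from the abstract above.
contigs = 62_482
singletons = 111_796
total_nucleotides = 60_117_204

# Size of the combined contig/singleton dataset.
total_sequences = contigs + singletons

# Derived figure (not reported in the abstract): mean sequence length in nucleotides.
mean_length = total_nucleotides / total_sequences

# Share of the dataset made up of unassembled reads.
singleton_fraction = singletons / total_sequences
```

The combined dataset holds 174,278 sequences with a mean length of roughly 345 nucleotides, and close to two thirds of the sequences are singletons.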
Re-evaluating microglia expression profiles using RiboTag and cell isolation strategies.
Haimon, Zhana; Volaski, Alon; Orthgiess, Johannes; Boura-Halfon, Sigalit; Varol, Diana; Shemer, Anat; Yona, Simon; Zuckerman, Binyamin; David, Eyal; Chappell-Maor, Louise; Bechmann, Ingo; Gericke, Martin; Ulitsky, Igor; Jung, Steffen
2018-06-01
Transcriptome profiling is widely used to infer functional states of specific cell types, as well as their responses to stimuli, to define contributions to physiology and pathophysiology. Focusing on microglia, the brain's macrophages, we report here a side-by-side comparison of classical cell-sorting-based transcriptome sequencing and the 'RiboTag' method, which avoids cell retrieval from tissue context and yields translatome sequencing information. Conventional whole-cell microglial transcriptomes were found to be significantly tainted by artifacts introduced by tissue dissociation, cargo contamination and transcripts sequestered from ribosomes. Conversely, our data highlight the added value of RiboTag profiling for assessing the lineage accuracy of Cre recombinase expression in transgenic mice. Collectively, this study indicates method-based biases, reveals observer effects and establishes RiboTag-based translatome profiling as a valuable complement to standard sorting-based profiling strategies.
How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives.
Dal Molin, Alessandra; Di Camillo, Barbara
2018-01-31
The sequencing of the transcriptome of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types in heterogeneous cell populations or for the study of stochastic gene expression. In recent years, various experimental methods and computational tools for analysing single-cell RNA-sequencing data have been proposed. However, most of them are tailored to different experimental designs or biological questions, and in many cases, their performance has not been benchmarked yet, thus increasing the difficulty for a researcher to choose the optimal single-cell transcriptome sequencing (scRNA-seq) experiment and analysis workflow. In this review, we aim to provide an overview of the currently available experimental and computational methods developed to handle single-cell RNA-sequencing data and, based on their peculiarities, we suggest possible analysis frameworks depending on specific experimental designs. Together, we propose an evaluation of challenges and open questions and future perspectives in the field. In particular, we go through the different steps of scRNA-seq experimental protocols such as cell isolation, messenger RNA capture, reverse transcription, amplification and use of quantitative standards such as spike-ins and Unique Molecular Identifiers (UMIs). We then analyse the current methodological challenges related to preprocessing, alignment, quantification, normalization, batch effect correction and methods to control for confounding effects. © The Author(s) 2018. Published by Oxford University Press.
Analysis of the Macaca mulatta transcriptome and the sequence divergence between Macaca and human.
Magness, Charles L; Fellin, P Campion; Thomas, Matthew J; Korth, Marcus J; Agy, Michael B; Proll, Sean C; Fitzgibbon, Matthew; Scherer, Christina A; Miner, Douglas G; Katze, Michael G; Iadonato, Shawn P
2005-01-01
We report the initial sequencing and comparative analysis of the Macaca mulatta transcriptome. Cloned sequences from 11 tissues, nine animals, and three species (M. mulatta, M. fascicularis, and M. nemestrina) were sampled, resulting in the generation of 48,642 sequence reads. These data represent an initial sampling of the putative rhesus orthologs for 6,216 human genes. Mean nucleotide diversity within M. mulatta and sequence divergence among M. fascicularis, M. nemestrina, and M. mulatta are also reported.
Geib, Scott M; Calla, Bernarda; Hall, Brian; Hou, Shaobin; Manoukis, Nicholas C
2014-10-28
The oriental fruit fly, Bactrocera dorsalis, is an important pest of fruit and vegetable crops throughout Asia, and is considered a high risk pest for establishment in the mainland United States. It is a member of the family Tephritidae, which is the most agriculturally important family of flies, and can be considered an out-group to well-studied members of the family Drosophilidae. Despite their importance as pests and their relatedness to Drosophila, little information is available on B. dorsalis transcripts and proteins. The objective of this paper is to comprehensively characterize the transcripts present throughout the life history of B. dorsalis and functionally annotate and analyse these transcripts relative to the presence, expression, and function of orthologous sequences present in Drosophila melanogaster. We present a detailed transcriptome assembly of B. dorsalis from egg through adult stages containing 20,666 transcripts across 10,799 unigene components. Utilizing data available through Flybase and the modENCODE project, we compared expression patterns of these transcripts to putative orthologs in D. melanogaster in terms of timing, abundance, and function. In addition, temporal expression patterns in B. dorsalis were characterized between stages, to establish the constitutive or stage-specific expression patterns of particular transcripts. A fully annotated transcriptome assembly is made available through NCBI, in addition to corresponding expression data. By characterizing the transcriptome of B. dorsalis through its life history and comparing it to that of the model organism D. melanogaster, a database has been developed that can be used as the foundation for functional genomic research in Bactrocera flies and help identify orthologous genes between B. dorsalis and D. melanogaster. 
These data provide the foundation for future functional genomic research that will focus on improving our understanding of the physiology and biology of this species at the molecular level. This knowledge can also be applied towards developing improved methods for control, survey, and eradication of this important pest.
Puente-Marin, Sara; Nombela, Iván; Ciordia, Sergio; Mena, María Carmen; Chico, Verónica; Coll, Julio; Ortega-Villaizan, María Del Mar
2018-04-09
Nucleated red blood cells (RBCs) of fish have, in the last decade, been implicated in several immune-related functions, such as antiviral response, phagocytosis or cytokine-mediated signaling. RNA-sequencing (RNA-seq) and label-free shotgun proteomic analyses were carried out for in silico functional pathway profiling of rainbow trout RBCs. For RNA-seq, a de novo assembly was conducted, in order to create a transcriptome database for RBCs. For proteome profiling, we developed a proteomic method that combined: (a) fractionation into cytosolic and membrane fractions, (b) hemoglobin removal of the cytosolic fraction, (c) protein digestion, and (d) a novel step with pH reversed-phase peptide fractionation and final Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometric (LC ESI-MS/MS) analysis of each fraction. Combined transcriptome- and proteome- sequencing data identified, in silico, novel and striking immune functional networks for rainbow trout nucleated RBCs, which are mainly linked to innate and adaptive immunity. Functional pathways related to regulation of hematopoietic cell differentiation, antigen presentation via major histocompatibility complex class II (MHCII), leukocyte differentiation and regulation of leukocyte activation were identified. These preliminary findings further implicate nucleated RBCs in immune function, such as antigen presentation and leukocyte activation.
Genome-wide proteomics analysis on longissimus muscles in Qinchuan beef cattle.
He, Hua; Chen, Si; Liang, Wei; Liu, Xiaolin
2017-04-01
To gain further insight into the molecular mechanism of bovine muscle development, we combined mass spectrometry characterization of proteins with Illumina deep sequencing of RNAs obtained from bovine longissimus muscle (LD) at prenatal and postnatal stages. For the proteomic study, each group of LD proteins was extracted and labeled using isobaric tags for relative and absolute quantitation (iTRAQ) method. Among the 1321 proteins identified from six samples, 390 proteins were differentially expressed in embryos at day 135 post-fertilization (Emb135d) vs. 30-month-old adult cattle (Emb135d vs. 30M) samples. Gene Ontology, Cluster of Orthologous Groups and Kyoto Encyclopedia of Genes and Genomes analyses were further conducted to better understand the different functions. Furthermore, we analyzed the relationship between transcript and protein regulation between samples by direct comparison of expression levels from transcriptomic and iTRAQ-based proteomics. Association results indicated that 1295 of 1321 proteins could be mapped to transcriptome sequencing data. This study provides the most comprehensive, targeted survey of bovine LD proteins to date and has shown the power of combining transcriptomic and proteomic approaches to provide molecular insights for understanding the developmental characteristics in bovine muscle, and even in other mammals. © 2016 Stichting International Foundation for Animal Genetics.
Puente-Marin, Sara; Ciordia, Sergio; Mena, María Carmen; Chico, Verónica; Coll, Julio
2018-01-01
Nucleated red blood cells (RBCs) of fish have, in the last decade, been implicated in several immune-related functions, such as antiviral response, phagocytosis or cytokine-mediated signaling. RNA-sequencing (RNA-seq) and label-free shotgun proteomic analyses were carried out for in silico functional pathway profiling of rainbow trout RBCs. For RNA-seq, a de novo assembly was conducted, in order to create a transcriptome database for RBCs. For proteome profiling, we developed a proteomic method that combined: (a) fractionation into cytosolic and membrane fractions, (b) hemoglobin removal of the cytosolic fraction, (c) protein digestion, and (d) a novel step with pH reversed-phase peptide fractionation and final Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometric (LC ESI-MS/MS) analysis of each fraction. Combined transcriptome- and proteome- sequencing data identified, in silico, novel and striking immune functional networks for rainbow trout nucleated RBCs, which are mainly linked to innate and adaptive immunity. Functional pathways related to regulation of hematopoietic cell differentiation, antigen presentation via major histocompatibility complex class II (MHCII), leukocyte differentiation and regulation of leukocyte activation were identified. These preliminary findings further implicate nucleated RBCs in immune function, such as antigen presentation and leukocyte activation. PMID:29642539
Time Series Expression Analyses Using RNA-seq: A Statistical Approach
Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.
2013-01-01
RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
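As a rough illustration of the autoregressive idea mentioned in the abstract above, the following minimal sketch fits a first-order lag model, x[t] = intercept + phi * x[t-1] + noise, to a single gene's expression series by ordinary least squares. The function name `fit_ar1` and the plain least-squares fit are assumptions made for illustration; this is not the authors' AR(1) time-lagged regression implementation.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + phi * x[t-1] + noise.

    `series` is a list of expression values (for example, log-scaled
    counts) for one gene at consecutive, equally spaced time points.
    Hypothetical helper for illustration only.
    """
    if len(series) < 3:
        raise ValueError("need at least three time points")
    prev = series[:-1]   # lagged values x[t-1]
    curr = series[1:]    # current values x[t]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("lagged values are constant; phi is not identifiable")
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


if __name__ == "__main__":
    intercept, phi = fit_ar1([1.0, 2.0, 4.0, 8.0, 16.0])
    print(intercept, phi)
```

A value of phi near zero suggests little dependence on the previous time point, while a large absolute value suggests strong temporal dependence; a real analysis would also need a noise model suited to count data.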
Time series expression analyses using RNA-seq: a statistical approach.
Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P
2013-01-01
RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.
PASTA: splice junction identification from RNA-Sequencing data
2013-01-01
Background Next generation transcriptome sequencing (RNA-Seq) is emerging as a powerful experimental tool for the study of alternative splicing and its regulation, but requires ad-hoc analysis methods and tools. PASTA (Patterned Alignments for Splicing and Transcriptome Analysis) is a splice junction detection algorithm specifically designed for RNA-Seq data, relying on a highly accurate alignment strategy and on a combination of heuristic and statistical methods to identify exon-intron junctions with high accuracy. Results Comparisons against TopHat and other splice junction prediction software on real and simulated datasets show that PASTA exhibits high specificity and sensitivity, especially at lower coverage levels. Moreover, PASTA is highly configurable and flexible, and can therefore be applied in a wide range of analysis scenarios: it is able to handle both single-end and paired-end reads, it does not rely on the presence of canonical splicing signals, and it uses organism-specific regression models to accurately identify junctions. Conclusions PASTA is a highly efficient and sensitive tool to identify splicing junctions from RNA-Seq data. Compared to similar programs, it has the ability to identify a higher number of real splicing junctions, and provides highly annotated output files containing detailed information about their location and characteristics. Accurate junction data in turn facilitates the reconstruction of the splicing isoforms and the analysis of their expression levels, which will be performed by the remaining modules of the PASTA pipeline, still under development. Use of PASTA can therefore enable the large-scale investigation of transcription and alternative splicing. PMID:23557086
Firmino, Alexandre Augusto Pereira; Fonseca, Fernando Campos de Assis; de Macedo, Leonardo Lima Pepino; Coelho, Roberta Ramos; Antonino de Souza, José Dijair; Togawa, Roberto Coiti; Silva-Junior, Orzenil Bonfim; Pappas, Georgios Joannis; da Silva, Maria Cristina Mattar; Engler, Gilbert; Grossi-de-Sa, Maria Fatima
2013-01-01
Cotton plants are subjected to the attack of several insect pests. In Brazil, the cotton boll weevil, Anthonomus grandis, is the most important cotton pest. The use of insecticidal proteins and gene silencing by interference RNA (RNAi) as techniques for insect control are promising strategies, which have been applied in the last few years. For this insect, little molecular information is available in databases. Using 454-pyrosequencing methodology, the transcriptome of all developmental stages of the insect pest, A. grandis, was analyzed. The A. grandis transcriptome analysis resulted in more than 500,000 reads and a data set of 20,841 high-quality contigs. After sequence assembly and annotation, around 10,600 contigs had at least one BLAST hit against NCBI non-redundant protein database and 65.7% were similar to Tribolium castaneum sequences. A comparison of A. grandis, Drosophila melanogaster and Bombyx mori protein families' data showed higher similarity to dipteran than to lepidopteran sequences. Several contigs of genes encoding proteins involved in RNAi mechanism were found. PAZ domain sequences extracted from the transcriptome showed high similarity and conservation for the most important functional and structural motifs when compared to PAZ domains from 5 species. Two SID-like contigs were phylogenetically analyzed and grouped with T. castaneum SID-like proteins. No RdRP gene was found. A contig matching chitin synthase 1 was mined from the transcriptome. dsRNA microinjection of a chitin synthase gene to A. grandis female adults resulted in normal oviposition of unviable eggs and malformed live larvae that were unable to develop in artificial diet. This is the first study that characterizes the transcriptome of the coleopteran, A. grandis. A new and representative transcriptome database for this insect pest is now available. All data support the state of the art of RNAi mechanism in insects.
Coelho, Roberta Ramos; Antonino de Souza Jr, José Dijair; Togawa, Roberto Coiti; Silva-Junior, Orzenil Bonfim; Pappas-Jr, Georgios Joannis; da Silva, Maria Cristina Mattar; Engler, Gilbert; Grossi-de-Sa, Maria Fatima
2013-01-01
Cotton plants are subjected to the attack of several insect pests. In Brazil, the cotton boll weevil, Anthonomus grandis, is the most important cotton pest. The use of insecticidal proteins and gene silencing by interference RNA (RNAi) as techniques for insect control are promising strategies, which have been applied in the last few years. For this insect, little molecular information is available in databases. Using 454-pyrosequencing methodology, the transcriptome of all developmental stages of the insect pest, A. grandis, was analyzed. The A. grandis transcriptome analysis resulted in more than 500,000 reads and a data set of 20,841 high-quality contigs. After sequence assembly and annotation, around 10,600 contigs had at least one BLAST hit against NCBI non-redundant protein database and 65.7% were similar to Tribolium castaneum sequences. A comparison of A. grandis, Drosophila melanogaster and Bombyx mori protein families’ data showed higher similarity to dipteran than to lepidopteran sequences. Several contigs of genes encoding proteins involved in RNAi mechanism were found. PAZ domain sequences extracted from the transcriptome showed high similarity and conservation for the most important functional and structural motifs when compared to PAZ domains from 5 species. Two SID-like contigs were phylogenetically analyzed and grouped with T. castaneum SID-like proteins. No RdRP gene was found. A contig matching chitin synthase 1 was mined from the transcriptome. dsRNA microinjection of a chitin synthase gene to A. grandis female adults resulted in normal oviposition of unviable eggs and malformed live larvae that were unable to develop in artificial diet. This is the first study that characterizes the transcriptome of the coleopteran, A. grandis. A new and representative transcriptome database for this insect pest is now available. All data support the state of the art of RNAi mechanism in insects. PMID:24386449
Moisá, Sonia J.; Shike, Daniel W.; Shoup, Lindsay; Rodriguez-Zas, Sandra L.; Loor, Juan J.
2015-01-01
In model organisms both the nutrition of the mother and the young offspring could induce long-lasting transcriptional changes in tissues. In livestock, such changes could have important roles in determining nutrient use and meat quality. The main objective was to evaluate if plane of maternal nutrition during late-gestation and weaning age alter the offspring’s Longissimus muscle (LM) transcriptome, animal performance, and metabolic hormones. Whole-transcriptome microarray analysis was performed on LM samples of early (EW) and normal weaned (NW) Angus × Simmental calves born to grazing cows receiving no supplement [low plane of nutrition (LPN)] or 2.3 kg high-grain mix/day [medium plane of nutrition (MPN)] during the last 105 days of gestation. Biopsies of LM were harvested at 78 (EW), 187 (NW) and 354 (before slaughter) days of age. Despite greater feed intake in MPN offspring, blood insulin was greater in LPN offspring. Carcass intramuscular fat content was greater in EW offspring. Bioinformatics analysis of the transcriptome highlighted a modest overall response to maternal plane of nutrition, resulting in only 35 differentially expressed genes (DEG). However, weaning age and a high-grain diet (EW) strongly impacted the transcriptome (DEG = 167), especially causing a lipogenic program activation. In addition, between 78 and 187 days of age, EW steers had an activation of the innate immune system due presumably to macrophage infiltration of intramuscular fat. Between 187 and 354 days of age (the “finishing” phase), NW steers had an activation of the lipogenic transcriptome machinery, while EW steers had a clear inhibition through the epigenetic control of histone acetylases. Results underscored the need to conduct further studies to understand better the functional outcome of transcriptome changes induced in the offspring by pre- and post-natal nutrition. Additional knowledge on molecular and functional outcomes would help produce more efficient beef cattle. 
PMID:26153887
Babineau, Marielle; Mahmood, Khalid; Mathiassen, Solvejg K; Kudsk, Per; Kristensen, Michael
2017-02-06
Loose silky bentgrass (Apera spica-venti) is an important weed in Europe with a recent increase in herbicide resistance cases. The lack of genetic information about this noxious weed limits its biological understanding such as growth, reproduction, genetic variation, molecular ecology and metabolic herbicide resistance. This study produced a reference transcriptome for A. spica-venti from different tissues (leaf, root, stem) and various growth stages (seed at phenological stages 05, 07, 08, 09). The de novo assembly was performed on individual and combined dataset followed by functional annotations. Individual transcripts and gene families involved in metabolic based herbicide resistance were identified. Eight separate transcriptome assemblies were performed and compared. The combined transcriptome assembly consists of 83,349 contigs with an N50 and average contig length of 762 and 658 bp, respectively. This dataset contains 74,724 transcripts consisting of total 54,846,111 bp. Among them 94% had a homologue to UniProtKB, 73% retrieved a GO mapping, and 50% were functionally annotated. Compared with other grass species, A. spica-venti has 26% proteins in common to Brachypodium distachyon, and 41% to Lolium spp. Glycosyltransferases had the highest number of transcripts in each tissue followed by the cytochrome P450s. The GSTF1 and CYP89A2 transcripts were recovered from the majority of tissues and aligned at a maximum of 66 and 30% to proven herbicide resistant allele from Alopecurus myosuroides and Lolium rigidum, respectively. De novo transcriptome assembly enabled the generation of the first reference transcriptome of A. spica-venti. This can serve as stepping stone for understanding the metabolic herbicide resistance as well as the general biology of this problematic weed. Furthermore, this large-scale sequence data is a valuable scientific resource for comparative transcriptome analysis for Poaceae grasses.
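The assembly statistics quoted in the abstract above (N50 and average contig length) can be computed from a list of contig lengths. The sketch below uses the common definition of N50: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled bases. The function names and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to at least
        # half of all assembled bases defines the N50.
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the average contig length (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical lengths
    print(n50(toy_contigs), mean_length(toy_contigs))
```

Because N50 weights long contigs more heavily than the mean does, it is usually larger than the average contig length, as in the reported values of 762 and 658 bp.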
Moisá, Sonia J; Shike, Daniel W; Shoup, Lindsay; Rodriguez-Zas, Sandra L; Loor, Juan J
2015-01-01
In model organisms both the nutrition of the mother and the young offspring could induce long-lasting transcriptional changes in tissues. In livestock, such changes could have important roles in determining nutrient use and meat quality. The main objective was to evaluate if plane of maternal nutrition during late-gestation and weaning age alter the offspring's Longissimus muscle (LM) transcriptome, animal performance, and metabolic hormones. Whole-transcriptome microarray analysis was performed on LM samples of early (EW) and normal weaned (NW) Angus × Simmental calves born to grazing cows receiving no supplement [low plane of nutrition (LPN)] or 2.3 kg high-grain mix/day [medium plane of nutrition (MPN)] during the last 105 days of gestation. Biopsies of LM were harvested at 78 (EW), 187 (NW) and 354 (before slaughter) days of age. Despite greater feed intake in MPN offspring, blood insulin was greater in LPN offspring. Carcass intramuscular fat content was greater in EW offspring. Bioinformatics analysis of the transcriptome highlighted a modest overall response to maternal plane of nutrition, resulting in only 35 differentially expressed genes (DEG). However, weaning age and a high-grain diet (EW) strongly impacted the transcriptome (DEG = 167), especially causing a lipogenic program activation. In addition, between 78 and 187 days of age, EW steers had an activation of the innate immune system due presumably to macrophage infiltration of intramuscular fat. Between 187 and 354 days of age (the "finishing" phase), NW steers had an activation of the lipogenic transcriptome machinery, while EW steers had a clear inhibition through the epigenetic control of histone acetylases. Results underscored the need to conduct further studies to understand better the functional outcome of transcriptome changes induced in the offspring by pre- and post-natal nutrition. Additional knowledge on molecular and functional outcomes would help produce more efficient beef cattle.
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2011-01-01
Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
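To make the co-expression idea in the abstract above concrete, the following minimal sketch links two genes whenever the correlation of their expression profiles across samples reaches a threshold. It uses the plain Pearson coefficient and an arbitrary threshold of 0.9, both of which are assumptions for illustration; the study itself used a gene-to-gene weighted correlation coefficient followed by clustering into modules, which is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with r >= threshold.

    `profiles` maps a gene name to its expression values across the
    same ordered set of samples. Names here are hypothetical.
    """
    edges = []
    for gene1, gene2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene1], profiles[gene2])
        if r >= threshold:
            edges.append((gene1, gene2, r))
    return edges


if __name__ == "__main__":
    toy = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [4.0, 3.0, 2.0, 1.0],
    }
    print(coexpression_edges(toy))
```

The all-pairs loop is quadratic in the number of genes, so a genome-scale network over more than a thousand arrays would normally rely on a vectorized correlation matrix instead.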
Xu, Ning; Zhao, Hong-Yan; Yin, Yin; Shen, Shan-Shan; Shan, Lin-Lin; Chen, Chuan-Xi; Zhang, Yan-Xia; Gao, Jian-Fang; Ji, Xiang
2017-04-21
We conducted an omics-analysis of the venom of Naja kaouthia from China. Proteomics analysis revealed six protein families [three-finger toxins (3-FTx), phospholipase A2 (PLA2), nerve growth factor, snake venom metalloproteinase (SVMP), cysteine-rich secretory protein and ohanin], and venom-gland transcriptomics analysis revealed 28 protein families from 79 unigenes. 3-FTx (56.5% in proteome/82.0% in transcriptome) and PLA2 (26.9%/13.6%) were identified as the most abundant families in venom proteome and venom-gland transcriptome. Furthermore, N. kaouthia venom expressed strong lethality (i.p. LD50: 0.79 μg/g) and myotoxicity (CK: 5939 U/l) in mice, and showed notable activity in PLA2 but weak activity in SVMP, l-amino acid oxidase or 5' nucleotidase. Antivenomic assessment revealed that several venom components (nearly 17.5% of total venom) from N. kaouthia could not be thoroughly immunocaptured by commercial Naja atra antivenom. ELISA analysis revealed that there was no difference in the cross-reaction between N. kaouthia and N. atra venoms against the N. atra antivenom. The use of commercial N. atra antivenom in treatment of snakebites caused by N. kaouthia is reasonable, but design of novel antivenom with the attention on enhancing the immune response of non-immunocaptured components should be encouraged. The venomics, antivenomics and venom-gland transcriptome of the monocled cobra (Naja kaouthia) from China have been elucidated. Quantitative and qualitative differences are evident when venom proteomic and venom-gland transcriptomic profiles are compared. Two protein families (3-FTx and PLA2) are found to be the predominant components in N. kaouthia venom, and considered as the major players in functional role of venom. Other protein families with relatively low abundance appear to be minor in the functional significance. Antivenomics and ELISA evaluation reveal that the N. kaouthia venom can be effectively immunorecognized by commercial N. atra antivenom, but still a small number of venom components could not be thoroughly immunocaptured. The findings indicate that exploring the precise composition of snake venom should be executed by an integrated omics-approach, and elucidating the venom composition is helpful in understanding composition-function relationships and will facilitate the clinical application of antivenoms. Copyright © 2017 Elsevier B.V. All rights reserved.
How to normalize metatranscriptomic count data for differential expression analysis.
Klingenberg, Heiner; Meinicke, Peter
2017-01-01
Differential expression analysis on the basis of RNA-Seq count data has become a standard tool in transcriptomics. Several studies have shown that prior normalization of the data is crucial for a reliable detection of transcriptional differences. Until now it has not been clear whether and how the transcriptomic approach can be used for differential expression analysis in metatranscriptomics. We propose a model for differential expression in metatranscriptomics that explicitly accounts for variations in the taxonomic composition of transcripts across different samples. As a main consequence the correct normalization of metatranscriptomic count data under this model requires the taxonomic separation of the data into organism-specific bins. Then the taxon-specific scaling of organism profiles yields a valid normalization and allows us to recombine the scaled profiles into a metatranscriptomic count matrix. This matrix can then be analyzed with statistical tools for transcriptomic count data. For taxon-specific scaling and recombination of scaled counts we provide a simple R script. When applying transcriptomic tools for differential expression analysis directly to metatranscriptomic data with an organism-independent (global) scaling of counts the resulting differences may be difficult to interpret. The differences may correspond to changing functional profiles of the contributing organisms but may also result from a variation of taxonomic abundances. Taxon-specific scaling eliminates this variation and therefore the resulting differences actually reflect a different behavior of organisms under changing conditions. In simulation studies we show that the divergence between results from global and taxon-specific scaling can be drastic. In particular, the variation of organism abundances can imply a considerable increase of significant differences with global scaling. 
Also, on real metatranscriptomic data, the predictions from taxon-specific and global scaling can differ widely. Our studies indicate that in real data applications performed with global scaling it might be impossible to distinguish between differential expression in terms of transcriptomic changes and differential composition in terms of changing taxonomic proportions. As in transcriptomics, a proper normalization of count data is also essential for differential expression analysis in metatranscriptomics. Our model implies a taxon-specific scaling of counts for normalization of the data. The application of taxon-specific scaling consequently removes taxonomic composition variations from functional profiles and therefore provides a clear interpretation of the observed functional differences.
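The normalization workflow described in the abstract above (separate counts into organism-specific bins, scale each organism's profile, then recombine) can be sketched as follows. The authors provide an R script for this step; the Python below is not that script. It assumes a nested dictionary layout and uses simple total-count scaling of each taxon to its mean library size across samples, which is an illustrative choice of scaling method, not necessarily the one used in the paper.

```python
def taxon_specific_scaling(counts):
    """Scale counts per taxon, then recombine into one matrix.

    `counts` has the (assumed) layout {taxon: {sample: {feature: count}}}.
    Returns {sample: {feature: scaled_count}} summed over taxa.
    """
    combined = {}
    for taxon, samples in counts.items():
        # Total reads assigned to this taxon in each sample.
        totals = {s: sum(f.values()) for s, f in samples.items()}
        nonzero = [t for t in totals.values() if t > 0]
        if not nonzero:
            continue  # taxon absent everywhere; nothing to scale
        target = sum(nonzero) / len(nonzero)
        for sample, features in samples.items():
            out = combined.setdefault(sample, {})
            if totals[sample] == 0:
                continue  # taxon absent in this sample
            factor = target / totals[sample]
            for feature, count in features.items():
                out[feature] = out.get(feature, 0.0) + count * factor
    return combined


if __name__ == "__main__":
    toy = {
        # Taxon A doubles in abundance but keeps the same profile.
        "taxonA": {"s1": {"f1": 10, "f2": 10}, "s2": {"f1": 20, "f2": 20}},
        # Taxon B keeps its abundance but changes its profile.
        "taxonB": {"s1": {"f1": 5, "f2": 15}, "s2": {"f1": 15, "f2": 5}},
    }
    print(taxon_specific_scaling(toy))
```

In the toy data, the recombined matrix differs between samples only because of taxon B's changed profile; taxon A's abundance shift is removed by its own scaling factor, which is the effect the abstract attributes to taxon-specific scaling.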
USDA-ARS's Scientific Manuscript database
The yeast, Metschnikowia fructicola, is an antagonist with biological control activity against postharvest diseases of several fruits. We performed a transcriptome analysis, using RNA-Seq technology, to examine the response of M. fructicola with citrus fruit and with the postharvest pathogen, Penic...
J. D. Tang; L. A. Parker; A. D. Perkins; T. S. Sonstegard; S. G. Schroeder; D. D. Nicholas; S. V. Diehl
2013-01-01
High-throughput transcriptomics was used to identify Fibroporia radiculosa genes that were differentially regulated during colonization of wood treated with a copper-based preservative. The transcriptome was profiled at two time points while the fungus was growing on wood treated with micronized copper quat (MCQ). A total of 917 transcripts were...
The Long Noncoding RNA Transcriptome of Dictyostelium discoideum Development.
Rosengarten, Rafael D; Santhanam, Balaji; Kokosar, Janez; Shaulsky, Gad
2017-02-09
Dictyostelium discoideum live in the soil as single cells, engulfing bacteria and growing vegetatively. Upon starvation, tens of thousands of amoebae enter a developmental program that includes aggregation, multicellular differentiation, and sporulation. Major shifts across the protein-coding transcriptome accompany these developmental changes. However, no study has presented a global survey of long noncoding RNAs (ncRNAs) in D. discoideum. To characterize the antisense and long intergenic noncoding RNA (lncRNA) transcriptome, we analyzed previously published developmental time course samples using an RNA-sequencing (RNA-seq) library preparation method that selectively depletes ribosomal RNAs (rRNAs). We detected the accumulation of transcripts for 9833 protein-coding messenger RNAs (mRNAs), 621 lncRNAs, and 162 putative antisense RNAs (asRNAs). The noncoding RNAs were interspersed throughout the genome, and were distinct in expression level, length, and nucleotide composition. The noncoding transcriptome displayed a temporal profile similar to the coding transcriptome, with stages of gradual change interspersed with larger leaps. The transcription profiles of some noncoding RNAs were strongly correlated with known differentially expressed coding RNAs, hinting at a functional role for these molecules during development. Examining the mitochondrial transcriptome, we modeled two novel antisense transcripts. We applied yet another ribosomal depletion method to a subset of the samples to better retain transfer RNA (tRNA) transcripts. We observed polymorphisms in tRNA anticodons that suggested a post-transcriptional means by which D. discoideum compensates for codons missing in the genomic complement of tRNAs. We concluded that the prevalence and characteristics of long ncRNAs indicate that these molecules are relevant to the progression of molecular and cellular phenotypes during development. Copyright © 2017 Rosengarten et al.
Toh, Su San; Treves, David S; Barati, Michelle T; Perlin, Michael H
2016-10-01
Microbotryum lychnidis-dioicae is a member of a species complex infecting host plants in the Caryophyllaceae. It is used as a model system in many areas of research, but attempts to make this organism tractable for reverse genetic approaches have not been fruitful. Here, we exploited the recently obtained genome sequence and transcriptome analysis to inform our design of constructs for use in Agrobacterium-mediated transformation techniques currently available for other fungi. Reproducible transformation was demonstrated at the genomic, transcriptional and functional levels. Moreover, these initial proof-of-principle experiments provide evidence that supports the findings from initial global transcriptome analysis regarding expression from the respective promoters under different growth conditions of the fungus. The technique thus provides for the first time the ability to stably introduce transgenes and over-express target M. lychnidis-dioicae genes.
Lima, Leandro; Sinaimeri, Blerina; Sacomoto, Gustavo; Lopez-Maestre, Helene; Marchet, Camille; Miele, Vincent; Sagot, Marie-France; Lacroix, Vincent
2017-01-01
The main challenge in de novo genome assembly of DNA-seq data is certainly to deal with repeats that are longer than the reads. In de novo transcriptome assembly of RNA-seq reads, on the other hand, this problem has been underestimated so far. Even though we have fewer and shorter repeated sequences in transcriptomics, they do create ambiguities and confuse assemblers if not addressed properly. Most transcriptome assemblers of short reads are based on de Bruijn graphs (DBG) and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them. The results of this work are threefold. First, we introduce a formal model for representing high copy-number and low-divergence repeats in RNA-seq data and exploit its properties to infer a combinatorial characteristic of repeat-associated subgraphs. We show that the problem of identifying such subgraphs in a DBG is NP-complete. Second, we show that in the specific case of local assembly of alternative splicing (AS) events, we can implicitly avoid such subgraphs, and we present an efficient algorithm to enumerate AS events that are not included in repeats. Using simulated data, we show that this strategy is significantly more sensitive and precise than the previous version of KisSplice (Sacomoto et al. in WABI, pp 99-111, 1), Trinity (Grabherr et al. in Nat Biotechnol 29(7):644-652, 2), and Oases (Schulz et al. in Bioinformatics 28(8):1086-1092, 3), for the specific task of calling AS events. Third, we turn our focus to full-length transcriptome assembly, and we show that exploring the topology of DBGs can improve de novo transcriptome evaluation methods. Based on the observation that repeats create complicated regions in a DBG, and when assemblers try to traverse these regions, they can infer erroneous transcripts, we propose a measure to flag transcripts traversing such troublesome regions, thereby giving a confidence level for each transcript. 
The originality of our work when compared to other transcriptome evaluation methods is that we use only the topology of the DBG, and not read nor coverage information. We show that our simple method gives better results than Rsem-Eval (Li et al. in Genome Biol 15(12):553, 4) and TransRate (Smith-Unna et al. in Genome Res 26(8):1134-1144, 5) on both real and simulated datasets for detecting chimeras, and therefore is able to capture assembly errors missed by these methods.
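To illustrate why shared sequence complicates a de Bruijn graph, as the abstract above argues, the following minimal sketch builds a graph whose nodes are (k-1)-mers and whose edges are k-mers, then lists the nodes where paths merge or split. It is a toy illustration only: the function names and sequences are hypothetical, and it implements neither the authors' formal repeat model nor their transcript-flagging measure.

```python
from collections import defaultdict


def build_dbg(reads, k):
    """Build a de Bruijn graph: node -> set of successor nodes.

    Nodes are (k-1)-mers; each k-mer in a read adds one edge from
    its prefix to its suffix.
    """
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            successors[kmer[:-1]].add(kmer[1:])
    return dict(successors)


def branching_nodes(graph):
    """Return nodes with more than one predecessor or successor."""
    indegree = defaultdict(int)
    for node, nexts in graph.items():
        for nxt in nexts:
            indegree[nxt] += 1
    nodes = set(graph) | set(indegree)
    return {
        n for n in nodes
        if len(graph.get(n, ())) > 1 or indegree[n] > 1
    }


if __name__ == "__main__":
    # Two hypothetical transcripts that share the segment "GCGTA".
    reads = ["AAGCGTACC", "TTGCGTAGG"]
    graph = build_dbg(reads, k=4)
    print(sorted(branching_nodes(graph)))
```

The shared segment is entered at one branching node and left at another, so a traversal that enters from one transcript can exit into the other, which is how chimeric transcripts can arise during assembly.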
Sweeney, Torres; Lejeune, Alex; Moloney, Aidan P; Monahan, Frank J; Gettigan, Paul Mc; Downey, Gerard; Park, Stephen D E; Ryan, Marion T
2016-09-21
Differences between cattle production systems can influence the nutritional and sensory characteristics of beef, in particular its fatty acid (FA) composition. As beef products derived from pasture-based systems can command a higher premium from consumers, there is a need to understand the biological characteristics of pasture-produced meat and subsequently to develop methods of authentication for these products. Here, we describe an approach to authentication that focuses on differences in the transcriptomic profile of muscle from animals finished in different systems of production of practical relevance to the Irish beef industry. The objectives of this study were to identify a panel of differentially expressed (DE) genes/networks in the muscle of cattle raised outdoors on pasture compared to animals raised indoors on a concentrate-based diet and to subsequently identify an optimum panel which can classify the meat based on production system. A comparison of the muscle transcriptome of outdoor/pasture-fed and indoor/concentrate-fed cattle resulted in the identification of 26 DE genes. Functional analysis of these genes identified two significant networks (1: Energy Production, Lipid Metabolism, Small Molecule Biochemistry; and 2: Lipid Metabolism, Molecular Transport, Small Molecule Biochemistry), both of which are involved in FA metabolism. The expression of selected up-regulated genes in the outdoor/pasture-fed animals correlated positively with the total n-3 FA content of the muscle. The pathway and network analysis of the DE genes indicate that peroxisome proliferator-activated receptor (PPAR) and FYN/AMPK could be implicated in the regulation of these alterations to the lipid profile.
In terms of authentication, the expression profile of three DE genes (ALAD, EIF4EBP1 and NPNT) could almost completely separate the samples based on production system (95 % authentication for animals on pasture-based and 100 % for animals on concentrate-based diet) in this context. The majority of DE genes between muscle of the outdoor/pasture-fed and concentrate-fed cattle were related to lipid metabolism and in particular β-oxidation. In this experiment the combined expression profiles of ALAD, EIF4EBP1 and NPNT were optimal in classifying the muscle transcriptome based on production system. Given the overall lack of comparable studies and variable concordance with those that do exist, the use of transcriptomic data in authenticating production systems requires more exploration across a range of contexts and breeds.
Li, Weiguo; Zhang, Lihui; Ding, Zhan; Wang, Guodong; Zhang, Yandi; Gong, Hongmei; Chang, Tianjun; Zhang, Yanwen
2017-02-28
Taihangia rupestris, an andromonoecious plant species, bears both male and hermaphroditic flowers within the same individual. However, the establishment and development of male and hermaphroditic flowers in andromonoecious Taihangia remain poorly understood, due to the limited genetic and sequence information. To investigate the potential molecular mechanism in the regulation of Taihangia flower formation, we used de novo RNA sequencing to compare the transcriptome profiles of male and hermaphroditic flowers at early and late developmental stages. Four cDNA libraries, including male floral bud, hermaphroditic floral bud, male flower, and hermaphroditic flower, were constructed and sequenced using the Illumina RNA-Seq method. In total, 84,596,426 qualified Illumina reads were obtained and then assembled into 59,064 unigenes, of which 24,753 unigenes were annotated in the NCBI non-redundant protein database. In addition, 12,214, 7,153, and 8,115 unigenes were assigned into 53 Gene Ontology (GO) functional groups, 25 Clusters of Orthologous Group (COG) categories, and 126 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. By pairwise comparison of unigene abundance between the samples, we identified 1,668 differentially expressed genes (DEGs), including 176 transcription factors (TFs) between the male and hermaphroditic flowers. At the early developmental stage, we found 263 up-regulated genes and 436 down-regulated genes expressed in hermaphroditic floral buds, while 844 up-regulated genes and 314 down-regulated genes were detected in hermaphroditic flowers at the late developmental stage. GO and KEGG enrichment analyses showed that a large number of DEGs were associated with a wide range of functions, including cell cycle, epigenetic processes, flower development, and the unsaturated fatty acid biosynthesis pathway. Finally, real-time quantitative PCR was conducted to validate the DEGs identified in the present study.
In this study, transcriptome data of this rare andromonoecious Taihangia were reported for the first time. Comparative transcriptome analysis revealed the significant differences in gene expression profiles between male and hermaphroditic flowers at early and late developmental stages. The transcriptome data of Taihangia would be helpful to improve the understanding of the underlying molecular mechanisms in regulation of flower formation and unisexual flower establishment in andromonoecious plants.
Macías-Segura, N.; Bastian, Y.; Santiago-Algarra, D.; Castillo-Ortiz, J. D.; Alemán-Navarro, A. L.; Jaime-Sánchez, E.; Gomez-Moreno, M.; Saucedo-Toral, C. A.; Lara-Ramírez, Edgar E.; Zapata-Zuñiga, M.; Enciso-Moreno, L.; González-Amaro, R.; Ramos-Remus, C.; Enciso-Moreno, J. A.
2018-01-01
Background Little is known regarding the mechanisms underlying the loss of tolerance in the early and preclinical stages of autoimmune diseases. The aim of this work was to identify the transcriptional profile and signaling pathways associated with non-treated early rheumatoid arthritis (RA) and subjects at high risk. Several biomarker candidates for early RA are proposed. Methods Whole blood total RNA was obtained from non-treated early RA patients with <1 year of evolution as well as from healthy first-degree relatives of patients with RA (FDR) classified as ACCP+ and ACCP- according to their serum levels of antibodies against cyclic citrullinated peptides. Complementary RNA (cRNA) was synthesized and hybridized to high-density microarrays. Data were analyzed in Genespring Software and functional categories were assigned to a specific transcriptome identified in subjects with RA and FDR ACCP positive. Specific signaling pathways for genes associated with RA were identified. Gene expression was evaluated by qPCR. Receiver operating characteristic (ROC) analysis was used to evaluate these genes as biomarkers. Results A characteristic transcriptome of 551 induced genes and 4,402 repressed genes was identified in early RA patients. Bioinformatics analysis of the data identified a specific transcriptome in RA patients. Moreover, some overlapping transcriptional profiles between patients with RA and ACCP+ were identified, suggesting an up-regulated distinctive transcriptome from the preclinical stages up to progression to an early RA state. A total of 203 pathways have up-regulated genes that are shared between RA and ACCP+. Some of these genes show potential to be used as progression biomarkers for early RA with area under the curve of ROC > 0.92. These genes come from several functional categories associated with inflammation, Wnt signaling and type I interferon pathways.
Conclusion The presence of a specific transcriptome in whole blood of RA patients suggests the activation of a specific inflammatory transcriptional signature in early RA development. The set of overexpressed genes in early RA patients that are shared with ACCP+ subjects but not with ACCP- subjects, can represent a transcriptional signature involved with the transition of a preclinical to a clinical RA stage. Some of these particular up-regulated and down-regulated genes are related to inflammatory processes and could be considered as biomarker candidates for disease progression in subjects at risk to develop RA. PMID:29584756
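The abstract above reports biomarker candidates by the area under the ROC curve. As a hedged sketch of how such a value can be computed, the snippet below uses the rank-based (Mann-Whitney) formulation of the AUC: the fraction of case/control pairs in which the case has the higher score, with ties counted as half. The function name `roc_auc` and the toy scores are assumptions for illustration; they are not data or code from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    scores: numeric values (e.g. expression of one candidate gene)
    labels: 1 for cases, 0 for controls
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0   # case ranked above control
            elif c == n:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Made-up scores: three cases and three controls, one pair misordered.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

The quadratic pairwise loop is chosen for readability; for large cohorts a rank-sum implementation gives the same value more efficiently.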
Möller, Carolina; Clark, Evan; Safavi-Hemami, Helena; DeCaprio, Anthony; Marí, Frank
2017-07-05
Hyaluronidases are ubiquitous enzymes commonly found in venom and their main function is to degrade hyaluronan, which is the major glycosaminoglycan of the extracellular matrix in animal tissues. Here we describe the purification and characterization of a 60 kDa hyaluronidase found in the injected venom from Conus purpurascens, Conohyal-P1. Using a combined strategy based on transcriptomic and proteomic analysis, we determined the Conohyal-P1 sequence. Conohyal-P1 has conserved consensus catalytic and positioning domain residues characteristic of hyaluronidases and a C-terminus EGF-like domain. Additionally, the enzyme is expressed as a mixture of glycosylated isoforms at five asparagine sites. The activity of the native Conohyal-P1 was assessed by MS-based methods and confirmed by classical turbidimetric methods. The MS-based assay is particularly sensitive and provides the first detailed analysis of a venom hyaluronidase activity monitored with this method. The discovery of new hyaluronidases and the development of techniques to evaluate their performance can advance several therapeutic procedures, as these enzymes are widely used for enhanced drug delivery applications. Cone snail venom is a remarkable source of therapeutically important molecules, as is the case of conotoxins, which have undergone extensive clinical trials for several applications. In addition to the conotoxins, a large array of proteins has been reported in the venom of several species of cone snails, including enzymes that were found in dissected and injected Conus venom. Here we describe the isolation and characterization of the hyaluronidase Conohyal-P1 from the injected venom of C. purpurascens. We employed a combined transcriptomic and proteomic analysis to obtain the full sequence of this hyaluronidase.
The activity of Conohyal-P1 was assessed by a mass spectrometry-based method, which provides the first detailed venom hyaluronidase activity analysis monitored by mass spectrometry, allowing the visualization of the substrate degradation by the enzyme. Published by Elsevier B.V.
Ochsner, Scott A.; Tsimelzon, Anna; Dong, Jianrong; Coarfa, Cristian
2016-01-01
The pregnane X receptor (PXR) (PXR/NR1I3) and constitutive androstane receptor (CAR) (CAR/NR1I2) members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors are well-characterized mediators of xenobiotic and endocrine-disrupting chemical signaling. The Nuclear Receptor Signaling Atlas maintains a growing library of transcriptomic datasets involving perturbations of NR signaling pathways, many of which involve perturbations relevant to PXR and CAR xenobiotic signaling. Here, we generated a reference transcriptome based on the frequency of differential expression of genes across 159 experiments compiled from 22 datasets involving perturbations of CAR and PXR signaling pathways. In addition to the anticipated overrepresentation in the reference transcriptome of genes encoding components of the xenobiotic stress response, the ranking of genes involved in carbohydrate metabolism and gonadotropin action sheds mechanistic light on the suspected role of xenobiotics in metabolic syndrome and reproductive disorders. Gene Set Enrichment Analysis showed that although acetaminophen, chlorpromazine, and phenobarbital impacted many similar gene sets, differences in direction of regulation were evident in a variety of processes. Strikingly, gene sets representing genes linked to Parkinson's, Huntington's, and Alzheimer's diseases were enriched in all 3 transcriptomes. The reference xenobiotic transcriptome will be supplemented with additional future datasets to provide the community with a continually updated reference transcriptomic dataset for CAR- and PXR-mediated xenobiotic signaling. Our study demonstrates how aggregating and annotating transcriptomic datasets, and making them available for routine data mining, facilitates research into the mechanisms by which xenobiotics and endocrine-disrupting chemicals subvert conventional NR signaling modalities. PMID:27409825
2018-01-01
SUMMARY Transcriptomics, the analysis of genome-wide RNA expression, is a common approach to investigate host and pathogen processes in infectious diseases. Technical and bioinformatic advances have permitted increasingly thorough analyses of the association of RNA expression with fundamental biology, immunity, pathogenesis, diagnosis, and prognosis. Transcriptomic approaches can now be used to realize a previously unattainable goal, the simultaneous study of RNA expression in host and pathogen, in order to better understand their interactions. This exciting prospect is not without challenges, especially as focus moves from interactions in vitro under tightly controlled conditions to tissue- and systems-level interactions in animal models and natural and experimental infections in humans. Here we review the contribution of transcriptomic studies to the understanding of malaria, a parasitic disease which has exerted a major influence on human evolution and continues to cause a huge global burden of disease. We consider malaria a paradigm for the transcriptomic assessment of systemic host-pathogen interactions in humans, because much of the direct host-pathogen interaction occurs within the blood, a readily sampled compartment of the body. We illustrate lessons learned from transcriptomic studies of malaria and how these lessons may guide studies of host-pathogen interactions in other infectious diseases. We propose that the potential of transcriptomic studies to improve the understanding of malaria as a disease remains partly untapped because of limitations in study design rather than as a consequence of technological constraints. Further advances will require the integration of transcriptomic data with analytical approaches from other scientific disciplines, including epidemiology and mathematical modeling. PMID:29695497
Ochsner, Scott A; Tsimelzon, Anna; Dong, Jianrong; Coarfa, Cristian; McKenna, Neil J
2016-08-01
The pregnane X receptor (PXR) (PXR/NR1I3) and constitutive androstane receptor (CAR) (CAR/NR1I2) members of the nuclear receptor (NR) superfamily of ligand-regulated transcription factors are well-characterized mediators of xenobiotic and endocrine-disrupting chemical signaling. The Nuclear Receptor Signaling Atlas maintains a growing library of transcriptomic datasets involving perturbations of NR signaling pathways, many of which involve perturbations relevant to PXR and CAR xenobiotic signaling. Here, we generated a reference transcriptome based on the frequency of differential expression of genes across 159 experiments compiled from 22 datasets involving perturbations of CAR and PXR signaling pathways. In addition to the anticipated overrepresentation in the reference transcriptome of genes encoding components of the xenobiotic stress response, the ranking of genes involved in carbohydrate metabolism and gonadotropin action sheds mechanistic light on the suspected role of xenobiotics in metabolic syndrome and reproductive disorders. Gene Set Enrichment Analysis showed that although acetaminophen, chlorpromazine, and phenobarbital impacted many similar gene sets, differences in direction of regulation were evident in a variety of processes. Strikingly, gene sets representing genes linked to Parkinson's, Huntington's, and Alzheimer's diseases were enriched in all 3 transcriptomes. The reference xenobiotic transcriptome will be supplemented with additional future datasets to provide the community with a continually updated reference transcriptomic dataset for CAR- and PXR-mediated xenobiotic signaling. Our study demonstrates how aggregating and annotating transcriptomic datasets, and making them available for routine data mining, facilitates research into the mechanisms by which xenobiotics and endocrine-disrupting chemicals subvert conventional NR signaling modalities.
Lee, Hyun Jae; Georgiadou, Athina; Otto, Thomas D; Levin, Michael; Coin, Lachlan J; Conway, David J; Cunnington, Aubrey J
2018-06-01
Transcriptomics, the analysis of genome-wide RNA expression, is a common approach to investigate host and pathogen processes in infectious diseases. Technical and bioinformatic advances have permitted increasingly thorough analyses of the association of RNA expression with fundamental biology, immunity, pathogenesis, diagnosis, and prognosis. Transcriptomic approaches can now be used to realize a previously unattainable goal, the simultaneous study of RNA expression in host and pathogen, in order to better understand their interactions. This exciting prospect is not without challenges, especially as focus moves from interactions in vitro under tightly controlled conditions to tissue- and systems-level interactions in animal models and natural and experimental infections in humans. Here we review the contribution of transcriptomic studies to the understanding of malaria, a parasitic disease which has exerted a major influence on human evolution and continues to cause a huge global burden of disease. We consider malaria a paradigm for the transcriptomic assessment of systemic host-pathogen interactions in humans, because much of the direct host-pathogen interaction occurs within the blood, a readily sampled compartment of the body. We illustrate lessons learned from transcriptomic studies of malaria and how these lessons may guide studies of host-pathogen interactions in other infectious diseases. We propose that the potential of transcriptomic studies to improve the understanding of malaria as a disease remains partly untapped because of limitations in study design rather than as a consequence of technological constraints. Further advances will require the integration of transcriptomic data with analytical approaches from other scientific disciplines, including epidemiology and mathematical modeling. Copyright © 2018 Lee et al.
Urbarova, Ilona; Karlsen, Bård Ove; Okkenhaug, Siri; Seternes, Ole Morten; Johansen, Steinar D.; Emblem, Åse
2012-01-01
Marine bioprospecting is the search for new marine bioactive compounds and large-scale screening in extracts represents the traditional approach. Here, we report an alternative complementary protocol, called digital marine bioprospecting, based on deep sequencing of transcriptomes. We sequenced the transcriptomes from the adult polyp stage of two cold-water sea anemones, Bolocera tuediae and Hormathia digitata. We generated approximately 1.1 million quality-filtered sequencing reads by 454 pyrosequencing, which were assembled into approximately 120,000 contigs and 220,000 single reads. Based on annotation and gene ontology analysis we profiled the expressed mRNA transcripts according to known biological processes. As a proof-of-concept we identified polypeptide toxins with a potential blocking activity on sodium and potassium voltage-gated channels from digital transcriptome libraries. PMID:23170083
clusterProfiler: an R package for comparing biological themes among gene clusters.
Yu, Guangchuang; Wang, Li-Gen; Han, Yanyan; He, Qing-Yu
2012-05-01
Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler, that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species: humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under Artistic-2.0 License within Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html.
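To make the idea of enrichment analysis of a gene cluster concrete, here is a minimal sketch of an over-representation test based on the hypergeometric distribution, a common formulation for this kind of analysis. It is written in Python with the standard library only and does not reproduce the clusterProfiler package's code or API; the function name `enrichment_p_value`, its parameters, and the toy counts are assumptions for the example.

```python
from math import comb


def enrichment_p_value(universe_size, term_size, cluster_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of annotated background genes
    term_size:     background genes annotated with the term
    cluster_size:  genes in the cluster being tested
    overlap:       cluster genes annotated with the term
    """
    max_overlap = min(term_size, cluster_size)
    total = comb(universe_size, cluster_size)
    tail = sum(
        comb(term_size, x) * comb(universe_size - term_size, cluster_size - x)
        for x in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy counts: 20 background genes, 5 carry the term,
# and a cluster of 4 genes contains 2 of them.
p = enrichment_p_value(universe_size=20, term_size=5, cluster_size=4, overlap=2)
print(round(p, 4))
```

In practice many terms are tested at once, so the resulting p-values would normally be adjusted for multiple testing before interpretation; that step is omitted here.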
Jeon, Jin; Kim, Jae Kwang; Kim, HyeRan; Kim, Yeon Jeong; Park, Yun Ji; Kim, Sun Ju; Kim, Changsoo; Park, Sang Un
2018-02-15
Kale (Brassica oleracea var. acephala) is a rich source of numerous health-benefiting compounds, including vitamins, glucosinolates, phenolic compounds, and carotenoids. However, the genetic resources for exploiting the phyto-nutritional traits of kales are limited. To acquire precise information on secondary metabolites in kales, we performed a comprehensive analysis of the transcriptome and metabolome of green and red kale seedlings. Kale transcriptome datasets revealed 37,149 annotated genes and several secondary metabolite biosynthetic genes. HPLC analysis revealed 14 glucosinolates, 20 anthocyanins, 3 phenylpropanoids, and 6 carotenoids in the kale seedlings that were examined. Red kale contained more glucosinolates, anthocyanins, and phenylpropanoids than green kale, whereas the carotenoid contents were much higher in green kale than in red kale. Ultimately, our data will be a valuable resource for future research on kale bio-engineering and will provide basic information to define gene-to-metabolite networks in kale. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dhanasekaran, Saravana M.; Balbin, O. Alejandro; Chen, Guoan; Nadal, Ernest; Kalyana-Sundaram, Shanker; Pan, Jincheng; Veeneman, Brendan; Cao, Xuhong; Malik, Rohit; Vats, Pankaj; Wang, Rui; Huang, Stephanie; Zhong, Jinjie; Jing, Xiaojun; Iyer, Matthew; Wu, Yi-Mi; Harms, Paul W.; Lin, Jules; Reddy, Rishindra; Brennan, Christine; Palanisamy, Nallasivam; Chang, Andrew C.; Truini, Anna; Truini, Mauro; Robinson, Dan R.; Beer, David G.; Chinnaiyan, Arul M.
2014-01-01
Lung cancer is emerging as a paradigm for disease molecular subtyping, facilitating targeted therapy based on driving somatic alterations. Here, we perform transcriptome analysis of 153 samples representing lung adenocarcinomas, squamous cell carcinomas, large cell lung cancer, adenoid cystic carcinomas and cell lines. By integrating our data with The Cancer Genome Atlas and published sources, we analyze 753 lung cancer samples for gene fusions and other transcriptomic alterations. We show that a higher number of gene fusions is an independent prognostic factor for poor survival in lung cancer. Our analysis confirms the recently reported CD74-NRG1 fusion and suggests that NRG1, NF1 and Hippo pathway fusions may play important roles in tumors without known driver mutations. In addition, we observe exon skipping events in c-MET, which are attributable to splice site mutations. These classes of genetic aberrations may play a significant role in the genesis of lung cancers lacking known driver mutations. PMID:25531467
Transcriptome Dynamics in Mango Fruit Peel Reveals Mechanisms of Chilling Stress
Sivankalyani, Velu; Sela, Noa; Feygenberg, Oleg; Zemach, Hanita; Maurer, Dalia; Alkan, Noam
2016-01-01
Cold storage is considered the most effective method for prolonging fresh produce storage. However, subtropical fruit is sensitive to cold. Symptoms of chilling injury (CI) in mango include red and black spots that start from discolored lenticels and develop into pitting. The response of ‘Keitt’ mango fruit to chilling stress was monitored by transcriptomic, physiological, and microscopic analyses. Transcriptomic changes in the mango fruit peel were evaluated during optimal (12°C) and suboptimal (5°C) cold storage. Two days of chilling stress upregulated genes involved in the plant stress response, including those encoding transmembrane receptors, calcium-mediated signal transduction, NADPH oxidase, MAP kinases, and WRKYs, which can lead to cell death. Indeed, cell death was observed around the discolored lenticels after 19 days of cold storage at 5°C. Localized cell death and cuticular opening in the lumen of discolored lenticels were correlated with increased general decay during shelf-life storage, possibly due to fungal penetration. We also observed increased phenolics accumulation around the discolored lenticels, which was correlated with the biosynthesis of phenylpropanoids that were probably transported from the resin ducts. Increased lipid peroxidation was observed during CI by both the biochemical malondialdehyde method and a new non-destructive luminescent technology, correlated to upregulation of the α-linolenic acid oxidation pathway. Genes involved in sugar metabolism were also induced, possibly to maintain osmotic balance. This analysis provides an in-depth characterization of mango fruit response to chilling stress and could lead to the development of new tools, treatments and strategies to prolong cold storage of subtropical fruit. PMID:27812364
Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.
Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong
2014-05-01
We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions, respectively. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.
Hyun, Tae Kyung; Lee, Sarah; Kumar, Dhinesh; Rim, Yeonggil; Kumar, Ritesh; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean
2014-10-01
Using Illumina sequencing technology, we have generated large-scale transcriptome sequencing data containing abundant information on genes involved in the metabolic pathways in R. idaeus cv. Nova fruits. Rubus idaeus (red raspberry) is an economically important crop that possesses numerous nutrients, micronutrients and phytochemicals with essential health benefits to humans. The molecular mechanism underlying the ripening process and phytochemical biosynthesis in red raspberry is attributed to the changes in gene expression, but very limited transcriptomic and genomic information in public databases is available. To address this issue, we generated more than 51 million sequencing reads from R. idaeus cv. Nova fruit using Illumina RNA-Seq technology. After de novo assembly, we obtained 42,604 unigenes with an average length of 812 bp. At the protein level, Nova fruit transcriptome showed 77 and 68 % sequence similarities with Rubus coreanus and Fragaria vesca, respectively, indicating the evolutionary relationship between them. In addition, 69 % of assembled unigenes were annotated using public databases including NCBI non-redundant, Cluster of Orthologous Groups and Gene ontology database, suggesting that our transcriptome dataset provides a valuable resource for investigating metabolic processes in red raspberry. To analyze the relationship between several novel transcripts and the amounts of metabolites such as γ-aminobutyric acid and anthocyanins, real-time PCR and target metabolite analysis were performed on two different ripening stages of Nova. This is the first attempt using Illumina sequencing platform for RNA sequencing and de novo assembly of Nova fruit without reference genome. Our data provide the most comprehensive transcriptome resource available for Rubus fruits, and will be useful for understanding the ripening process and for breeding R. idaeus cultivars with improved fruit quality.
Luo, Hui; Xiao, Shijun; Ye, Hua; Zhang, Zhengshi; Lv, Changhuan; Zheng, Shuming; Wang, Zhiyong; Wang, Xiaoqing
2016-01-01
Schizothorax prenanti (S. prenanti) is mainly distributed in the upstream regions of the Yangtze River and its tributaries in China. This species is indigenous and commercially important. However, in recent years, wild populations and aquacultures have faced the serious challenges of germplasm variation loss and an increased susceptibility to a range of pathogens. Currently, the genetics and immune mechanisms of S. prenanti are unknown, partly due to a lack of genome and transcriptome information. Here, we sought to identify genes related to immune functions and to identify molecular markers to study the function of these genes and for trait mapping. To this end, the transcriptome from spleen tissues of S. prenanti was sequenced and analyzed. Using paired-end reads from the Illumina Hiseq2500 platform, 48,517 transcripts were isolated from the spleen transcriptome. These transcripts could be clustered into 37,785 unigenes with an N50 length of 2,539 bp. The majority of the unigenes (35,653, 94.4%) were successfully annotated using non-redundant nucleotide sequence analysis (nt), and the non-redundant protein (nr), Swiss-Prot, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. KEGG pathway assignment identified more than 500 immune-related genes. Furthermore, 7,545 putative simple sequence repeats (SSRs), 857,535 single nucleotide polymorphisms (SNPs), and 53,481 insertions/deletions (InDels) were detected from the transcriptome. This is the first reported high-throughput transcriptome analysis of S. prenanti, and it provides valuable genetic resources for the investigation of immune mechanisms, conservation of germplasm, and molecular marker-assisted breeding of S. prenanti.
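The N50 length quoted in this abstract is a standard assembly summary statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the toy lengths are assumptions for illustration and are unrelated to the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half
        # of the total assembled bases is the N50.
        if running >= half_total:
            return length


# Toy unigene lengths in bp (made up): total 1500, half 750.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Because N50 is weighted by sequence length, it is usually larger than the mean length and is less affected by many very short sequences.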
Leach, Richard E.; Jessmon, Philip; Coutifaris, Christos; Kruger, Michael; Myers, Evan R.; Ali-Fehmi, Rouba; Carson, Sandra A.; Legro, Richard S.; Schlaff, William D.; Carr, Bruce R.; Steinkampf, Michael P.; Silva, Susan; Leppert, Phyllis C.; Giudice, Linda; Diamond, Michael P.; Armant, D. Randall
2012-01-01
BACKGROUND Although histological dating of endometrial biopsies provides little help for prediction or diagnosis of infertility, analysis of individual endometrial proteins, proteomic profiling and transcriptome analysis have suggested several biomarkers with altered expression arising from intrinsic abnormalities, inadequate stimulation by or in response to gonadal steroids or altered function due to systemic disorders. The objective of this study was to delineate the developmental dynamics of potentially important proteins in the secretory phase of the menstrual cycle, utilizing a collection of endometrial biopsies from women of fertile (n = 89) and infertile (n = 89) couples. METHODS AND RESULTS Progesterone receptor-B (PGR-B), leukemia inhibitory factor, glycodelin/progestagen-associated endometrial protein (PAEP), homeobox A10, heparin-binding EGF-like growth factor, calcitonin and chemokine ligand 14 (CXCL14) were measured using a high-throughput, quantitative immunohistochemical method. Significant cyclic and tissue-specific regulation was documented for each protein, as well as their dysregulation in women of infertile couples. Infertile patients demonstrated a delay early in the secretory phase in the decline of PGR-B (P < 0.05) and premature mid-secretory increases in PAEP (P < 0.05) and CXCL14 (P < 0.05), suggesting that the implantation interval could be closing early. Correlation analysis identified potential interactions among certain proteins that were disrupted by infertility. CONCLUSIONS This approach overcomes the limitations of a small sample number. Protein expression and localization provided important insights into the potential roles of these proteins in normal and pathological development of the endometrium that are not attainable from transcriptome analysis, establishing a basis for biomarker, diagnostic and targeted drug development for women with infertility. PMID:22215622
NASA Astrophysics Data System (ADS)
Blasi, Thomas; Buettner, Florian; Strasser, Michael K.; Marr, Carsten; Theis, Fabian J.
2017-06-01
Accessing gene expression at a single-cell level has unraveled often large heterogeneity among seemingly homogeneous cells, which remains obscured when using traditional population-based approaches. The computational analysis of single-cell transcriptomics data, however, still imposes unresolved challenges with respect to normalization, visualization and modeling the data. One such issue is differences in cell size, which introduce additional variability into the data and for which appropriate normalization techniques are needed. Otherwise, these differences in cell size may obscure genuine heterogeneities among cell populations and lead to overdispersed steady-state distributions of mRNA transcript numbers. We present cgCorrect, a statistical framework to correct for differences in cell size that are due to cell growth in single-cell transcriptomics data. We derive the probability for the cell-growth-corrected mRNA transcript number given the measured, cell size-dependent mRNA transcript number, based on the assumption that the average number of transcripts in a cell increases proportionally to the cell’s volume during the cell cycle. cgCorrect can be used for both data normalization and to analyze the steady-state distributions used to infer the gene expression mechanism. We demonstrate its applicability on both simulated data and single-cell quantitative real-time polymerase chain reaction (PCR) data from mouse blood stem and progenitor cells (and to quantitative single-cell RNA-sequencing data obtained from mouse embryonic stem cells). We show that correcting for differences in cell size affects the interpretation of the data obtained by typically performed computational analysis.
Transcriptome analysis of zebrafish embryos exposed to deltamethrin.
Chueh, Tsung-Cheng; Hsu, Li-Sung; Kao, Chin-Ming; Hsu, Tung-Wei; Liao, Hung-Yu; Wang, Kuan-Yi; Chen, Ssu Ching
2017-05-01
Deltamethrin (DTM), a type II pyrethroid, is one of the most commonly used insecticides. The increased use of pyrethroids leads to potential adverse effects, particularly in sensitive populations such as children and pregnant women. No previous study has focused on the transcriptome responses of zebrafish embryos to DTM treatment; therefore, RNA-seq, a high-throughput method, was performed to analyze differentially expressed genes (DEGs) in zebrafish embryos treated with DTM (40 and 80 μg/L) from fertilization to 48 h postfertilization (hpf) as compared with the control group (without DTM treatment). Two cDNA libraries were generated from treated embryos and one cDNA library from nontreated embryos. Over 92% of reads mapped to the reference in these three libraries. Many genes were differentially expressed between DTM-treated and untreated embryos. The 20 most strongly upregulated or downregulated genes were mainly involved in signal transduction. Validation of the expression of nine selected genes using qRT-PCR confirmed the RNA-seq results. The transcriptome sequences were further subjected to gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, which showed that the G-protein-coupled receptor signaling pathway and neuroactive ligand-receptor interaction, respectively, were the most enriched. The data from this study contribute to a better understanding of the potential consequences of DTM exposure in fish and to an evaluation of the potential threat of DTM to fish populations in aquatic environments. © 2016 Wiley Periodicals, Inc. Environ Toxicol 32: 1548-1557, 2017.
Rai, Muhammad Farooq; Patra, Debabrata; Sandell, Linda J.; Brophy, Robert H.
2013-01-01
Objective Meniscus tears are associated with a heightened risk for osteoarthritis. We aimed to advance our understanding of the metabolic state of human injured meniscus at the time of arthroscopic partial meniscectomy through transcriptome-wide analysis of gene expression in relation to patient age and degree of cartilage chondrosis. Methods The degree of chondrosis of knee cartilage was recorded at the time of meniscectomy in symptomatic patients without radiographic osteoarthritis. RNA preparations from resected menisci (N=12) were subjected to transcriptome-wide microarray and QuantiGene Plex analyses. The relative changes in gene expression variation with age and chondrosis were analyzed and integrated biological processes were investigated computationally. Results We identified a set of genes in torn meniscus that were differentially expressed with age and chondrosis. There were 866 genes differentially regulated (≥1.5-fold; P<0.05) with age and 49 with chondrosis. In older patients, genes associated with cartilage and skeletal development and extracellular matrix synthesis were repressed while those involved in immune response, inflammation, cell cycle, and cellular proliferation were stimulated. With chondrosis, genes representing cell catabolism (cAMP catabolic process) and tissue and endothelial cell development were repressed and those involved in T cell differentiation and apoptosis were elevated. Conclusion Differences in age-related gene expression suggest that in older adults, meniscal cells might de-differentiate and initiate a proliferative phenotype. Conversely, meniscal cells in younger patients appear to respond to injury, but maintain the differentiated phenotype. Definitive molecular signatures identified in damaged meniscus could be segregated largely with age and, to a lesser extent, with chondrosis. PMID:23658108
Meta-Transcriptomic Analysis of a Chromate-Reducing Aquifer Microbial Community
NASA Astrophysics Data System (ADS)
Beller, H. R.; Brodie, E. L.; Han, R.; Karaoz, U.
2010-12-01
A major challenge for microbial ecology that has become more tractable in the advent of new molecular techniques is characterizing gene expression in complex microbial communities. We are using meta-transcriptomic analysis to characterize functional changes in an aquifer-derived, chromate-reducing microbial community as it transitions through various electron-accepting conditions. We inoculated anaerobic microcosms with groundwater from the Cr-contaminated Hanford 100H site and supplemented them with lactate and electron acceptors present at the site, namely, nitrate, sulfate, and Fe(III). The microcosms progressed successively through various electron-accepting conditions (e.g., denitrifying, sulfate-reducing, and ferric iron-reducing conditions, as well as nitrate-dependent, chemolithotrophic Fe(II)-oxidizing conditions). Cr(VI) was rapidly reduced initially and again upon further Cr(VI) amendments. Extensive geochemical sampling and analysis (e.g., lactate, acetate, chloride, nitrate, nitrite, sulfate, dissolved Cr(VI), total Fe(II)), RNA/DNA harvesting, and PhyloChip analyses were conducted. Methods were developed for removal of rRNA from total RNA in preparation for meta-transcriptome sequencing. To date, samples representing denitrifying and fermentative/sulfate-reducing conditions have been sequenced using 454 Titanium technology. Of the non-rRNA related reads for the denitrifying sample (which was also actively reducing chromate), ca. 8% were associated with denitrification and ca. 0.9% were associated with chromate resistance/transport, in contrast to the fermentative/sulfate-reducing sample (in which chromate had already been reduced), which had zero reads associated with either of these categories but many predicted proteins associated with sulfate-reducing bacteria. 
We observed sequences for key functional transcripts that were unique at the nucleotide level compared to the GenBank non-redundant database [such as L-lactate dehydrogenase (iron-sulfur-cluster-binding subunit), cytochrome cd1 nitrite reductase (nirS) (from the denitrifying phase), and dissimilatory sulfite reductase (dsrA, dsrB) (from the sulfate-reducing phase)]. One potential advantage of this approach is that such important genes may not have been detected using more traditional techniques, including PCR-based methods and a priori functional microarrays.
Sun, Li Xue; Teng, Jian; Zhao, Yan; Li, Ning; Wang, Hui; Ji, Xiang Shan
2018-02-28
Nowadays, the molecular mechanisms governing TSD (temperature-dependent sex determination) or GSD + TE (genotypic sex determination + temperature effects) remain a mystery in fish. We developed three all-female families of Nile tilapia ( Oreochromis niloticus ), and the family with the highest male ratio after high-temperature treatment was used for transcriptome analysis. First, gonadal histology analysis indicated that the histological morphology of control females (CF) was not significantly different from that of high-temperature-treated females (TF) at various development stages. However, the high-temperature treatment caused a lag of spermatogenesis in high-temperature-induced neomales (IM). Next, we sequenced the transcriptome of CF, TF, and IM Nile tilapia. 79, 11,117, and 11,000 differentially expressed genes (DEGs) were detected in the CF-TF, CF-IM, and TF-IM comparisons, respectively, and 44 DEGs showed identical expression changes in the CF-TF and CF-IM comparisons. Principal component analysis (PCA) indicated that three individuals in CF and three individuals in TF formed a cluster, and three individuals in IM formed a distinct cluster, which confirmed that the gonad transcriptome profile of TF was similar to that of CF and different from that of IM. Finally, six sex-related genes were validated by qRT-PCR. This study identifies a number of genes that may be involved in GSD + TE, which will be useful for investigating the molecular mechanisms of TSD or GSD + TE in fish.
Gaur, Mahendra; Das, Aradhana; Sahoo, Rajesh Kumar; Mohanty, Sujata; Joshi, Raj Kumar; Subudhi, Enketeswara
2016-09-01
Ginger (Zingiber officinale Rosc.), a well-known member of the family Zingiberaceae, has a number of medicinal properties that derive from the secondary metabolites, essential oil and oleoresin, contained in its rhizome. The drug-yielding potential is known to depend on the agro-climatic conditions prevailing at the place of cultivation. The present study deals with comparative transcriptome analysis of two samples of the elite ginger variety Suprabha collected from two different agro-climatic zones of Odisha. Transcriptome assembly for both samples was done using next-generation sequencing methodology. The raw data, of size 10.8 and 11.8 GB, were obtained from analysis of two rhizomes, S1Z4 and S2Z5, collected from Bhubaneswar and Koraput, and are available in NCBI under accession numbers SAMN03761169 and SAMN03761176, respectively. We identified 60,452 and 54,748 transcripts using the Trinity tool from ginger rhizomes of S1Z4 and S2Z5, respectively. The transcript length varied from 300 bp to a maximum of 15,213 bp and 8988 bp, with N50 values of 1415 bp and 1334 bp, for S1Z4 and S2Z5, respectively. To the best of our knowledge, this is the first comparative transcriptome analysis of the elite ginger cultivar Suprabha from two different agro-climatic conditions of Odisha, India, which will help to understand the effect of agro-climatic conditions on differential expression of secondary metabolites.
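The N50 values quoted for the two ginger assemblies follow the standard definition: the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. A minimal sketch of that calculation is given below; the function name `n50` and the toy length list are illustrative assumptions and are not taken from the study or from the Trinity toolkit.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no lengths given")
    # Walk from the longest sequence downwards, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # the total assembled bases is the N50.
        if running >= half_total:
            return length


# Hypothetical transcript lengths, for illustration only.
toy_lengths = [300, 500, 800, 1200, 1500, 2000]
print(n50(toy_lengths))
```

Because N50 is weighted by length, a few long transcripts can raise it well above the mean transcript length, so it is best read alongside the transcript count and length range reported in the abstract.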
Sa, Renna; Zhong, Ruqing; Xing, Huan; Zhang, Hongfu
2016-01-01
Atmospheric ammonia is a common problem in poultry industry. High concentrations of aerial ammonia cause great harm to broilers' health and production. For the consideration of human health, the limit exposure concentration of ammonia in houses is set at 25 ppm. Previous reports have shown that 25 ppm is still detrimental to livestock, especially the gastrointestinal tract and respiratory tract, but the negative relationship between ammonia exposure and the tissue of breast muscle of broilers is still unknown. In the present study, 25 ppm ammonia in poultry houses was found to lower slaughter performance and breast yield. Then, high-throughput RNA sequencing was utilized to identify differentially expressed genes in breast muscle of broiler chickens exposed to high (25 ppm) or low (3 ppm) levels of atmospheric ammonia. The transcriptome analysis showed that 163 genes (fold change ≥ 2 or ≤ 0.5; P-value < 0.05) were differentially expressed between Ammonia25 (treatment group) and Ammonia3 (control group), including 96 down-regulated and 67 up-regulated genes. qRT-PCR analysis validated the transcriptomic results of RNA sequencing. Gene Ontology (GO) functional annotation analysis revealed potential genes, processes and pathways with putative involvement in growth and development inhibition of breast muscle in broilers caused by aerial ammonia exposure. This study facilitates understanding of the genetic architecture of the chicken breast muscle transcriptome, and has identified candidate genes for breast muscle response to atmospheric ammonia exposure. PMID:27611572
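The selection rule stated in this abstract (fold change ≥ 2 or ≤ 0.5 with P-value < 0.05) can be sketched as a simple filter. The record type, field names, and example genes below are hypothetical placeholders, and the sketch assumes the fold change is expressed as a treatment/control ratio; it is not the authors' pipeline.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    fold_change: float  # assumed to be a treatment / control ratio
    p_value: float


def classify(results, fc_up=2.0, fc_down=0.5, alpha=0.05):
    """Split genes into up- and down-regulated lists by the stated cutoffs."""
    up, down = [], []
    for r in results:
        if r.p_value >= alpha:
            continue  # not significant at the chosen threshold
        if r.fold_change >= fc_up:
            up.append(r.gene)
        elif r.fold_change <= fc_down:
            down.append(r.gene)
    return up, down


# Made-up values, for illustration only.
toy = [
    GeneResult("geneA", 3.1, 0.001),
    GeneResult("geneB", 0.4, 0.020),
    GeneResult("geneC", 2.5, 0.200),  # large change but not significant
    GeneResult("geneD", 1.2, 0.010),  # significant but small change
]
up, down = classify(toy)
print(up, down)
```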
Gene expression analysis of induced pluripotent stem cells from aneuploid chromosomal syndromes
2013-01-01
Background Human aneuploidy is the leading cause of early pregnancy loss, mental retardation, and multiple congenital anomalies. Due to the high mortality associated with aneuploidy, the pathophysiological mechanisms of aneuploidy syndromes remain largely unknown. Previous studies focused mostly on whether dosage compensation occurs, and the next-generation transcriptome sequencing technology RNA-seq is expected to eventually uncover the mechanisms of gene expression regulation and the related pathological phenotypes in human aneuploidy. Results Using RNA-seq, we profiled the transcriptomes of four human aneuploid induced pluripotent stem cell (iPSC) lines generated from monosomy X (Turner syndrome), trisomy 8 (Warkany syndrome 2), trisomy 13 (Patau syndrome), and partial trisomy 11:22 (Emanuel syndrome) as well as two umbilical cord matrix iPSC lines as euploid controls to examine how phenotypic abnormalities develop with aberrant karyotype. A total of 466 M (50-bp) reads were obtained from the six iPSC lines, and over 13,000 mRNAs were identified by gene annotation. Global analysis of gene expression profiles and functional analysis of differentially expressed (DE) genes were implemented. Over 5000 DE genes were identified between each aneuploid line and the euploid iPSCs, while 9 enriched KEGG pathways overlapped among the four aneuploid samples. Conclusions Our results demonstrate that the extra or missing chromosome has extensive effects on the whole transcriptome. Functional analysis of differentially expressed genes reveals that the genes most affected in aneuploid individuals are related to central nervous system development and tumorigenesis. PMID:24564826
Histological and Transcriptomic Analysis during Bulbil Formation in Lilium lancifolium
Yang, Panpan; Xu, Leifeng; Xu, Hua; Tang, Yuchao; He, Guoren; Cao, Yuwei; Feng, Yayan; Yuan, Suxia; Ming, Jun
2017-01-01
Aerial bulbils are an important propagative organ, playing an important role in population expansion. However, the detailed gene regulatory patterns and molecular mechanism underlying bulbil formation remain unclear. Triploid Lilium lancifolium, which develops many aerial bulbils on the leaf axils of middle-upper stem, is a useful species for investigating bulbil formation. To investigate the mechanism of bulbil formation in triploid L. lancifolium, we performed histological and transcriptomic analyses using samples of leaf axils located in the upper and lower stem of triploid L. lancifolium during bulbil formation. Histological results indicated that the bulbils of triploid L. lancifolium are derived from axillary meristems that initiate de novo from cells on the adaxial side of the petiole base. Transcriptomic analysis generated ~650 million high-quality reads and 11,871 differentially expressed genes (DEGs). Functional analysis showed that the DEGs were significantly enriched in starch and sucrose metabolism and plant hormone signal transduction. Starch synthesis and accumulation likely promoted the initiation of upper bulbils in triploid L. lancifolium. Hormone-associated pathways exhibited distinct patterns of change in each sample. Auxin likely promoted the initiation of bulbils and then inhibited further bulbil formation. High biosynthesis and low degradation of cytokinin might have led to bulbil formation in the upper leaf axil. The present study achieved a global transcriptomic analysis focused on gene expression changes and pathways' enrichment during upper bulbil formation in triploid L. lancifolium, laying a solid foundation for future molecular studies on bulbil formation. PMID:28912794
Agrawal, A; Khan, MJ; Graugnard, DE; Vailati-Riboni, M; Rodriguez-Zas, SL; Osorio, JS; Loor, JJ
2017-01-01
In the dairy industry, cow health and farmer profits depend on the balance between diet (ie, nutrient composition, daily intake) and metabolism. This is especially true during the transition period, where dramatic physiological changes foster vulnerability to immunosuppression, negative energy balance, and clinical and subclinical disorders. Using an Agilent microarray platform, this study examined changes in the transcriptome of bovine polymorphonuclear leukocytes (PMNLs) due to prepartal dietary intake. Holstein cows were fed a high-straw, control-energy diet (CON; NEL = 1.34 Mcal/kg) or overfed a moderate-energy diet (OVE; NEL = 1.62 Mcal/kg) during the dry period. Blood for PMNL isolation and metabolite analysis was collected at −14 and +7 days relative to parturition. At an analysis of variance false discovery rate <0.05, energy intake (OVE vs CON) influenced 1806 genes. Dynamic Impact Approach bioinformatics analysis classified treatment effects on Kyoto Encyclopedia of Genes and Genomes pathways, including activated oxidative phosphorylation and biosynthesis of unsaturated fatty acids and inhibited RNA polymerase, proteasome, and toll-like receptor signaling pathway. This analysis indicates that processes critical for energy metabolism and cellular and immune function were affected with mixed results. However, overall interpretation of the transcriptome data agreed in part with literature documenting a potentially detrimental, chronic activation of PMNL in response to overfeeding. The widespread, transcriptome-level changes captured here confirm the importance of dietary energy adjustments around calving on the immune system. PMID:28579762
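The abstract reports genes at a false discovery rate below 0.05 but does not say which FDR procedure was applied. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw P-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

Genes whose adjusted value falls below 0.05 would then be counted as influenced by the treatment, under the stated assumption about the correction method.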
Valenzuela-Muñoz, Valentina; Sturm, Armin; Gallardo-Escárate, Cristian
2015-04-09
The ATP-binding cassette (ABC) protein family comprises membrane proteins involved in the transport of various biomolecules across the cellular membrane. These proteins have been identified in all taxa and perform important physiological functions, including insecticide detoxification in arthropods. For that reason, the ectoparasite Caligus rogercresseyi represents a model species for understanding the molecular underpinnings of insecticide drug resistance. Illumina sequencing was performed using sea lice exposed to 2 and 3 ppb of deltamethrin and azamethiphos. Contigs obtained from de novo assembly were annotated by Blastx. RNA-Seq analysis was performed and validated by qPCR analysis. From the transcriptome database of C. rogercresseyi, 57 putative members of ABC protein sequences were identified and phylogenetically classified into the eight subfamilies described for ABC transporters in arthropods. Transcriptomic profiles for ABC protein subfamilies were evaluated throughout C. rogercresseyi development. Moreover, RNA-Seq analysis was performed for adult male and female salmon lice exposed to the delousing drugs azamethiphos and deltamethrin. High transcript levels of the ABCB and ABCC subfamilies were observed. Furthermore, SNP mining was carried out for the ABC protein sequences, revealing pivotal genomic information. The present study gives a comprehensive transcriptome analysis of ABC proteins from C. rogercresseyi, providing relevant information about transporter roles during ontogeny and in relation to delousing drug responses in salmon lice. This genomic information represents a valuable tool for pest management in the Chilean salmon aquaculture industry.
Meena, Seema; Kumar, Sarma R.; Venkata Rao, D. K.; Dwivedi, Varun; Shilpashree, H. B.; Rastogi, Shubhra; Shasany, Ajit K.; Nagegowda, Dinesh A.
2016-01-01
Aromatic grasses of the genus Cymbopogon (Poaceae family) represent unique group of plants that produce diverse composition of monoterpene rich essential oils, which have great value in flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As the first step toward understanding the essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) by employing Illumina sequencing. Mining of transcriptome data and subsequent phylogenetic analysis led to identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) with varying essential oil composition indicated the involvement of identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of identified protein sequences in aroma formation in Cymbopogon. Also, simple sequence repeats were found in the transcriptome with many linked to terpene pathway genes including the genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a great resource for biotechnological and molecular breeding approaches to modulate the essential oil composition. PMID:27516768
A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics
House, John S.; Grimm, Fabian A.; Jima, Dereje D.; Zhou, Yi-Hui; Rusyn, Ivan; Wright, Fred A.
2017-01-01
Cell-based assays are an attractive option to measure gene expression response to exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing adds variability due to variable transcript length and amplification. Targeted probe-sequencing technologies such as TempO-Seq, with transcriptomic representation that can vary from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose–response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe here and make available a pipeline for handling expression data generated by TempO-Seq to align reads, clean and normalize raw count data, identify differentially expressed genes, and calculate transcriptomic concentration–response points of departure. The methods are extensible to other forms of concentration–response gene-expression data, and we discuss the utility of the methods for assessing variation in susceptibility and the diseased cellular state. PMID:29163636
Scaria, Joy; Sreedharan, Aswathy; Chang, Yung-Fu
2008-01-01
Background Microarrays are becoming a very popular tool for microbial detection and diagnostics. Although these diagnostic arrays are much simpler than traditional transcriptome arrays, their high-throughput nature means that data analysis requirements still form a bottleneck for their widespread use. Hence we developed a new online data sharing and analysis environment customised for diagnostic arrays. Methods Microbial Diagnostic Array Workstation (MDAW) is a database-driven application designed in MS Access, with a front end designed in ASP.NET. Conclusion MDAW is a new resource that is customised for the data analysis requirements of microbial diagnostic arrays. PMID:18811969
Codina-Solà, Marta; Rodríguez-Santiago, Benjamín; Homs, Aïda; Santoyo, Javier; Rigau, Maria; Aznar-Laín, Gemma; Del Campo, Miguel; Gener, Blanca; Gabau, Elisabeth; Botella, María Pilar; Gutiérrez-Arumí, Armand; Antiñolo, Guillermo; Pérez-Jurado, Luis Alberto; Cuscó, Ivon
2015-01-01
Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders with high heritability. Recent findings support a highly heterogeneous and complex genetic etiology including rare de novo and inherited mutations or chromosomal rearrangements as well as double or multiple hits. We performed whole-exome sequencing (WES) and blood cell transcriptome by RNAseq in a subset of male patients with idiopathic ASD (n = 36) in order to identify causative genes, transcriptomic alterations, and susceptibility variants. We detected likely monogenic causes in seven cases: five de novo (SCN2A, MED13L, KCNV1, CUL3, and PTEN) and two inherited X-linked variants (MAOA and CDKL5). Transcriptomic analyses allowed the identification of intronic causative mutations missed by the usual filtering of WES and revealed functional consequences of some rare mutations. These included aberrant transcripts (PTEN, POLR3C), deregulated expression in 1.7% of mutated genes (that is, SEMA6B, MECP2, ANK3, CREBBP), allele-specific expression (FUS, MTOR, TAF1C), and non-sense-mediated decay (RIT1, ALG9). The analysis of rare inherited variants showed enrichment in relevant pathways such as the PI3K-Akt signaling and the axon guidance. Integrative analysis of WES and blood RNAseq data has proven to be an efficient strategy to identify likely monogenic forms of ASD (19% in our cohort), as well as additional rare inherited mutations that can contribute to ASD risk in a multifactorial manner. Blood transcriptomic data, besides validating 88% of expressed variants, allowed the identification of missed intronic mutations and revealed functional correlations of genetic variants, including changes in splicing, expression levels, and allelic expression.
He, Lin; Jiang, Hui; Cao, Dandan; Liu, Lihua; Hu, Songnian; Wang, Qun
2013-01-01
The accessory sex gland (ASG) is an important component of the male reproductive system, which functions to enhance the fertility of spermatozoa during male reproduction. Certain proteins secreted by the ASG are known to bind to the spermatozoa membrane and affect its function. The ASG gene expression profile in Chinese mitten crab (Eriocheir sinensis) has not been extensively studied, and limited genetic research has been conducted on this species. The advent of high-throughput sequencing technologies enables the generation of genomic resources within a short period of time and at minimal cost. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive transcript dataset for the ASG of E. sinensis using Illumina sequencing technology. This analysis yielded a total of 33,221,284 sequencing reads, including 2.6 Gb of total nucleotides. Reads were assembled into 85,913 contigs (average 218 bp), or 58,567 scaffold sequences (average 292 bp), that identified 37,955 unigenes (average 385 bp). We assembled all unigenes and compared them with the published testis transcriptome from E. sinensis. In order to identify which genes may be involved in ASG function, as it pertains to modification of spermatozoa, we compared the ASG and testis transcriptome of E. sinensis. Our analysis identified specific genes with both higher and lower tissue expression levels in the two tissues, and the functions of these genes were analyzed to elucidate their potential roles during maturation of spermatozoa. Availability of detailed transcriptome data from ASG and testis in E. sinensis can assist our understanding of the molecular mechanisms involved with spermatozoa conservation, transport, maturation and capacitation and potentially acrosome activation. PMID:23342039
Hussain, Tajammul; Plunkett, Blue; Ejaz, Mahwish; Espley, Richard V.; Kayser, Oliver
2018-01-01
The liverwort Radula marginata belongs to the bryophyte division of land plants and is a prospective alternate source of cannabinoid-like compounds. However, mechanistic insights into the molecular pathways directing the synthesis of these cannabinoid-like compounds have been hindered due to the lack of genetic information. This prompted us to do deep sequencing, de novo assembly and annotation of R. marginata transcriptome, which resulted in the identification and validation of the genes for cannabinoid biosynthetic pathway. In total, we have identified 11,421 putative genes encoding 1,554 enzymes from 145 biosynthetic pathways. Interestingly, we have identified all the upstream genes of the central precursor of cannabinoid biosynthesis, cannabigerolic acid (CBGA), including its two first intermediates, stilbene acid (SA) and geranyl diphosphate (GPP). Expression of all these genes was validated using quantitative real-time PCR. We have characterized the protein structure of stilbene synthase (STS), which is considered as a homolog of olivetolic acid in R. marginata. Moreover, the metabolomics approach enabled us to identify CBGA-analogous compounds using electrospray ionization mass spectrometry (ESI-MS/MS) and gas chromatography mass spectrometry (GC-MS). Transcriptomic analysis revealed 1085 transcription factors (TF) from 39 families. Comparative analysis showed that six TF families have been uniquely predicted in R. marginata. In addition, the bioinformatics analysis predicted a large number of simple sequence repeats (SSRs) and non-coding RNAs (ncRNAs). Our results collectively provide mechanistic insights into the putative precursor genes for the biosynthesis of cannabinoid-like compounds and a novel transcriptomic resource for R. marginata. The large-scale transcriptomic resource generated in this study would further serve as a reference transcriptome to explore the Radulaceae family.
Transcriptomic responses to wounding: meta-analysis of gene expression microarray data.
Sass, Piotr Andrzej; Dąbrowski, Michał; Charzyńska, Agata; Sachadyn, Paweł
2017-11-07
A vast amount of microarray data on the transcriptomic response to injury has been collected so far. We designed the analysis to identify the genes displaying significant changes in expression after wounding in different organisms and tissues. This meta-analysis is the first study to compare gene expression profiles in response to wounding in tissues as different as heart, liver, skin, bones, and spinal cord, and in species including rat, mouse and human. We collected available microarray transcriptomic profiles obtained from different tissue injury experiments and selected the genes showing a minimum twofold change in expression in response to wounding in the prevailing number of experiments for each of the five wound healing stages we distinguished: haemostasis & early inflammation, inflammation, early repair, late repair and remodelling. During the initial phases after wounding, haemostasis & early inflammation and inflammation, the transcriptomic responses showed little consistency between different tissues and experiments. For the later phases, wound repair and remodelling, we identified a number of genes displaying similar transcriptional responses in all examined tissues. As revealed by ontological analyses, activation of certain pathways was rather specific to selected phases of wound healing, such as responses to vitamin D, which were pronounced during inflammation. Conversely, we observed induction of genes encoding inflammatory agents and extracellular matrix proteins in all wound healing phases. Further, we selected several genes differentially upregulated throughout different stages of wound response, including established factors of wound healing in addition to those previously unreported in this context, such as PTPRC and AQP4. We found that transcriptomic responses to wounding showed similar traits in a diverse selection of tissues including skin, muscles, internal organs and nervous system.
Notably, we distinguished transcriptional induction of inflammatory genes not only in the initial response to wounding, but also later, during wound repair and tissue remodelling.
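The selection rule described in this abstract (a minimum twofold change in the prevailing number of experiments) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the example gene labels and values, and the reading of "prevailing number" as a strict majority are all assumptions made for illustration.

```python
from typing import Dict, List


def consistent_genes(fold_changes: Dict[str, List[float]],
                     min_fold: float = 2.0) -> List[str]:
    """Return genes changed at least `min_fold`-fold (up or down) in a strict
    majority of experiments. Input values are expression ratios
    (wounded / control) and must be positive. The majority rule is an
    assumed reading of "prevailing number of experiments"."""
    selected = []
    for gene, ratios in fold_changes.items():
        # Count experiments with an up- or down-regulation of at least min_fold.
        hits = sum(1 for r in ratios if r >= min_fold or r <= 1.0 / min_fold)
        if hits * 2 > len(ratios):
            selected.append(gene)
    return sorted(selected)


# Hypothetical ratios for three placeholder genes across four experiments.
example = {
    "GENE_A": [2.5, 3.1, 0.4, 1.1],   # changed in 3 of 4 experiments
    "GENE_B": [1.2, 0.9, 2.2, 1.0],   # changed in 1 of 4 experiments
    "GENE_C": [0.3, 0.45, 1.8, 2.0],  # changed in 3 of 4 experiments
}
print(consistent_genes(example))
```

In a real meta-analysis the ratios would be computed separately for each wound healing stage, and the direction of change would normally be tracked as well.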
Acclimation of Antarctic Chlamydomonas to the sea-ice environment: a transcriptomic analysis.
Liu, Chenlin; Wang, Xiuliang; Wang, Xingna; Sun, Chengjun
2016-07-01
The Antarctic green alga Chlamydomonas sp. ICE-L was isolated from sea ice. As a psychrophilic microalga, it can tolerate the environmental stresses of the sea-ice brine, such as freezing temperature and high salinity. We performed a transcriptome analysis to identify freezing stress-responsive genes and to explore the strategies related to acclimation to this extreme environment. Here, we show that many genes in the ICE-L transcriptome that encode PUFA synthesis enzymes, molecular chaperone proteins, and cell membrane transport proteins have high similarity to genes from Antarctic bacteria. These ICE-L genes are presumed to have been acquired through horizontal gene transfer from its symbiotic microbes in the sea-ice brine. The presence of these genes in both sea-ice microalgae and bacteria indicates that the biological processes they are involved in possibly contribute to the success of ICE-L in sea ice. In addition, the biological pathways were compared between ICE-L and its closely related sister species, Chlamydomonas reinhardtii and Volvox carteri. In the ICE-L transcriptome, many sequences homologous to plant or bacterial proteins in the post-transcriptional, post-translational modification, and signal-transduction KEGG pathways are absent in the nonpsychrophilic green algae. These complex structural components might imply enhanced stress adaptation capacity. Finally, differential gene expression analysis at the transcriptome level of ICE-L indicated that genes associated with post-translational modification, lipid metabolism, and nitrogen metabolism respond to the freezing treatment. In conclusion, the transcriptome of Chlamydomonas sp. ICE-L is very useful for exploring the mutualistic interaction between microalgae and bacteria in sea ice, and for discovering the specific genes and metabolic pathways that respond to freezing acclimation in psychrophilic microalgae.
Madio, Bruno; Undheim, Eivind A B; King, Glenn F
2017-08-23
More than a century of research on sea anemone venoms has shown that they contain a diversity of biologically active proteins and peptides. However, recent omics studies have revealed that much of the venom proteome remains unexplored. We used, for the first time, a combination of proteomic and transcriptomic techniques to obtain a holistic overview of the venom arsenal of the well-studied sea anemone Stichodactyla haddoni. A purely search-based approach to identify putative toxins in a transcriptome from tentacles regenerating after venom extraction identified 508 unique toxin-like transcripts grouped into 63 families. However, proteomic analysis of venom revealed that 52 of these toxin families are likely false positives. In contrast, the combination of transcriptomic and proteomic data enabled positive identification of 23 families of putative toxins, 12 of which have no homology to known proteins or peptides. Our data highlight the importance of using proteomics of milked venom to correctly identify venom proteins/peptides, both known and novel, while minimizing false positive identifications from non-toxin homologues identified in transcriptomes of venom-producing tissues. This work lays the foundation for uncovering the role of individual toxins in sea anemone venom and how they contribute to the envenomation of prey, predators, and competitors. Proteomic analysis of milked venom combined with analysis of a tentacle transcriptome revealed the full extent of the venom arsenal of the sea anemone Stichodactyla haddoni. This combined approach led to the discovery of 12 entirely new families of disulfide-rich peptides and proteins in a genus of anemones that has been studied for over a century. Copyright © 2017 Elsevier B.V. All rights reserved.
Analysis of the Salivary Gland Transcriptome of Frankliniella occidentalis
Stafford-Banks, Candice A.; Rotenberg, Dorith; Johnson, Brian R.; Whitfield, Anna E.; Ullman, Diane E.
2014-01-01
Saliva is known to play a crucial role in insect feeding behavior and virus transmission. Currently, little is known about the salivary glands and saliva of thrips, despite the fact that Frankliniella occidentalis (Pergande) (the western flower thrips) is a serious pest due to its destructive feeding, wide host range, and transmission of tospoviruses. As a first step towards characterizing thrips salivary gland functions, we sequenced the transcriptome of the primary salivary glands of F. occidentalis using short read sequencing (Illumina) technology. A de novo-assembled transcriptome revealed 31,392 high quality contigs with an average size of 605 bp. A total of 12,166 contigs had significant BLASTx or tBLASTx hits (E≤1.0E−6) to known proteins, whereas a high percentage (61.24%) of contigs had no apparent protein or nucleotide hits. Comparison of the F. occidentalis salivary gland transcriptome (sialotranscriptome) against a published F. occidentalis full body transcriptome assembled from Roche-454 reads revealed several contigs with putative annotations associated with salivary gland functions. KEGG pathway analysis of the sialotranscriptome revealed that the majority (18 out of the top 20 predicted KEGG pathways) of the salivary gland contig sequences match proteins involved in metabolism. We identified several genes likely to be involved in detoxification and inhibition of plant defense responses including aldehyde dehydrogenase, metalloprotease, glucose oxidase, glucose dehydrogenase, and regucalcin. We also identified several genes that may play a role in the extra-oral digestion of plant structural tissues including β-glucosidase and pectin lyase; and the extra-oral digestion of sugars, including α-amylase, maltase, sucrase, and α-glucosidase. 
This is the first analysis of a sialotranscriptome for any Thysanopteran species and it provides a foundational tool to further our understanding of how thrips interact with their plant hosts and the viruses they transmit. PMID:24736614
Transcriptomic immune response of Tenebrio molitor pupae to parasitization by Scleroderma guani.
Zhu, Jia-Ying; Yang, Pu; Zhang, Zhong; Wu, Guo-Xing; Yang, Bin
2013-01-01
Host and parasitoid interaction is one of the most fascinating relationships among insects and is currently receiving increasing interest. Understanding the mechanisms evolved by parasitoids to evade or suppress the host immune system is important for dissecting this interaction, but these mechanisms remain poorly understood. In order to gain insight into the immune response of Tenebrio molitor to parasitization by Scleroderma guani, the transcriptome of T. molitor pupae was sequenced with a focus on immune-related genes, and non-parasitized and parasitized T. molitor pupae were analyzed by digital gene expression (DGE) analysis with special emphasis on parasitoid-induced immune-related genes using Illumina sequencing. In a single run, 264,698 raw reads were obtained. De novo assembly generated 71,514 unigenes with a mean length of 424 bp. Of those unigenes, 37,373 (52.26%) showed similarity to known proteins in the NCBI nr database. In-depth analysis of the transcriptome data identified 430 unigenes related to immunity. DGE analysis revealed that parasitization by S. guani had a considerable impact on the transcriptome profile of T. molitor pupae, as indicated by the significant up- or down-regulation of 3,431 parasitism-responsive transcripts. The expression of a total of 74 unigenes involved in the immune response of T. molitor was significantly altered after parasitization. The obtained T. molitor transcriptome, in addition to establishing a fundamental resource for further research on functional genomics, has allowed the discovery of a large group of immune genes that might provide a meaningful framework to better understand the immune response in this species and other beetles. The DGE profiling data provide comprehensive T. molitor immune gene expression information at the transcriptional level following parasitization, and shed valuable light on the molecular understanding of the host-parasitoid interaction.
Torre, Sara; Tattini, Massimiliano; Brunetti, Cecilia; Guidi, Lucia; Gori, Antonella; Marzano, Cristina; Landi, Marco; Sebastiani, Federico
2016-01-01
Sweet basil (Ocimum basilicum), one of the most popular cultivated herbs worldwide, displays a number of varieties differing in several characteristics, such as the color of the leaves. The development of a reference transcriptome for sweet basil, and the analysis of differentially expressed genes in acyanic and cyanic cultivars exposed to natural sunlight irradiance, is of interest from both horticultural and biological points of view. There is still great uncertainty about the significance of anthocyanins in photoprotection, and about how green and red morphs may perform when exposed to photo-inhibitory light, a condition plants face on a daily and seasonal basis. We sequenced the leaf transcriptome of the green-leaved Tigullio (TIG) and the purple-leaved Red Rubin (RR) exposed to full sunlight over a four-week experimental period. We assembled and annotated 111,007 transcripts. A total of 5,468 and 5,969 potential SSRs were identified in TIG and RR, respectively, out of which 66 were polymorphic in silico. Comparative analysis of the two transcriptomes showed 2,372 differentially expressed genes (DEGs) clustered in 222 enriched Gene Ontology terms. Green and red basil mostly differed in the transcript abundance of genes involved in secondary metabolism. While the biosynthesis of waxes was up-regulated in red basil, the biosynthesis of flavonols and carotenoids was up-regulated in green basil. Data from our study provide a comprehensive transcriptome survey, gene sequence resources and microsatellites that can be used for further investigations in sweet basil. The analysis of DEGs and their functional classification also offers new insights into the functional role of anthocyanins in photoprotection. PMID:27483170
Takata, Nozomu; Sakakura, Eriko; Kasukawa, Takeya; Sakuma, Tetsushi; Yamamoto, Takashi; Sasai, Yoshiki
2016-06-01
The epiblast (foremost embryonic ectoderm) generates all three germ layers and therefore has crucial roles in the formation of all mammalian body cells. However, regulation of epiblast gene expression is poorly understood because of the difficulty of manipulating epiblast tissues in vivo. In the present study, using the self-organizing properties of mouse embryonic stem cells (ESCs), we generated and characterized epiblast-like tissue in three-dimensional culture. We identified significant genome-wide gene expression changes in this epiblast-like tissue by transcriptomic analysis. In addition, we identified the particular significance of the Erk/Mapk and integrin-linked kinase pathways, and genes related to ectoderm/epithelial formation, using the bioinformatics resources IPA and DAVID. Here, we focused on Fgf5, which ranked in the top 10 among the discovered genes. To develop a functional analysis of Fgf5, we created an efficient method combining CRISPR/Cas9-mediated genome engineering and RNA interference (RNAi). Notably, we show one-step generation of various Fgf5 reporter lines including heterozygous and homozygous knockins (the GET method). For time- and dose-dependent depletion of Fgf5 over the course of development, we generated an ESC line harboring Tol2 transposon-mediated integration of an inducible short hairpin RNA interference system (pdiRNAi). Our findings raise the possibility that Fgf/Erk signaling and apicobasal epithelial integrity are important factors in epiblast development. In addition, our methods provide a framework for a broad array of applications in the areas of mammalian genetics and molecular biology to understand development and to improve future therapeutics.
Camp, J Gray; Treutlein, Barbara
2017-05-01
Innovative methods designed to recapitulate human organogenesis from pluripotent stem cells provide a means to explore human developmental biology. New technologies to sequence and analyze single-cell transcriptomes can deconstruct these 'organoids' into constituent parts, and reconstruct lineage trajectories during cell differentiation. In this Spotlight article we summarize the different approaches to performing single-cell transcriptomics on organoids, and discuss the opportunities and challenges of applying these techniques to generate organ-level, mechanistic models of human development and disease. Together, these technologies will move past characterization to the prediction of human developmental and disease-related phenomena. © 2017. Published by The Company of Biologists Ltd.
van Iterson, Maarten; van Zwet, Erik W; Heijmans, Bastiaan T
2017-01-27
We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
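The idea of correcting test statistics against an empirical null can be illustrated with a minimal sketch. It deliberately swaps in a simpler technique than the Bayesian method this abstract proposes: bias is estimated as the median of the z-scores and inflation as their scaled median absolute deviation, on the assumption that most tested features are null. The function names and the simulated data are hypothetical.

```python
import random
import statistics
from typing import List, Tuple


def empirical_null(z_scores: List[float]) -> Tuple[float, float]:
    """Estimate (bias, inflation) of the null distribution robustly.
    This is a median/MAD stand-in, not the Bayesian estimator of the paper."""
    bias = statistics.median(z_scores)
    mad = statistics.median([abs(z - bias) for z in z_scores])
    # 1.4826 rescales the MAD to a standard deviation under normality.
    inflation = 1.4826 * mad
    return bias, inflation


def correct(z_scores: List[float]) -> List[float]:
    """Recentre and rescale z-scores against the estimated empirical null."""
    bias, inflation = empirical_null(z_scores)
    return [(z - bias) / inflation for z in z_scores]


# Simulated null statistics with bias 0.3 and inflation 1.5 (made-up values).
rng = random.Random(0)
simulated = [rng.gauss(0.3, 1.5) for _ in range(5000)]
bias, inflation = empirical_null(simulated)
print(f"estimated bias={bias:.2f}, inflation={inflation:.2f}")
```

A median/MAD estimate is attractive only because it is short and insensitive to a small fraction of true associations; it offers none of the uncertainty quantification of a Bayesian treatment.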
Nishimura, Osamu; Hirao, Yukako; Tarui, Hiroshi; Agata, Kiyokazu
2012-06-29
Planarians are considered to be among the extant animals close to one of the earliest groups of organisms that acquired a central nervous system (CNS) during evolution. Planarians have a bilobed brain with nine lateral branches from which a variety of external signals are projected into different portions of the main lobes. Various interneurons process different signals to regulate behavior and learning/memory. Furthermore, planarians have robust regenerative ability and are attracting attention as a new model organism for the study of regeneration. Here we conducted large-scale EST analysis of the head region of the planarian Dugesia japonica to construct a database of the head-region transcriptome, and then performed comparative analyses among related species. A total of 54,752 high-quality EST reads were obtained from a head library of the planarian Dugesia japonica, and 13,167 unigene sequences were produced by de novo assembly. A new method devised here revealed that proteins related to metabolism and defense mechanisms have high flexibility of amino-acid substitutions within the planarian family. Eighty-two CNS-development genes were found in the planarian (cf. C. elegans 3; chicken 129). Comparative analysis revealed that 91% of the planarian CNS-development genes could be mapped onto the schistosome genome, but one-third of these shared genes were not expressed in the schistosome. We constructed a database that is a useful resource for comparative planarian transcriptome studies. Analysis comparing homologous genes between two planarian species showed that the potential of genes is important for accumulation of amino-acid substitutions. The presence of many CNS-development genes in our database supports the notion that the planarian has a fundamental brain with regard to evolution and development at not only the morphological/functional, but also the genomic, level.
In addition, our results indicate that the planarian CNS-development genes already existed before the divergence of planarians and schistosomes from their common ancestor.
Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong
2017-09-12
A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38, and lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment in DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, the number of downregulated DEGs was greater than that of upregulated DEGs, and almost all genes were downregulated DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle in the three types of samples.
Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new insights in the understanding of HIV pathogenesis and developing strategies to delay HIV disease progression.
Sarkar, Soumyadev; Chakravorty, Somnath; Mukherjee, Avishek; Bhattacharya, Debanjana; Bhattacharya, Semantee; Gachhui, Ratan
2018-03-01
Nitrogen is a key nutrient for all cell forms. Most organisms respond to nitrogen scarcity by slowing down their growth rate. In contrast, our previous studies have shown that Papiliotrema laurentii strain RY1 grows robustly under nitrogen starvation. To understand the global regulation that leads to such an extraordinary response, we undertook a de novo approach for transcriptome analysis of the yeast. Close to 33 million high-quality sequence reads for nitrogen-limited and nitrogen-enriched conditions were generated using Illumina NextSeq500. Trinity analysis and clustered transcript annotation of the reads produced 17,611 unigenes, of which 14,157 could be annotated. Gene Ontology term analysis generated 44.92% cellular component terms, 39.81% molecular function terms and 15.24% biological process terms. The most over-represented pathways in general were translation, carbohydrate metabolism, amino acid metabolism, general metabolism, folding, sorting, degradation followed by transport and catabolism, nucleotide metabolism, replication and repair, transcription and lipid metabolism. A total of 4256 simple sequence repeats were identified. Differential gene expression analysis detected 996 P-significant transcripts, revealing transmembrane transport, lipid homeostasis, fatty acid catabolism and translation as the enriched terms that could be essential for Papiliotrema laurentii strain RY1 to adapt during nitrogen deprivation. Transcriptome data were validated by quantitative real-time PCR analysis of twelve transcripts. To the best of our knowledge, this is the first report of the Papiliotrema laurentii strain RY1 transcriptome, which should play a pivotal role in understanding the biochemistry of the yeast under acute nitrogen stress and should encourage extensive investigations into this Papiliotrema system. Copyright © 2017 Elsevier B.V. All rights reserved.
Pick, Thea R; Bräutigam, Andrea; Schlüter, Urte; Denton, Alisandra K; Colmsee, Christian; Scholz, Uwe; Fahnenstich, Holger; Pieruschka, Roland; Rascher, Uwe; Sonnewald, Uwe; Weber, Andreas P M
2011-12-01
We systematically analyzed a developmental gradient of the third maize (Zea mays) leaf from the point of emergence into the light to the tip in 10 continuous leaf slices to study organ development and physiological and biochemical functions. Transcriptome analysis, oxygen sensitivity of photosynthesis, and photosynthetic rate measurements showed that the maize leaf undergoes a sink-to-source transition without an intermediate phase of C(3) photosynthesis or operation of a photorespiratory carbon pump. Metabolome and transcriptome analysis, chlorophyll and protein measurements, as well as dry weight determination, showed continuous gradients for all analyzed items. The absence of binary on-off switches and regulons pointed to a morphogradient along the leaf as the determining factor of developmental stage. Analysis of transcription factors for differential expression along the leaf gradient defined a list of putative regulators orchestrating the sink-to-source transition and establishment of C(4) photosynthesis. Finally, transcriptome and metabolome analysis, as well as enzyme activity measurements, and absolute quantification of selected metabolites revised the current model of maize C(4) photosynthesis. All data sets are included within the publication to serve as a resource for maize leaf systems biology.
A generic Transcriptomics Reporting Framework (TRF) for 'omics data processing and analysis.
Gant, Timothy W; Sauer, Ursula G; Zhang, Shu-Dong; Chorley, Brian N; Hackermüller, Jörg; Perdichizzi, Stefania; Tollefsen, Knut E; van Ravenzwaay, Ben; Yauk, Carole; Tong, Weida; Poole, Alan
2017-12-01
A generic Transcriptomics Reporting Framework (TRF) is presented that lists parameters that should be reported in 'omics studies used in a regulatory context. The TRF encompasses the processes of transcriptome profiling from data generation to a processed list of differentially expressed genes (DEGs) ready for interpretation. Included within the TRF is a reference baseline analysis (RBA) that encompasses raw data selection; data normalisation; recognition of outliers; and statistical analysis. The TRF itself does not dictate the methodology for data processing, but deals with what should be reported. Its principles are also applicable to sequencing data and other 'omics. In contrast, the RBA specifies a simple data processing and analysis methodology that is designed to provide a comparison point for other approaches and is exemplified here by a case study. By providing transparency on the steps applied during 'omics data processing and analysis, the TRF will increase confidence in the processing of 'omics data and in its regulatory use. Applicability of the TRF is ensured by its simplicity and generality. The TRF can be applied to all types of regulatory 'omics studies, and it can be executed using different commonly available software tools. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
Sreedharan, Vipin T; Schultheiss, Sebastian J; Jean, Géraldine; Kahles, André; Bohnert, Regina; Drewe, Philipp; Mudrakarta, Pramod; Görnitz, Nico; Zeller, Georg; Rätsch, Gunnar
2014-05-01
We present Oqtans, an open-source workbench for quantitative transcriptome analysis, that is integrated in Galaxy. Its distinguishing features include customizable computational workflows and a modular pipeline architecture that facilitates comparative assessment of tool and data quality. Oqtans integrates an assortment of machine learning-powered tools into Galaxy, which show superior or equal performance to state-of-the-art tools. Implemented tools comprise a complete transcriptome analysis workflow: short-read alignment, transcript identification/quantification and differential expression analysis. Oqtans and Galaxy facilitate persistent storage, data exchange and documentation of intermediate results and analysis workflows. We illustrate how Oqtans aids the interpretation of data from different experiments in easy to understand use cases. Users can easily create their own workflows and extend Oqtans by integrating specific tools. Oqtans is available as (i) a cloud machine image with a demo instance at cloud.oqtans.org, (ii) a public Galaxy instance at galaxy.cbio.mskcc.org, (iii) a git repository containing all installed software (oqtans.org/git); most of which is also available from (iv) the Galaxy Toolshed and (v) a share string to use along with Galaxy CloudMan.
Pick, Thea R.; Bräutigam, Andrea; Schlüter, Urte; Denton, Alisandra K.; Colmsee, Christian; Scholz, Uwe; Fahnenstich, Holger; Pieruschka, Roland; Rascher, Uwe; Sonnewald, Uwe; Weber, Andreas P.M.
2011-01-01
We systematically analyzed a developmental gradient of the third maize (Zea mays) leaf from the point of emergence into the light to the tip in 10 continuous leaf slices to study organ development and physiological and biochemical functions. Transcriptome analysis, oxygen sensitivity of photosynthesis, and photosynthetic rate measurements showed that the maize leaf undergoes a sink-to-source transition without an intermediate phase of C3 photosynthesis or operation of a photorespiratory carbon pump. Metabolome and transcriptome analysis, chlorophyll and protein measurements, as well as dry weight determination, showed continuous gradients for all analyzed items. The absence of binary on–off switches and regulons pointed to a morphogradient along the leaf as the determining factor of developmental stage. Analysis of transcription factors for differential expression along the leaf gradient defined a list of putative regulators orchestrating the sink-to-source transition and establishment of C4 photosynthesis. Finally, transcriptome and metabolome analysis, as well as enzyme activity measurements, and absolute quantification of selected metabolites revised the current model of maize C4 photosynthesis. All data sets are included within the publication to serve as a resource for maize leaf systems biology. PMID:22186372
Highly Multiplexed, Single Cell Transcriptomic Analysis of T-Cells by Microfluidic PCR.
Dominguez, Maria; Roederer, Mario; Chattopadhyay, Pratip K
2017-01-01
Recently, technologies have been developed to measure expression of 96 (or more) mRNA transcripts at once from a single cell. Here we describe methods and important considerations for use of Fluidigm's BioMark platform for multiplexed single cell gene expression. We describe how to qualify primer/probes, select genes to examine in 96-parameter panels, perform the reverse transcription/cDNA synthesis step, and operate the instrument. In addition, we describe data analysis considerations. This technology has enormous value for characterizing the heterogeneity of T-cells, thereby providing a useful tool for immune monitoring.
Valencia, Arnubio; Wang, Haichuan; Soto, Alberto; Aristizabal, Manuel; Arboleda, Jorge W; Eyun, Seong-Il; Noriega, Daniel D; Siegfried, Blair
2016-01-01
The banana weevil Cosmopolites sordidus is an important and serious insect pest in most banana and plantain-growing areas of the world. In spite of the economic importance of this insect pest very little genomic and transcriptomic information exists for this species. In the present study, we characterized the midgut transcriptome of C. sordidus using massive 454-pyrosequencing. We generated over 590,000 sequencing reads that assembled into 30,840 contigs with more than 400 bp, representing a significant expansion of existing sequences available for this insect pest. Among them, 16,427 contigs contained one or more GO terms. In addition, 15,263 contigs were assigned an EC number. In-depth transcriptome analysis identified genes potentially involved in insecticide resistance, peritrophic membrane biosynthesis, immunity-related function and defense against pathogens, and Bacillus thuringiensis toxins binding proteins as well as multiple enzymes involved with protein digestion. This transcriptome will provide a valuable resource for understanding larval physiology and for identifying novel target sites and management approaches for this important insect pest.
Valencia, Arnubio; Wang, Haichuan; Soto, Alberto; Aristizabal, Manuel; Arboleda, Jorge W.; Eyun, Seong-il; Noriega, Daniel D.; Siegfried, Blair
2016-01-01
The banana weevil Cosmopolites sordidus is an important and serious insect pest in most banana and plantain-growing areas of the world. In spite of the economic importance of this insect pest very little genomic and transcriptomic information exists for this species. In the present study, we characterized the midgut transcriptome of C. sordidus using massive 454-pyrosequencing. We generated over 590,000 sequencing reads that assembled into 30,840 contigs with more than 400 bp, representing a significant expansion of existing sequences available for this insect pest. Among them, 16,427 contigs contained one or more GO terms. In addition, 15,263 contigs were assigned an EC number. In-depth transcriptome analysis identified genes potentially involved in insecticide resistance, peritrophic membrane biosynthesis, immunity-related function and defense against pathogens, and Bacillus thuringiensis toxins binding proteins as well as multiple enzymes involved with protein digestion. This transcriptome will provide a valuable resource for understanding larval physiology and for identifying novel target sites and management approaches for this important insect pest. PMID:26949943
USDA-ARS's Scientific Manuscript database
Rose is one of the most important cut flowers among ornamental plants. Rose flower longevity is largely dependent on the timing of petal shedding occurrence. To understand the molecular mechanism underlying petal abscission in rose, we performed transcriptome profiling of the petal abscission zone d...
USDA-ARS's Scientific Manuscript database
This study reports generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing was carried out on a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...
Santos, Patricia; Plaszczyca, Marian; Pawlowski, Katharina
2013-01-01
Actinorhizal root nodule symbioses are very diverse, and the symbiosis of Datisca glomerata has previously been shown to have many unusual aspects. In order to gain molecular information on the infection mechanism, nodule development and nodule metabolism, we compared the transcriptomes of D. glomerata roots and nodules. Root and nodule libraries representing the 3′-ends of cDNAs were subjected to high-throughput parallel 454 sequencing. To identify the corresponding genes and to improve the assembly, Illumina sequencing of the nodule transcriptome was performed as well. The evaluation revealed 406 differentially regulated genes, 295 of which (72.7%) could be assigned a function based on homology. Analysis of the nodule transcriptome showed that genes encoding components of the common symbiosis signaling pathway were present in nodules of D. glomerata, which in combination with the previously established function of SymRK in D. glomerata nodulation suggests that this pathway is also active in actinorhizal Cucurbitales. Furthermore, comparison of the D. glomerata nodule transcriptome with nodule transcriptomes from actinorhizal Fagales revealed a new subgroup of nodule-specific defensins that might play a role specific to actinorhizal symbioses. The D. glomerata members of this defensin subgroup contain an acidic C-terminal domain that was never found in plant defensins before. PMID:24009681
Grace, Peter M.; Hurley, Daniel; Barratt, Daniel T.; Tsykin, Anna; Watkins, Linda R.; Rolan, Paul E.; Hutchinson, Mark R.
2017-01-01
A quantitative, peripherally accessible biomarker for neuropathic pain has great potential to improve clinical outcomes. Based on the premise that peripheral and central immunity contribute to neuropathic pain mechanisms, we hypothesized that biomarkers could be identified from the whole blood of adult male rats, by integrating graded chronic constriction injury (CCI), ipsilateral lumbar dorsal quadrant (iLDQ) and whole blood transcriptomes, and pathway analysis with pain behavior. Correlational bioinformatics identified a range of putative biomarker genes for allodynia intensity, many encoding for proteins with a recognized role in immune/nociceptive mechanisms. A selection of these genes was validated in a separate replication study. Pathway analysis of the iLDQ transcriptome identified Fcγ and Fcε signaling pathways, among others. This study is the first to employ the whole blood transcriptome to identify pain biomarker panels. The novel correlational bioinformatics, developed here, selected such putative biomarkers based on a correlation with pain behavior and formation of signaling pathways with iLDQ genes. Future studies may demonstrate the predictive ability of these biomarker genes across other models and additional variables. PMID:22697386
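The abstract above does not spell out its correlational method, so the following is only a generic sketch of the underlying idea: score each gene by the Pearson correlation between its expression and a behavioral pain measure, then keep the strongly correlated ones. The function names, the `min_abs_r` cutoff and the toy values are all assumptions made for illustration, not the authors' pipeline.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (sd_x * sd_y)


def rank_candidate_genes(expression, behavior, min_abs_r=0.8):
    """Return (gene, r) pairs with |r| >= min_abs_r, strongest first.

    expression: dict mapping gene name -> per-animal expression values
    behavior:   per-animal behavioral score, in the same animal order
    min_abs_r:  hypothetical cutoff chosen only for this example
    """
    scored = [(gene, pearson(values, behavior)) for gene, values in expression.items()]
    kept = [(gene, r) for gene, r in scored if abs(r) >= min_abs_r]
    return sorted(kept, key=lambda pair: -abs(pair[1]))


# Toy data: four animals with increasing allodynia scores (invented values).
toy_behavior = [1.0, 2.0, 3.0, 4.0]
toy_expression = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with the behavioral score
    "geneB": [8.0, 6.0, 4.0, 2.0],  # falls with the behavioral score
    "geneC": [1.0, 3.0, 2.0, 2.0],  # only weakly related
}
candidates = rank_candidate_genes(toy_expression, toy_behavior)
```

A real analysis would also need multiple-testing correction and, as the abstract notes, validation in an independent replication cohort.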
Liu, Miaomiao; Zhu, Jinhang; Wu, Shengbing; Wang, Chenkai; Guo, Xingyi; Wu, Jiawen; Zhou, Meiqi
2018-04-11
Artemisia argyi Lev. et Vant. (A. argyi) is widely utilized for moxibustion in Chinese medicine, and the mechanism underlying terpenoid biosynthesis in its leaves is suggested to play an important role in its medicinal use. However, the A. argyi transcriptome has not been sequenced. Herein, we performed RNA sequencing for A. argyi leaf, root and stem tissues to identify as many as possible of the transcribed genes. In total, 99,807 unigenes were assembled by analysing the expression profiles generated from the three tissue types, and 67,446 of those unigenes were annotated in public databases. We further performed differential gene expression analysis to compare leaf tissue with the other two tissue types and identified numerous genes that were specifically expressed or up-regulated in leaf tissue. Specifically, we identified multiple genes encoding significant enzymes or transcription factors related to terpenoid synthesis. This study serves as a valuable resource for transcriptome information, as many transcribed genes related to terpenoid biosynthesis were identified in the A. argyi transcriptome, providing a functional genomic basis for additional studies on molecular mechanisms underlying the medicinal use of A. argyi.
Huang, Xiaoyun; Zang, Xiaonan; Wu, Fei; Jin, Yuming; Wang, Haitao; Liu, Chang; Ding, Yating; He, Bangxiang; Xiao, Dongfang; Song, Xinwei; Liu, Zhu
2017-01-01
Gracilariopsis lemaneiformis (aka Gracilaria lemaneiformis) is a red macroalga rich in phycoerythrin, which can capture light efficiently and transfer it to photosystem II. However, little is known about the synthesis of optically active phycoerythrin in G. lemaneiformis at the molecular level. With the advent of high-throughput sequencing technology, analysis of genetic information for G. lemaneiformis by transcriptome sequencing is an effective means to get a deeper insight into the molecular mechanism of phycoerythrin synthesis. Illumina technology was employed to sequence the transcriptome of two strains of G. lemaneiformis: the wild type and a green-pigmented mutant. We obtained a total of 86,915 assembled unigenes as a reference gene set, and 42,884 unigenes were annotated in at least one public database. Taking the above transcriptome sequencing as a reference gene set, 4,041 differentially expressed genes were screened to analyze and compare the gene expression profiles of the wild type and green mutant. By GO and KEGG pathway analysis, we concluded that three factors, including a reduction in the expression level of apo-phycoerythrin, an increase of chlorophyll light-harvesting complex synthesis, and reduction of phycoerythrobilin by competitive inhibition, caused the reduction of optically active phycoerythrin in the green-pigmented mutant.
Polyadenylation state microarray (PASTA) analysis.
Beilharz, Traude H; Preiss, Thomas
2011-01-01
Nearly all eukaryotic mRNAs terminate in a poly(A) tail that serves important roles in mRNA utilization. In the cytoplasm, the poly(A) tail promotes both mRNA stability and translation, and these functions are frequently regulated through changes in tail length. To identify the scope of poly(A) tail length control in a transcriptome, we developed the polyadenylation state microarray (PASTA) method. It involves the purification of mRNA based on poly(A) tail length using thermal elution from poly(U) sepharose, followed by microarray analysis of the resulting fractions. In this chapter we detail our PASTA approach and describe some methods for bulk and mRNA-specific poly(A) tail length measurements of use to monitor the procedure and independently verify the microarray data.
Brahma, Rajeev Kungur; McCleary, Ryan J R; Kini, R Manjunatha; Doley, Robin
2015-01-01
Snake venoms are cocktails of protein toxins that play important roles in capture and digestion of prey. Significant qualitative and quantitative variation in snake venom composition has been observed among and within species. Understanding these variations in protein components is instrumental in interpreting clinical symptoms during human envenomation and in searching for novel venom proteins with potential therapeutic applications. In the last decade, transcriptomic analyses of venom glands have helped in understanding the composition of various snake venoms in great detail. Here we review transcriptomic analysis as a powerful tool for understanding venom profile, variation and evolution. Copyright © 2014 Elsevier Ltd. All rights reserved.
Meng, Xian-liang; Liu, Ping; Jia, Fu-long; Li, Jian; Gao, Bao-Quan
2015-01-01
The swimming crab Portunus trituberculatus is a commercially important crab species in East Asian countries. Gonadal development is a physiological process of great significance to the reproduction as well as commercial seed production for P. trituberculatus. However, little is currently known about the molecular mechanisms governing the developmental processes of gonads in this species. To open avenues of molecular research on P. trituberculatus gonadal development, Illumina paired-end sequencing technology was employed to develop deep-coverage transcriptome sequencing data for its gonads. Illumina sequencing generated 58,429,148 and 70,474,978 high-quality reads from the ovary and testis cDNA library, respectively. All these reads were assembled into 54,960 unigenes with an average sequence length of 879 bp, of which 12,340 unigenes (22.45% of the total) matched sequences in GenBank non-redundant database. Based on our transcriptome analysis as well as published literature, a number of candidate genes potentially involved in the regulation of gonadal development of P. trituberculatus were identified, such as FAOMeT, mPRγ, PGMRC1, PGDS, PGER4, 3β-HSD and 17β-HSDs. Differential expression analysis generated 5,919 differentially expressed genes between ovary and testis, among which many genes related to gametogenesis and several genes previously reported to be critical in differentiation and development of gonads were found, including Foxl2, Wnt4, Fst, Fem-1 and Sox9. Furthermore, 28,534 SSRs and 111,646 high-quality SNPs were identified in this transcriptome dataset. This work represents the first transcriptome analysis of P. trituberculatus gonads using the next generation sequencing technology and provides a valuable dataset for understanding molecular mechanisms controlling development of gonads and facilitating future investigation of reproductive biology in this species. The molecular markers obtained in this study will provide a fundamental basis for population genetics and functional genomics in P. trituberculatus and other closely related species. PMID:26042806
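The abstract above reports simple sequence repeats (SSRs) mined from the assembled transcriptome but does not say which tool or thresholds were used. As a rough sketch of what SSR detection involves, the code below scans a sequence for tandem repeats of 1 to 6 bp motifs with a regular expression. The minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions for illustration only.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the study).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA" or "ATAT"),
            # so each repeat is reported only under its shortest motif.
            is_composite = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_composite:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy transcript with one tri-nucleotide and one di-nucleotide repeat.
toy_transcript = "GGC" + "ATG" * 6 + "CC" + "AG" * 7 + "T"
toy_hits = find_ssrs(toy_transcript)
```

This handles only perfect repeats; compound or interrupted SSRs, which dedicated tools also report, are outside this sketch.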
Narnoliya, Lokesh K; Kaushal, Girija; Singh, Sudhir P; Sangwan, Rajender S
2017-01-13
Rose-scented geranium (Pelargonium sp.) is a perennial herb that produces a high-value essential oil of fragrant significance due to the characteristic compositional blend of rose-oxide and acyclic monoterpenoids in foliage. Recently, the plant has also been shown to produce tartaric acid in leaf tissues. Rose-scented geranium represents a top-tier cash crop in terms of economic returns and significance of the plant and plant products. However, there has hardly been any study on its metabolism and functional genomics, nor is any genomic expression dataset resource available in the public domain. Therefore, to begin building a molecular understanding of the specialized metabolic pathways of the plant, de novo sequencing of the rose-scented geranium leaf transcriptome, transcript assembly, annotation, expression profiling as well as their validation were carried out. De novo transcriptome analysis resulted in a total of 78,943 unique contigs (average length: 623 bp, and N50 length: 752 bp) from 15.44 million high quality raw reads. In silico functional annotation led to the identification of several putative genes representing terpene, ascorbic acid and tartaric acid biosynthetic pathways, hormone metabolism, and transcription factors. Additionally, a total of 6,040 simple sequence repeat (SSR) motifs were identified in 6.8% of the expressed transcripts. The highest frequency of SSR was of tri-nucleotides (50%). Further, the transcriptome assembly was validated for randomly selected putative genes by a standard PCR-based approach. The in silico expression profiles of assembled contigs were validated by real-time PCR analysis of selected transcripts. Being the first report on transcriptome analysis of rose-scented geranium, the data sets and the leads and directions reflected in this investigation will serve as a foundation for pursuing and understanding molecular aspects of its biology, and specialized metabolic pathways, metabolic engineering, genetic diversity as well as molecular breeding.
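The N50 length quoted in the abstract above is a standard assembly statistic: the contig length at which contigs of that length or longer account for at least half of the total assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an iterable of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length of the
    contig at which the running total first reaches half of all assembled bases.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly (invented lengths in bp): total 3,000 bp, so the target is 1,500 bp.
toy_lengths = [1000, 800, 600, 400, 200]
toy_n50 = n50(toy_lengths)
```

Because N50 weights long contigs more heavily than the mean does, it is usually reported alongside the average length, as in the abstract.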
Alkan, Noam; Friedlander, Gilgi; Ment, Dana; Prusky, Dov; Fluhr, Robert
2015-01-01
The fungus Colletotrichum gloeosporioides breaches the fruit cuticle but remains quiescent until fruit ripening signals a switch to necrotrophy, culminating in devastating anthracnose disease. There is a need to understand the distinct fungal arms strategy and the simultaneous fruit response. Transcriptome analysis of fungal-fruit interactions was carried out concurrently in the appressoria, quiescent and necrotrophic stages. Conidia germinating on unripe fruit cuticle showed stage-specific transcription that was accompanied by massive fruit defense responses. The subsequent quiescent stage showed the development of dendritic-like structures and swollen hyphae within the fruit epidermis. The quiescent fungal transcriptome was characterized by activation of chromatin remodeling genes and unsuspected environmental alkalization. Fruit response was portrayed by continued highly integrated massive up-regulation of defense genes. During cuticle infection of green or ripe fruit, fungi recapitulate the same developmental stages but with differing quiescent time spans. The necrotrophic stage showed a dramatic shift in fungal metabolism and up-regulation of pathogenicity factors. Fruit response to necrotrophy showed activation of the salicylic acid pathway, climaxing in cell death. Transcriptome analysis of C. gloeosporioides infection of fruit reveals its distinct stage-specific lifestyle and the concurrent changing fruit response, deepening our perception of the unfolding fungal-fruit arms and defenses race. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Liu, Lei; Fu, Yuanyuan; Zhu, Fang; Mu, Changkao; Li, Ronghua; Song, Weiwei; Shi, Ce; Ye, Yangfang; Wang, Chunlin
2018-06-05
The swimming crab (Portunus trituberculatus) is among the most economically important seawater crustacean species in Asia. Despite its commercial importance and well-studied status, genomic and transcriptomic data are scarce for this crab species. In the present study, limb bud tissue was collected at different developmental stages post amputation for transcriptomic analysis. Illumina RNA-sequencing was applied to characterise the limb regeneration transcriptome and identify the most characteristic genes. A total of 289,018 transcripts were obtained by clustering and assembly of clean reads, producing 150,869 unigenes with an average length of 956 bp. Subsequent analysis revealed WNT signalling as the key pathway involved in limb regeneration, with WNT4 a key mediator. Overall, limb regeneration appears to be regulated by multiple signalling pathways, with numerous cell differentiation, muscle growth, moult, metabolism, and immune-related genes upregulated, including WNT4, LAMA, FIP2, FSTL5, TNC, HUS1, SWI5, NCGL, SLC22, PLA2, Tdc2, SMOX, GDH, and SMPD4. This is the first experimental study done on regenerating claws of P. trituberculatus. These findings expand existing sequence resources for crab species, and will likely accelerate research into regeneration and development in crustaceans, particularly functional studies on genes involved in limb regeneration. Copyright © 2018 Elsevier B.V. All rights reserved.
Niu, Jun; Wang, Jia; An, Jiyong; Liu, Lili; Lin, Zixin; Wang, Rui; Wang, Libing; Ma, Chao; Shi, Lingling; Lin, Shanzhi
2016-01-01
Recently, our transcriptomic analysis has identified some functional genes responsible for oil biosynthesis in developing SASK, yet miRNA-mediated regulation for SASK development and oil accumulation is poorly understood. Here, 3 representative periods of 10, 30 and 60 DAF were selected for sRNA sequencing based on the dynamic patterns of growth tendency and oil content of developing SASK. By miRNA transcriptomic analysis, we characterized 296 known and 44 novel miRNAs in developing SASK, among which 36 known and 6 novel miRNAs respond specifically to developing SASK. Importantly, we performed an integrated analysis of mRNA and miRNA transcriptome as well as qRT-PCR detection to identify some key miRNAs and their targets (miR156-SPL, miR160-ARF18, miR164-NAC1, miR171h-SCL6, miR172-AP2, miR395-AUX22B, miR530-P2C37, miR393h-TIR1/AFB2 and psi-miRn5-SnRK2A) potentially involved in developing response and hormone signaling of SASK. Our results provide new insights into the important regulatory function of cross-talk between development response and hormone signaling for SASK oil accumulation. PMID:27762296
Transcriptome Dynamics during Maize Endosperm Development
Feng, Jiaojiao; Xu, Shutu; Wang, Lei; Li, Feifei; Li, Yibo; Zhang, Renhe; Zhang, Xinghua; Xue, Jiquan; Guo, Dongwei
2016-01-01
The endosperm is a major organ of the seed that plays vital roles in determining seed weight and quality. However, genome-wide transcriptome patterns throughout maize endosperm development have not been comprehensively investigated to date. Accordingly, we performed a high-throughput RNA sequencing (RNA-seq) analysis of the maize endosperm transcriptome at 5, 10, 15 and 20 days after pollination (DAP). We found that more than 11,000 protein-coding genes underwent alternative splicing (AS) events during the four developmental stages studied. These genes were mainly involved in intracellular protein transport, signal transmission, cellular carbohydrate metabolism, cellular lipid metabolism, lipid biosynthesis, protein modification, histone modification, cellular amino acid metabolism, and DNA repair. Additionally, 7,633 genes, including 473 transcription factors (TFs), were differentially expressed among the four developmental stages. The differentially expressed TFs were from 50 families, including the bZIP, WRKY, GeBP and ARF families. Further analysis of the stage-specific TFs showed that binding, nucleus and ligand-dependent nuclear receptor activities might be important at 5 DAP, that immune responses, signalling, binding and lumen development are involved at 10 DAP, that protein metabolic processes and the cytoplasm might be important at 15 DAP, and that the responses to various stimuli are different at 20 DAP compared with the other developmental stages. This RNA-seq analysis provides novel, comprehensive insights into the transcriptome dynamics during early endosperm development in maize. PMID:27695101
Zhang, Jin; Wang, Bing; Dong, Shuanglin; Cao, Depan; Dong, Junfeng; Walker, William B.; Liu, Yang; Wang, Guirong
2015-01-01
To better understand the olfactory mechanisms in the two lepidopteran pest model species Helicoverpa armigera and H. assulta, we conducted transcriptome analysis of the adult antennae using Illumina sequencing technology and compared the chemosensory genes between these two related species. Combined with the chemosensory genes we had identified previously in H. armigera by 454 sequencing, we identified 133 putative chemosensory unigenes in H. armigera including 60 odorant receptors (ORs), 19 ionotropic receptors (IRs), 34 odorant binding proteins (OBPs), 18 chemosensory proteins (CSPs), and 2 sensory neuron membrane proteins (SNMPs). Consistent with these results, 131 putative chemosensory genes including 64 ORs, 19 IRs, 29 OBPs, 17 CSPs, and 2 SNMPs were identified through male and female antennal transcriptome analysis in H. assulta. Reverse Transcription-PCR (RT-PCR) was conducted in H. assulta to examine the accuracy of the assembly and annotation of the transcriptome and the expression profile of these unigenes in different tissues. Most of the ORs, IRs and OBPs were enriched in adult antennae, while almost all the CSPs were expressed in antennae as well as legs. We compared the differences of the chemosensory genes between these two species in detail. Our work will surely provide valuable information for further functional studies of pheromones and host volatile recognition genes in these two related species. PMID:25659090
Kawasaki, Regiane; Baraúna, Rafael A; Silva, Artur; Carepo, Marta S P; Oliveira, Rui; Marques, Rodolfo; Ramos, Rommel T J; Schneider, Maria P C
2016-01-01
Exiguobacterium antarcticum B7 is an extremophilic Gram-positive bacterium able to survive in cold environments. A key factor in understanding cold adaptation processes is the modification of the fatty acids composing the cell membranes of psychrotrophic bacteria. In our study we show the in silico reconstruction of the fatty acid biosynthesis pathway of E. antarcticum B7. To build the stoichiometric model, a semiautomatic procedure was applied, which integrates genome information using KEGG and RAST/SEED. Constraint-based methods, namely, Flux Balance Analysis (FBA) and elementary modes (EM), were applied. FBA was implemented with maximization of hexadecenoic acid production as the objective. To evaluate the influence of gene expression on the fluxome analysis, FBA was also calculated using the log2FC values obtained in the transcriptome analysis at 0°C and 37°C. The fatty acid biosynthesis pathway showed a total of 13 elementary flux modes, four of which showed routes for the production of hexadecenoic acid. The reconstructed pathway demonstrated the capacity of E. antarcticum B7 to produce fatty acid molecules de novo. Under the influence of the transcriptome, the fluxome was altered, promoting the production of short-chain fatty acids. The calculated models contribute to a better understanding of bacterial adaptation to cold environments.
Ni, Jun; Dong, Lixiang; Jiang, Zhifang; Yang, Xiuli; Chen, Ziying; Wu, Yuhuan; Xu, Maojun
2018-01-01
Ginkgo leaves are raw materials for flavonoid extraction. Thus, the timing of their harvest is important to optimize the extraction efficiency, which benefits the pharmaceutical industry. In this research, we compared the transcriptomes of Ginkgo leaves harvested at midday and midnight. The differentially expressed genes with the highest probabilities in each step of flavonoid biosynthesis were down-regulated at midnight. Furthermore, real-time PCR corroborated the transcriptome results, indicating the decrease in flavonoid biosynthesis at midnight. The flavonoid profiles of Ginkgo leaves harvested at midday and midnight were compared, and the total flavonoid content decreased at midnight. A detailed analysis of individual flavonoids showed that most of their contents were decreased by various degrees. Our results indicated that circadian rhythms affected the flavonoid contents in Ginkgo leaves, which provides valuable information for optimizing their harvesting times to benefit the pharmaceutical industry.
Ma, Yibao; Zhao, Yong; Zhao, Ruiming; Zhang, Weiping; He, Yawen; Wu, Yingliang; Cao, Zhijian; Guo, Lin; Li, Wenxin
2010-07-01
Scorpion venoms contain a vast untapped reservoir of natural products, which have the potential for medicinal value in drug discovery. In this study, toxin components from the scorpion Heterometrus petersii venom were evaluated by transcriptome and proteome analysis. Ten known families of venom peptides and proteins were identified, which include: two families of potassium channel toxins, four families of antimicrobial and cytolytic peptides, and one family from each of the calcium channel toxins, La1-like peptides, phospholipase A2, and the serine proteases. In addition, we also identified 12 atypical families, which include the acid phosphatases, diuretic peptides, and ten orphan families. From the data presented here, the extreme diversity and convergence of toxic components in scorpion venom was uncovered. Our work demonstrates the power of combining transcriptomic and proteomic approaches in the study of animal venoms.
Transcriptome profile of Trichoderma harzianum IOC-3844 induced by sugarcane bagasse.
Horta, Maria Augusta Crivelente; Vicentini, Renato; Delabona, Priscila da Silva; Laborda, Prianda; Crucello, Aline; Freitas, Sindélia; Kuroshu, Reginaldo Massanobu; Polikarpov, Igor; Pradella, José Geraldo da Cruz; Souza, Anete Pereira
2014-01-01
Profiling the transcriptome that underlies biomass degradation by the fungus Trichoderma harzianum allows the identification of gene sequences with potential application in enzymatic hydrolysis processing. In the present study, the transcriptome of T. harzianum IOC-3844 was analyzed using RNA-seq technology. The sequencing generated 14.7 Gbp for downstream analyses. De novo assembly resulted in 32,396 contigs, which were submitted for identification and classified according to their identities. This analysis allowed us to define a principal set of T. harzianum genes that are involved in the degradation of cellulose and hemicellulose and the accessory genes that are involved in the depolymerization of biomass. An additional analysis of expression levels identified a set of carbohydrate-active enzymes that are upregulated under different conditions. The present study provides valuable information for future studies on biomass degradation and contributes to a better understanding of the role of the genes that are involved in this process.
Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng
2016-01-01
Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.
Sonnack, Laura; Klawonn, Thorsten; Kriehuber, Ralf; Hollert, Henner; Schäfers, Christoph; Fenske, Martina
2018-03-01
Metal toxicity is a global environmental challenge. Fish are particularly prone to metal exposure, which can be lethal or cause sublethal physiological impairments. The objective of this study was to investigate how adverse effects of chronic exposure to non-toxic levels of essential and non-essential metals in early life stage zebrafish may be explained by changes in the transcriptome. We therefore studied the effects of three different metals at low concentrations in zebrafish embryos by transcriptomics analysis. The study design compared exposure effects caused by different metals at different developmental stages (pre-hatch and post-hatch). Wild-type embryos were exposed to solutions of low concentrations of copper (CuSO4), cadmium (CdCl2) and cobalt (CoSO4) until 96 h post-fertilization (hpf) and microarray experiments were carried out to determine transcriptome profiles at 48 and 96 hpf. We found that the toxic metal cadmium affected the expression of more genes at 96 hpf than 48 hpf. The opposite effect was observed for the essential metals cobalt and copper, which also showed enrichment of different GO terms. Genes involved in neuromast and motor neuron development were significantly enriched, agreeing with our previous results showing motor neuron and neuromast damage in the embryos. Our data provide evidence that the response of the transcriptome of fish embryos to metal exposure differs for essential and non-essential metals. Copyright © 2017 Elsevier Inc. All rights reserved.
Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of the biosynthesis of fatty acids in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developing stages (PA, the initial stage and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes, with an average length of 1,198 bp, were generated from seven independent libraries. After functional annotations, we detected approximately 55,854 CDS, among which 2,807 were Transcription Factor (TF) coding unigenes. Further, there were 13,325 unigenes that showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insights into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding work of pecan. PMID:29694395
Xu, Zheng; Ni, Jun; Shah, Faheem Afzal; Wang, Qiaojian; Wang, Zhaocheng; Wu, Lifang; Fu, Songling
2018-01-01
Pecan is an economically important nut crop tree due to its unique texture and flavor properties. The pecan seed is rich in unsaturated fatty acids and protein. However, little is known about the molecular mechanisms of the biosynthesis of fatty acids in the developing seeds. In this study, transcriptome sequencing of the developing seeds was performed using Illumina sequencing technology. Pecan seed embryos at different developmental stages were collected and sequenced. The transcriptomes of pecan seeds at two key developmental stages (PA, the initial stage, and PS, the fast oil accumulation stage) were also compared. A total of 82,155 unigenes with an average length of 1,198 bp were generated from seven independent libraries. After functional annotation, we detected approximately 55,854 CDS, among which 2,807 were transcription factor (TF)-coding unigenes. Further, there were 13,325 unigenes that showed a 2-fold or greater expression difference between the two groups of libraries (two developmental stages). After transcriptome analysis, we identified abundant unigenes that could be involved in fatty acid biosynthesis, degradation and some other aspects of seed development in pecan. This study presents a comprehensive dataset of transcriptomic changes during the seed development of pecan. It provides insight into the molecular mechanisms responsible for fatty acid biosynthesis during seed development. The identification of functional genes will also be useful for the molecular breeding of pecan.
Karakülah, Gökhan
2017-06-28
Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants and, as an example, to discover previously un-annotated eTMs in the Prunus persica (peach) transcriptome. To this end, two public peach transcriptome libraries downloaded from the Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with a multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built via integration of the miRNA target information and predicted eTM-miRNA interactions. My findings suggest that miR156, one of the most widely expressed miRNA families among diverse plant species, might be sponged by seven putative eTMs. In addition, the study indicates that eTMs potentially play roles in the regulation of developmental processes in peach fruit by targeting specific miRNAs. In conclusion, by following the step-by-step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.
Niu, Donghong; Wang, Fei; Xie, Shumei; Sun, Fanyue; Wang, Ze; Peng, Maoxiao; Li, Jiale
2016-04-01
The razor clam Sinonovacula constricta is an important commercial species. The lack of developmental transcriptomic data has become a bottleneck for further research on the mechanisms underlying settlement and metamorphosis in early development. In this study, de novo transcriptome sequencing was performed for S. constricta at different early developmental stages using Illumina HiSeq 2000 paired-end (PE) sequencing technology. A total of 112,209,077 PE clean reads were generated. De novo assembly generated 249,795 contigs with an average length of 585 bp. Gene annotation resulted in the identification of 22,870 unigene hits against the NCBI database. Eight unique sequences related to metamorphosis were identified and analyzed using real-time PCR. The razor clam reference transcriptome provides useful information on early developmental and metamorphosis mechanisms and could be used in the genetic breeding of shellfish.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peterson, Elena S.; McCue, Lee Ann; Rutledge, Alexandra C.
2012-04-25
Visual Exploration and Statistics to Promote Annotation (VESPA) is an interactive visual analysis software tool that facilitates the discovery of structural mis-annotations in prokaryotic genomes. VESPA integrates high-throughput peptide-centric proteomics data and oligo-centric or RNA-Seq transcriptomics data into a genomic context. The data may be interrogated via visual analysis across multiple levels of genomic resolution, linked searches, exports and interaction with BLAST to rapidly identify locations of interest within the genome and evaluate potential mis-annotations.
Targeted exploration and analysis of large cross-platform human transcriptomic compendia
Zhu, Qian; Wong, Aaron K; Krishnan, Arjun; Aure, Miriam R; Tadych, Alicja; Zhang, Ran; Corney, David C; Greene, Casey S; Bongo, Lars A; Kristensen, Vessela N; Charikar, Moses; Li, Kai; Troyanskaya, Olga G.
2016-01-01
We present SEEK (http://seek.princeton.edu), a query-based search engine across very large transcriptomic data collections, including thousands of human data sets from almost 50 microarray and next-generation sequencing platforms. SEEK uses a novel query-level cross-validation-based algorithm to automatically prioritize data sets relevant to the query and a robust search approach to identify query-coregulated genes, pathways, and processes. SEEK provides cross-platform handling, multi-gene query search, iterative metadata-based search refinement, and extensive visualization-based analysis options. PMID:25581801
Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A; Lyons, Russell E; Salin, Krishna R; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B
2016-05-07
The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world's most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.
Jung, Hyungtaek; Yoon, Byung-Ha; Kim, Woo-Jin; Kim, Dong-Wook; Hurwood, David A.; Lyons, Russell E.; Salin, Krishna R.; Kim, Heui-Soo; Baek, Ilseon; Chand, Vincent; Mather, Peter B.
2016-01-01
The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium. PMID:27164098
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie
2018-01-01
Translation is a key regulatory step, linking transcriptome and proteome. Two major methods of translatome investigation are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database, TranslatomeDB (http://www.translatomedb.net/), which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes the analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, at both the transcriptome and translatome levels. The translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and use the identical unified pipeline to analyze their data. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from complex searching, analysis and comparison of huge sequencing datasets without needing local computational power. PMID:29106630
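As an illustration of one of the translation indices named in the abstract above, the following minimal sketch computes translational efficiency under the common definition of ribosome-footprint abundance divided by mRNA abundance for the same gene. The abstract does not give TranslatomeDB's exact formulas, so this definition, the function name, the minimum-abundance cutoff and the toy RPKM values are assumptions for illustration only and do not describe the database's pipeline.

```python
from typing import Dict


def translational_efficiency(
    ribo: Dict[str, float],
    mrna: Dict[str, float],
    min_mrna: float = 1.0,
) -> Dict[str, float]:
    """Footprint/mRNA abundance ratio per gene.

    Genes whose mRNA abundance is below min_mrna (an assumed cutoff) are
    skipped, because ratios over near-zero denominators are unstable.
    """
    te = {}
    for gene, footprint in ribo.items():
        abundance = mrna.get(gene, 0.0)
        if abundance >= min_mrna:
            te[gene] = footprint / abundance
    return te


# Toy abundances in RPKM (hypothetical values).
ribo_rpkm = {"geneA": 40.0, "geneB": 5.0, "geneC": 12.0}
mrna_rpkm = {"geneA": 20.0, "geneB": 10.0, "geneC": 0.2}
te = translational_efficiency(ribo_rpkm, mrna_rpkm)
```

Here geneC is dropped because its mRNA abundance falls below the cutoff, which keeps a weakly expressed transcript from producing an inflated ratio.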
Strain-Dependent Transcriptome Signatures for Robustness in Lactococcus lactis
Dijkstra, Annereinou R.; Alkema, Wynand; Starrenburg, Marjo J. C.; van Hijum, Sacha A. F. T.; Bron, Peter A.
2016-01-01
Recently, we demonstrated that fermentation conditions have a strong impact on subsequent survival of Lactococcus lactis strain MG1363 during heat and oxidative stress, two important parameters during spray drying. Moreover, employment of a transcriptome-phenotype matching approach revealed groups of genes associated with robustness towards heat and/or oxidative stress. To investigate if other strains have similar or distinct transcriptome signatures for robustness, we applied an identical transcriptome-robustness phenotype matching approach on the L. lactis strains IL1403, KF147 and SK11, which have previously been demonstrated to display highly diverse robustness phenotypes. These strains were subjected to an identical fermentation regime as was performed earlier for strain MG1363 and consisted of twelve conditions, varying in the level of salt and/or oxygen, as well as fermentation temperature and pH. In the exponential phase of growth, cells were harvested for transcriptome analysis and assessment of heat and oxidative stress survival phenotypes. The variation in fermentation conditions resulted in differences in heat and oxidative stress survival of up to five 10-log units. Effects of the fermentation conditions on stress survival of the L. lactis strains were typically strain-dependent, although the fermentation conditions had mainly similar effects on the growth characteristics of the different strains. By association of the transcriptomes and robustness phenotypes highly strain-specific transcriptome signatures for robustness towards heat and oxidative stress were identified, indicating that multiple mechanisms exist to increase robustness and, as a consequence, robustness of each strain requires individual optimization. 
However, a relatively small overlap in the transcriptome responses of the strains was also identified and this generic transcriptome signature included genes previously associated with stress (ctsR and lplL) and novel genes, including nanE and genes encoding transport proteins. The transcript levels of these genes can function as indicators of robustness and could aid in selection of fermentation parameters, potentially resulting in more optimal robustness during spray drying. PMID:27973578
Inferring Molecular Processes Heterogeneity from Transcriptional Data.
Gogolewski, Krzysztof; Wronowska, Weronika; Lech, Agnieszka; Lesyng, Bogdan; Gambin, Anna
2017-01-01
RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is obtained by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest different functional behaviours in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools for identifying the most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.
Inferring Molecular Processes Heterogeneity from Transcriptional Data
Wronowska, Weronika; Lesyng, Bogdan; Gambin, Anna
2017-01-01
RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information on gene up- and downregulation is obtained by analyzing the behaviour of a relatively large population of cells and averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest different functional behaviours in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools for identifying the most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs. PMID:29362714
2011-01-01
Background Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted by its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have contributed to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information of this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants. 
The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.
Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki
2013-07-09
The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with useful annotation information with easy-to-use web interfaces, which helps researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase will be continuously updated and additional genomic/transcriptomic resources and analysis tools will be provided for further efficient analysis of the mechanism of insecticide resistance and the development of effective insecticides with a novel mode of action for DBM.
Tao, Si-Qi; Cao, Bin; Tian, Cheng-Ming; Liang, Ying-Mei
2017-08-23
Rust fungi constitute the largest group of plant fungal pathogens. However, a paucity of data, including genomic sequences, transcriptome sequences, and associated molecular markers, hinders the development of inhibitory compounds and prevents their analysis from an evolutionary perspective. Gymnosporangium yamadae and G. asiaticum are two closely related rust fungal species and ecologically and economically important pathogens that cause apple rust and pear rust, respectively, both of which have proved devastating to orchards. In this study, we investigated the transcriptomes of these two Gymnosporangium species during the telial stage of their lifecycles. The aim of this study was to understand the evolutionary patterns of these two related fungi and to identify genes that developed by selection. The transcriptomes of G. yamadae and G. asiaticum were generated from a mixture of RNA from three biological replicates of each species. We obtained 49,318 and 54,742 transcripts, with N50 values of 1957 and 1664, for G. yamadae and G. asiaticum, respectively. We also identified a repertoire of candidate effectors and other gene families associated with pathogenicity. A total of 4947 pairs of putative orthologues between the two species were identified. Estimation of the non-synonymous/synonymous substitution rate ratios for these orthologues identified 116 pairs with Ka/Ks values greater than 1 that are under positive selection and 170 pairs with Ka/Ks values of 1 that are under neutral selection, whereas the remaining 4661 genes are subjected to purifying selection. We estimate that the divergence time between the two species is approximately 5.2 Mya. This study constitutes a de novo assembly and comparative analysis between the transcriptomes of the two rust species G. yamadae and G. asiaticum. The results identified several orthologous genes, and many expressed genes were identified by annotation. 
Our analysis of Ka/Ks ratios identified orthologous genes subjected to positive or purifying selection. An evolutionary analysis of these two species provided a relatively precise divergence time. Overall, the information obtained in this study increases the genetic resources available for research on the genetic diversity of the Gymnosporangium genus.
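The Ka/Ks interpretation used in the abstract above (greater than 1 for positive selection, equal to 1 for neutral evolution, less than 1 for purifying selection) can be sketched as a simple classifier. This is a minimal illustration, not the authors' method: the function name, the tolerance used to decide what counts as a ratio of 1, and the toy substitution rates are all assumptions, since the abstract does not say how the ratios were estimated or rounded.

```python
import math


def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Label an orthologue pair by its Ka/Ks ratio.

    tol is an assumed tolerance for treating a ratio as equal to 1.
    """
    if ks <= 0:
        # The ratio is not defined without synonymous substitutions.
        return "undefined"
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


# Toy (Ka, Ks) estimates for hypothetical orthologue pairs.
pairs = {
    "pair1": (0.30, 0.10),
    "pair2": (0.02, 0.20),
    "pair3": (0.10, 0.10),
    "pair4": (0.05, 0.00),
}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in pairs.items()}
```

Pairs with Ks of zero are reported as undefined rather than forced into a class, since dividing by zero would otherwise make closely related sequences look like extreme cases of positive selection.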
Juranic Lisnic, Vanda; Babic Cac, Marina; Lisnic, Berislav; Trsan, Tihana; Mefferd, Adam; Das Mukhopadhyay, Chitrangada; Cook, Charles H.; Jonjic, Stipan; Trgovcich, Joanne
2013-01-01
Major gaps in our knowledge of pathogen genes and how these gene products interact with host gene products to cause disease represent a major obstacle to progress in vaccine and antiviral drug development for the herpesviruses. To begin to bridge these gaps, we conducted a dual analysis of Murine Cytomegalovirus (MCMV) and host cell transcriptomes during lytic infection. We analyzed the MCMV transcriptome during lytic infection using both classical cDNA cloning and sequencing of viral transcripts and next generation sequencing of transcripts (RNA-Seq). We also investigated the host transcriptome using RNA-Seq combined with differential gene expression analysis, biological pathway analysis, and gene ontology analysis. We identify numerous novel spliced and unspliced transcripts of MCMV. Unexpectedly, the most abundantly transcribed viral genes are of unknown function. We found that the most abundant viral transcript, recently identified as a noncoding RNA regulating cellular microRNAs, also codes for a novel protein. To our knowledge, this is the first viral transcript that functions both as a noncoding RNA and an mRNA. We also report that lytic infection elicits a profound cellular response in fibroblasts. Highly upregulated and induced host genes included those involved in inflammation and immunity, but also many unexpected transcription factors and host genes related to development and differentiation. Many top downregulated and repressed genes are associated with functions whose roles in infection are obscure, including host long intergenic noncoding RNAs, antisense RNAs or small nucleolar RNAs. Correspondingly, many differentially expressed genes cluster in biological pathways that may shed new light on cytomegalovirus pathogenesis. Together, these findings provide new insights into the molecular warfare at the virus-host interface and suggest new areas of research to advance the understanding and treatment of cytomegalovirus-associated diseases. PMID:24086132